[PATCH] KVM: x86: Work around buggy MPX platform
From 5854070994c5002b3a37577165ed3e82f36f712d Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Sat, 8 Mar 2014 04:40:02 +0800
Subject: [PATCH] KVM: x86: Work around buggy MPX platform

Work around buggy MPX platform which support MSR_IA32_BNDCFGS
but has issue at, say, VMX ucode.

Signed-off-by: Liu Jinsong jinsong@intel.com
---
 arch/x86/kvm/x86.c |   14 +++++++++++++-
 1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1e91a24..1fc184d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3936,6 +3936,16 @@ static void kvm_init_msr_list(void)
 	for (i = j = KVM_SAVE_MSRS_BEGIN; i < ARRAY_SIZE(msrs_to_save); i++) {
 		if (rdmsr_safe(msrs_to_save[i], &dummy[0], &dummy[1]) < 0)
 			continue;
+
+		/*
+		 * Work around some buggy MPX platform which support
+		 * MSR_IA32_BNDCFGS but has issue at, say, VMX ucode.
+		 */
+		if ((msrs_to_save[i] == MSR_IA32_BNDCFGS) &&
+				(kvm_x86_ops->mpx_supported ?
+				!kvm_x86_ops->mpx_supported() : 1))
+			continue;
+
 		if (j < i)
 			msrs_to_save[j] = msrs_to_save[i];
 		j++;
@@ -5576,9 +5586,11 @@ int kvm_arch_init(void *opaque)
 		goto out_free_percpu;

 	kvm_set_mmio_spte_mask();
-	kvm_init_msr_list();

 	kvm_x86_ops = ops;
+
+	kvm_init_msr_list();
+
 	kvm_mmu_set_mask_ptes(PT_USER_MASK, PT_ACCESSED_MASK,
 			PT_DIRTY_MASK, PT64_NX_MASK, 0);

--
1.7.1

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
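The filtering step in the workaround can be tried outside the kernel. The following is only a minimal userspace sketch, assuming a hypothetical `demo_ops` structure in place of `kvm_x86_ops` and a hypothetical probe callback in place of `rdmsr_safe()`; none of the `demo_*` names exist in KVM. It compacts the MSR list in place and drops MSR_IA32_BNDCFGS unless the optional `mpx_supported` hook is present and returns true.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_IA32_BNDCFGS 0x00000d90u

/* Hypothetical stand-in for kvm_x86_ops: only the optional MPX hook. */
struct demo_ops {
	bool (*mpx_supported)(void);	/* may be NULL */
};

/* Hypothetical probe; the patch itself uses rdmsr_safe() on the host. */
typedef bool (*demo_probe_fn)(uint32_t msr);

/*
 * Compact msrs[] in place. An entry is kept when the probe accepts it,
 * except that MSR_IA32_BNDCFGS is also dropped when the backend does not
 * report MPX support. Returns the number of entries kept.
 */
size_t demo_filter_msrs(uint32_t *msrs, size_t n,
			const struct demo_ops *ops, demo_probe_fn probe)
{
	size_t i, j = 0;

	for (i = 0; i < n; i++) {
		if (!probe(msrs[i]))
			continue;

		/* Same condition as the patch: a missing hook means "no". */
		if (msrs[i] == MSR_IA32_BNDCFGS &&
		    !(ops->mpx_supported && ops->mpx_supported()))
			continue;

		if (j < i)
			msrs[j] = msrs[i];
		j++;
	}
	return j;
}

/* Trivial helpers used to exercise the sketch. */
bool demo_probe_all(uint32_t msr) { (void)msr; return true; }
bool demo_mpx_yes(void) { return true; }
bool demo_mpx_no(void) { return false; }
```

The sketch also shows why the patch moves kvm_init_msr_list() below the `kvm_x86_ops = ops` assignment: the filter dereferences the ops pointer, so it has to be set first.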
[PATCH] target-i386: bugfix of Intel MPX
From 3a7783cd9a0556787809d3d5ecb5f2b85dd9fc02 Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Mon, 3 Mar 2014 18:56:39 +0800
Subject: [PATCH] target-i386: bugfix of Intel MPX

The correct size of cpuid 0x0d sub-leaf 4 is 0x40, not 0x10.
This is confirmed by Anvin H Peter and Mallick Asit K.

Signed-off-by: Liu Jinsong jinsong@intel.com
Cc: H. Peter Anvin h...@zytor.com
Cc: Asit K Mallick asit.k.mall...@intel.com
---
 target-i386/cpu.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 0e8812a..9f69d7e 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -339,7 +339,7 @@ static const ExtSaveArea ext_save_areas[] = {
     [3] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
             .offset = 0x3c0, .size = 0x40 },
     [4] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
-            .offset = 0x400, .size = 0x10 },
+            .offset = 0x400, .size = 0x40 },
 };

 const char *get_register_name_32(unsigned int reg)
--
1.7.1
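To see what the size change means for the overall XSAVE area, here is a rough sketch that derives the area size from per-component offset and size. The BNDREGS and BNDCSR entries come from the patch; the YMM entry (offset 0x240, size 0x100) and the 0x240-byte legacy region plus header are assumptions based on the standard XSAVE layout, and the `demo_*` names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_save_area {
	uint32_t offset;
	uint32_t size;
};

/*
 * Indexed by XCR0 bit number. Bits 0 and 1 (x87, SSE) live in the legacy
 * region and have no separate entry here.
 */
static const struct demo_save_area demo_areas[] = {
	[2] = { 0x240, 0x100 },	/* YMM (assumed, not part of the patch) */
	[3] = { 0x3c0, 0x40 },	/* BNDREGS */
	[4] = { 0x400, 0x40 },	/* BNDCSR, size as corrected by the patch */
};

/* 512-byte legacy region plus 64-byte XSAVE header (assumed layout). */
#define DEMO_LEGACY_AND_HEADER 0x240u

/* Size needed to hold every component enabled in xcr0. */
uint32_t demo_xsave_size(uint64_t xcr0)
{
	uint32_t size = DEMO_LEGACY_AND_HEADER;
	size_t i;

	for (i = 2; i < sizeof(demo_areas) / sizeof(demo_areas[0]); i++) {
		uint32_t end = demo_areas[i].offset + demo_areas[i].size;

		if (((xcr0 >> i) & 1) && end > size)
			size = end;
	}
	return size;
}
```

Under these assumptions, enabling both MPX components yields 0x440 bytes; with the old 0x10 size for sub-leaf 4 the same computation would have produced 0x410.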
RE: [PATCH v4 3/3] KVM: x86: Enable Intel MPX for guest
Liu, Jinsong wrote:
> Paolo Bonzini wrote:
>> On 21/02/2014 18:57, Liu, Jinsong wrote:
>>> -		F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | F(RDSEED) |
>>> +		F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | F(MPX) | F(RDSEED) |
>>>  		F(ADX);
>>
>> MPX also needs to be conditional on mpx_supported here, like it is
>> done with f_rdtscp for example.
>>
>> Paolo
>
> Yes, I have updated the patch and sent it out.
>
> Thanks,
> Jinsong

There seem to be some issues when I send via git send-email, so I
re-sent it from Windows. Please ignore it if you receive PATCH v5 twice.

Thanks,
Jinsong
[PATCH v5 0/3] KVM: x86: enable Intel MPX for KVM
These patches are version 5 to enable Intel MPX for KVM.

Version 1:
  * Add some Intel MPX definitions
  * Fix a cpuid(0x0d, 0) exposing bug, dynamic per XCR0 features enable/disable
  * vmx and msr handling for MPX support in KVM
  * enable MPX feature for guest

Version 2:
  * remove the generic MPX definitions; Qiaowei's patch has added the
    definitions on the kernel side
  * add MSR_IA32_BNDCFGS to msrs_to_save

Version 3:
  * rebase on the latest kernel, which includes Qiaowei's common MPX
    definitions pulled from HPA's tree

Version 4:
  * Remove the xsave bugfix patch from this series and send it as a
    standalone patch
  * Add a new kvm_x86_ops member mpx_supported, to disable MPX whenever
    the two VMX controls are not available
  * minor rebase for VMX bit definition

Version 5:
  * Make MPX conditional on mpx_supported when exposing it to the guest

Thanks,
Jinsong

Liu Jinsong (3):
  KVM: x86: Intel MPX vmx and msr handle
  KVM: x86: add MSR_IA32_BNDCFGS to msrs_to_save
  KVM: x86: Enable Intel MPX for guest

 arch/x86/include/asm/kvm_host.h       |    1 +
 arch/x86/include/asm/vmx.h            |    4 ++++
 arch/x86/include/uapi/asm/msr-index.h |    1 +
 arch/x86/kvm/cpuid.c                  |    4 +++-
 arch/x86/kvm/vmx.c                    |   24 ++++++++++++++++++++++--
 arch/x86/kvm/x86.c                    |    8 +++++++-
 arch/x86/kvm/x86.h                    |    3 ++-
 7 files changed, 40 insertions(+), 5 deletions(-)
[PATCH v5 1/3] KVM: x86: Intel MPX vmx and msr handle
From caddc009a6d2019034af8f2346b2fd37a81608d0 Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Mon, 24 Feb 2014 18:11:11 +0800
Subject: [PATCH v5 1/3] KVM: x86: Intel MPX vmx and msr handle

This patch handle vmx and msr of Intel MPX feature.

Signed-off-by: Xudong Hao xudong@intel.com
Signed-off-by: Liu Jinsong jinsong@intel.com
---
 arch/x86/include/asm/kvm_host.h       |    1 +
 arch/x86/include/asm/vmx.h            |    4 ++++
 arch/x86/include/uapi/asm/msr-index.h |    1 +
 arch/x86/kvm/vmx.c                    |   18 ++++++++++++++++--
 4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fdf83af..1c32ca3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -765,6 +765,7 @@ struct kvm_x86_ops {
 			       struct x86_instruction_info *info,
 			       enum x86_intercept_stage stage);
 	void (*handle_external_intr)(struct kvm_vcpu *vcpu);
+	bool (*mpx_supported)(void);
 };

 struct kvm_arch_async_pf {
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 2067264..7004d21 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -85,6 +85,7 @@
 #define VM_EXIT_SAVE_IA32_EFER                  0x00100000
 #define VM_EXIT_LOAD_IA32_EFER                  0x00200000
 #define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER       0x00400000
+#define VM_EXIT_CLEAR_BNDCFGS                   0x00800000

 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff

@@ -95,6 +96,7 @@
 #define VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL     0x00002000
 #define VM_ENTRY_LOAD_IA32_PAT			0x00004000
 #define VM_ENTRY_LOAD_IA32_EFER                 0x00008000
+#define VM_ENTRY_LOAD_BNDCFGS                   0x00010000

 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR	0x000011ff

@@ -174,6 +176,8 @@ enum vmcs_field {
 	GUEST_PDPTR2_HIGH               = 0x0000280f,
 	GUEST_PDPTR3                    = 0x00002810,
 	GUEST_PDPTR3_HIGH               = 0x00002811,
+	GUEST_BNDCFGS                   = 0x00002812,
+	GUEST_BNDCFGS_HIGH              = 0x00002813,
 	HOST_IA32_PAT			= 0x00002c00,
 	HOST_IA32_PAT_HIGH		= 0x00002c01,
 	HOST_IA32_EFER			= 0x00002c02,
diff --git a/arch/x86/include/uapi/asm/msr-index.h b/arch/x86/include/uapi/asm/msr-index.h
index c19fc60..ed821ed 100644
--- a/arch/x86/include/uapi/asm/msr-index.h
+++ b/arch/x86/include/uapi/asm/msr-index.h
@@ -295,6 +295,7 @@
 #define MSR_SMI_COUNT			0x00000034
 #define MSR_IA32_FEATURE_CONTROL        0x0000003a
 #define MSR_IA32_TSC_ADJUST             0x0000003b
+#define MSR_IA32_BNDCFGS		0x00000d90

 #define FEATURE_CONTROL_LOCKED				(1<<0)
 #define FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX	(1<<1)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a06f101..e4e4b50 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -441,6 +441,7 @@ struct vcpu_vmx {
 #endif
 		int           gs_ldt_reload_needed;
 		int           fs_reload_needed;
+		u64           msr_host_bndcfgs;
 	} host_state;
 	struct {
 		int vm86_active;
@@ -1710,6 +1711,8 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 	if (is_long_mode(&vmx->vcpu))
 		wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
 #endif
+	if (boot_cpu_has(X86_FEATURE_MPX))
+		rdmsrl(MSR_IA32_BNDCFGS, vmx->host_state.msr_host_bndcfgs);
 	for (i = 0; i < vmx->save_nmsrs; ++i)
 		kvm_set_shared_msr(vmx->guest_msrs[i].index,
 				   vmx->guest_msrs[i].data,
@@ -1747,6 +1750,8 @@ static void __vmx_load_host_state(struct vcpu_vmx *vmx)
 #ifdef CONFIG_X86_64
 	wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_host_kernel_gs_base);
 #endif
+	if (vmx->host_state.msr_host_bndcfgs)
+		wrmsrl(MSR_IA32_BNDCFGS, vmx->host_state.msr_host_bndcfgs);
 	/*
 	 * If the FPU is not active (through the host task or
 	 * the guest vcpu), then restore the cr0.TS bit.
@@ -2837,7 +2842,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 	min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
 #endif
 	opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT |
-		VM_EXIT_ACK_INTR_ON_EXIT;
+		VM_EXIT_ACK_INTR_ON_EXIT | VM_EXIT_CLEAR_BNDCFGS;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
 				&_vmexit_control) < 0)
 		return -EIO;
@@ -2854,7 +2859,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 		_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;

 	min = 0;
-	opt = VM_ENTRY_LOAD_IA32_PAT;
+	opt = VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS
[PATCH v5 2/3] KVM: x86: add MSR_IA32_BNDCFGS to msrs_to_save
From 5d5a80cd172ea6fb51786369bcc23356b1e9e956 Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Mon, 24 Feb 2014 18:11:55 +0800
Subject: [PATCH v5 2/3] KVM: x86: add MSR_IA32_BNDCFGS to msrs_to_save

Add MSR_IA32_BNDCFGS to msrs_to_save, and corresponding logic
to kvm_get/set_msr().

Signed-off-by: Liu Jinsong jinsong@intel.com
---
 arch/x86/kvm/vmx.c |    6 ++++++
 arch/x86/kvm/x86.c |    2 +-
 2 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e4e4b50..729b1e4 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2484,6 +2484,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
 	case MSR_IA32_SYSENTER_ESP:
 		data = vmcs_readl(GUEST_SYSENTER_ESP);
 		break;
+	case MSR_IA32_BNDCFGS:
+		data = vmcs_readl(GUEST_BNDCFGS);
+		break;
 	case MSR_IA32_FEATURE_CONTROL:
 		if (!nested_vmx_allowed(vcpu))
 			return 1;
@@ -2552,6 +2555,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_SYSENTER_ESP:
 		vmcs_writel(GUEST_SYSENTER_ESP, data);
 		break;
+	case MSR_IA32_BNDCFGS:
+		vmcs_writel(GUEST_BNDCFGS, data);
+		break;
 	case MSR_IA32_TSC:
 		kvm_write_tsc(vcpu, msr_info);
 		break;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7c52acb..89e4e27 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -882,7 +882,7 @@ static u32 msrs_to_save[] = {
 	MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
 #endif
 	MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
-	MSR_IA32_FEATURE_CONTROL
+	MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS
 };

 static unsigned num_msrs_to_save;
--
1.7.1
[PATCH v5 3/3] KVM: x86: Enable Intel MPX for guest
From 44c2abca2c2eadc6f2f752b66de4acc8131880c4 Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Mon, 24 Feb 2014 18:12:31 +0800
Subject: [PATCH v5 3/3] KVM: x86: Enable Intel MPX for guest

This patch enable Intel MPX feature to guest.

Signed-off-by: Xudong Hao xudong@intel.com
Signed-off-by: Liu Jinsong jinsong@intel.com
---
 arch/x86/kvm/cpuid.c |    4 +++-
 arch/x86/kvm/x86.c   |    6 ++++++
 arch/x86/kvm/x86.h   |    3 ++-
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b241325..ddc8a7e 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -256,6 +256,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 #endif
 	unsigned f_rdtscp = kvm_x86_ops->rdtscp_supported() ? F(RDTSCP) : 0;
 	unsigned f_invpcid = kvm_x86_ops->invpcid_supported() ? F(INVPCID) : 0;
+	unsigned f_mpx = kvm_x86_ops->mpx_supported ?
+			 (kvm_x86_ops->mpx_supported() ? F(MPX) : 0) : 0;

 	/* cpuid 1.edx */
 	const u32 kvm_supported_word0_x86_features =
@@ -303,7 +305,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	/* cpuid 7.0.ebx */
 	const u32 kvm_supported_word9_x86_features =
 		F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
-		F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | F(RDSEED) |
+		F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | f_mpx | F(RDSEED) |
 		F(ADX);

 	/* all calls to cpuid_count() should be made on the same cpu */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 89e4e27..3570e71 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -599,6 +599,9 @@ int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 	u64 old_xcr0 = vcpu->arch.xcr0;
 	u64 valid_bits;

+	if (!kvm_x86_ops->mpx_supported || !kvm_x86_ops->mpx_supported())
+		xcr0 &= ~(XSTATE_BNDREGS | XSTATE_BNDCSR);
+
 	/* Only support XCR_XFEATURE_ENABLED_MASK(xcr0) now */
 	if (index != XCR_XFEATURE_ENABLED_MASK)
 		return 1;
@@ -616,6 +619,9 @@ int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 	if (xcr0 & ~valid_bits)
 		return 1;

+	if ((!(xcr0 & XSTATE_BNDREGS)) != (!(xcr0 & XSTATE_BNDCSR)))
+		return 1;
+
 	kvm_put_guest_xcr0(vcpu);
 	vcpu->arch.xcr0 = xcr0;

diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 8da5823..392ecbf 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -122,7 +122,8 @@ int kvm_write_guest_virt_system(struct x86_emulate_ctxt *ctxt,
 	gva_t addr, void *val, unsigned int bytes,
 	struct x86_exception *exception);

-#define KVM_SUPPORTED_XCR0	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+#define KVM_SUPPORTED_XCR0	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM \
+				| XSTATE_BNDREGS | XSTATE_BNDCSR)
 extern u64 host_xcr0;

 extern unsigned int min_timer_period_us;
--
1.7.1
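The XCR0 checks in __kvm_set_xcr(), including the new rule that BNDREGS and BNDCSR must be enabled or disabled together, can be sketched as a standalone predicate. This is an illustrative userspace sketch rather than the kernel function: `demo_check_xcr0` is a made-up name, `valid_bits` is passed in by the caller instead of being derived from vcpu state, and the XSTATE_BNDREGS value (bit 3) is assumed from the XCR0 layout.

```c
#include <stdint.h>

#define XSTATE_FP	0x1ULL
#define XSTATE_SSE	0x2ULL
#define XSTATE_YMM	0x4ULL
#define XSTATE_BNDREGS	0x8ULL	/* assumed: XCR0 bit 3 */
#define XSTATE_BNDCSR	0x10ULL

/*
 * Returns 0 when the requested XCR0 value is acceptable and 1 when it
 * should be rejected, mirroring the return convention of the patch.
 */
int demo_check_xcr0(uint64_t xcr0, uint64_t valid_bits)
{
	/* x87 state must always be enabled. */
	if (!(xcr0 & XSTATE_FP))
		return 1;

	/* AVX state requires SSE state. */
	if ((xcr0 & XSTATE_YMM) && !(xcr0 & XSTATE_SSE))
		return 1;

	/* No bits outside what the guest is allowed to use. */
	if (xcr0 & ~valid_bits)
		return 1;

	/* The two MPX components are only valid as a pair. */
	if ((!(xcr0 & XSTATE_BNDREGS)) != (!(xcr0 & XSTATE_BNDCSR)))
		return 1;

	return 0;
}
```

The double negation normalises each masked value to 0 or 1 before comparing, so the test fires exactly when one MPX bit is set and the other is clear.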
RE: [PATCH v5 3/3] KVM: x86: Enable Intel MPX for guest
Paolo Bonzini wrote:
> On 24/02/2014 11:58, Liu, Jinsong wrote:
>> @@ -599,6 +599,9 @@ int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
>>  	u64 old_xcr0 = vcpu->arch.xcr0;
>>  	u64 valid_bits;
>>
>> +	if (!kvm_x86_ops->mpx_supported || !kvm_x86_ops->mpx_supported())
>> +		xcr0 &= ~(XSTATE_BNDREGS | XSTATE_BNDCSR);
>> +
>>  	/* Only support XCR_XFEATURE_ENABLED_MASK(xcr0) now */
>>  	if (index != XCR_XFEATURE_ENABLED_MASK)
>>  		return 1;
>
> This hunk is incorrect, and I can simply drop it when applying.
> If MPX is not supported, it should not be in the 0Dh CPUID leaf
> and thus in vcpu->arch.guest_supported_xcr0.
>
> This however relies on userspace passing a sensible value of CPUID.
> I'll send a patch to strengthen the computation of
> guest_supported_xcr0.
>
> Thanks!
>
> Paolo

So patch v5 will be applied, except that you will remove the incorrect
hunk, and you will send a patch strengthening guest_supported_xcr0?

Thanks,
Jinsong
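One way to read the "strengthen the computation of guest_supported_xcr0" idea is to mask the MPX bits out of the supported set up front, so that a CPUID value from userspace cannot enable them on a host without MPX support. The sketch below is a guess at that shape, not the follow-up patch mentioned in the thread; `demo_guest_supported_xcr0` and its `mpx_ok` parameter are hypothetical, and the XSTATE_BNDREGS value is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

#define XSTATE_FP	0x1ULL
#define XSTATE_SSE	0x2ULL
#define XSTATE_YMM	0x4ULL
#define XSTATE_BNDREGS	0x8ULL	/* assumed: XCR0 bit 3 */
#define XSTATE_BNDCSR	0x10ULL

#define DEMO_KVM_SUPPORTED_XCR0 (XSTATE_FP | XSTATE_SSE | XSTATE_YMM \
				 | XSTATE_BNDREGS | XSTATE_BNDCSR)

/*
 * eax/edx are the guest's CPUID.(EAX=0Dh,ECX=0) outputs as supplied by
 * userspace. mpx_ok says whether the backend can virtualise MPX.
 */
uint64_t demo_guest_supported_xcr0(uint32_t eax, uint32_t edx,
				   uint64_t host_xcr0, bool mpx_ok)
{
	uint64_t supported = DEMO_KVM_SUPPORTED_XCR0;

	/* Without backend support, never offer the MPX components. */
	if (!mpx_ok)
		supported &= ~(XSTATE_BNDREGS | XSTATE_BNDCSR);

	return (eax | ((uint64_t)edx << 32)) & host_xcr0 & supported;
}
```

With the mask applied here, the later `xcr0 & ~valid_bits` check in __kvm_set_xcr() rejects MPX bits on its own, which is why the extra hunk in the patch would not be needed.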
RE: [PATCH v4 3/3] KVM: x86: Enable Intel MPX for guest
Paolo Bonzini wrote:
> On 21/02/2014 18:57, Liu, Jinsong wrote:
>> -		F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | F(RDSEED) |
>> +		F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | F(MPX) | F(RDSEED) |
>>  		F(ADX);
>
> MPX also needs to be conditional on mpx_supported here, like it is
> done with f_rdtscp for example.
>
> Paolo

Yes, I have updated the patch and sent it out.

Thanks,
Jinsong
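The review comment boils down to computing the MPX bit from a backend hook before OR-ing it into the feature word, as is already done for INVPCID. Below is a small sketch of that pattern under stated assumptions: the `demo_*` names are invented, only a few of the word-9 bits are shown, and the bit positions are the CPUID.(EAX=7,ECX=0):EBX positions as I understand them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX bit positions for a few of the features. */
#define DEMO_F_INVPCID	(1u << 10)
#define DEMO_F_RTM	(1u << 11)
#define DEMO_F_MPX	(1u << 14)
#define DEMO_F_RDSEED	(1u << 18)
#define DEMO_F_ADX	(1u << 19)

/* Hypothetical subset of a backend ops table. */
struct demo_ops {
	bool (*invpcid_supported)(void);	/* always provided */
	bool (*mpx_supported)(void);		/* optional, may be NULL */
};

uint32_t demo_word9_features(const struct demo_ops *ops)
{
	uint32_t f_invpcid = ops->invpcid_supported() ? DEMO_F_INVPCID : 0;
	/* An absent hook is treated the same as "not supported". */
	uint32_t f_mpx = (ops->mpx_supported && ops->mpx_supported()) ?
			 DEMO_F_MPX : 0;

	return f_invpcid | DEMO_F_RTM | f_mpx | DEMO_F_RDSEED | DEMO_F_ADX;
}

/* Trivial hooks used to exercise the sketch. */
bool demo_yes(void) { return true; }
bool demo_no(void) { return false; }
```

The NULL check matters because a backend that never fills in the hook would otherwise be called through a null pointer.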
[PATCH] KVM: x86: expose new instruction RDSEED to guest
From 24ffdce9efebf13c6ed4882f714b2b57ef1141eb Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Thu, 20 Feb 2014 17:38:26 +0800
Subject: [PATCH] KVM: x86: expose new instruction RDSEED to guest

The RDSEED instruction returns a random number, which is supplied by a
cryptographically secure, deterministic random bit generator (DRBG).

Signed-off-by: Xudong Hao xudong@intel.com
Signed-off-by: Liu Jinsong jinsong@intel.com
---
 arch/x86/kvm/cpuid.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index c697625..abe18b4 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -303,7 +303,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	/* cpuid 7.0.ebx */
 	const u32 kvm_supported_word9_x86_features =
 		F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
-		F(BMI2) | F(ERMS) | f_invpcid | F(RTM);
+		F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | F(RDSEED);

 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();
--
1.7.1
[PATCH] KVM: x86: expose ADX feature to guest
From 0750e335eb5860b0b483e217e8a08bd743cbba16 Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Thu, 20 Feb 2014 17:39:32 +0800
Subject: [PATCH] KVM: x86: expose ADX feature to guest

ADCX and ADOX instructions perform an unsigned addition with Carry flag and
Overflow flag respectively.

Signed-off-by: Xudong Hao xudong@intel.com
Signed-off-by: Liu Jinsong jinsong@intel.com
---
 arch/x86/kvm/cpuid.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index abe18b4..a951ae4 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -303,7 +303,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	/* cpuid 7.0.ebx */
 	const u32 kvm_supported_word9_x86_features =
 		F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
-		F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | F(RDSEED);
+		F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | F(RDSEED) |
+		F(ADX);

 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();
--
1.7.1
[PATCH] KVM: x86: Fix xsave cpuid exposing bug
From 00c920c96127d20d4c3bb790082700ae375c39a0 Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Fri, 21 Feb 2014 23:47:18 +0800
Subject: [PATCH] KVM: x86: Fix xsave cpuid exposing bug

EBX of cpuid(0xD, 0) is dynamic per XCR0 features enable/disable.
Bit 63 of XCR0 is reserved for future expansion.

Signed-off-by: Liu Jinsong jinsong@intel.com
---
 arch/x86/include/asm/xsave.h |    2 ++
 arch/x86/kvm/cpuid.c         |    6 +++---
 arch/x86/kvm/x86.c           |    7 +++++--
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 5547389..dcd047b 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -13,6 +13,8 @@
 #define XSTATE_BNDCSR	0x10

 #define XSTATE_FPSSE	(XSTATE_FP | XSTATE_SSE)
+/* Bit 63 of XCR0 is reserved for future expansion */
+#define XSTATE_EXTEND_MASK	(~(XSTATE_FPSSE | (1ULL << 63)))

 #define FXSAVE_SIZE	512

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index a951ae4..b241325 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -28,7 +28,7 @@ static u32 xstate_required_size(u64 xstate_bv)
 	int feature_bit = 0;
 	u32 ret = XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET;

-	xstate_bv &= ~XSTATE_FPSSE;
+	xstate_bv &= XSTATE_EXTEND_MASK;
 	while (xstate_bv) {
 		if (xstate_bv & 0x1) {
 			u32 eax, ebx, ecx, edx;
@@ -74,8 +74,8 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
 		vcpu->arch.guest_supported_xcr0 =
 			(best->eax | ((u64)best->edx << 32)) &
 			host_xcr0 & KVM_SUPPORTED_XCR0;
-		vcpu->arch.guest_xstate_size =
-			xstate_required_size(vcpu->arch.guest_supported_xcr0);
+		vcpu->arch.guest_xstate_size = best->ebx =
+			xstate_required_size(vcpu->arch.xcr0);
 	}

 	kvm_pmu_cpuid_update(vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 39c28f0..7c52acb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -595,13 +595,13 @@ static void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu)

 int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 {
-	u64 xcr0;
+	u64 xcr0 = xcr;
+	u64 old_xcr0 = vcpu->arch.xcr0;
 	u64 valid_bits;

 	/* Only support XCR_XFEATURE_ENABLED_MASK(xcr0) now */
 	if (index != XCR_XFEATURE_ENABLED_MASK)
 		return 1;
-	xcr0 = xcr;
 	if (!(xcr0 & XSTATE_FP))
 		return 1;
 	if ((xcr0 & XSTATE_YMM) && !(xcr0 & XSTATE_SSE))
@@ -618,6 +618,9 @@ int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)

 	kvm_put_guest_xcr0(vcpu);
 	vcpu->arch.xcr0 = xcr0;
+
+	if ((xcr0 ^ old_xcr0) & XSTATE_EXTEND_MASK)
+		kvm_update_cpuid(vcpu);
 	return 0;
 }

--
1.7.1
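The last hunk only refreshes CPUID when an extended state component changes. A small sketch of that test follows, with the mask defined the same way as XSTATE_EXTEND_MASK in the patch; the function name `demo_needs_cpuid_update` is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_XSTATE_FPSSE	0x3ULL	/* x87 | SSE */

/* Everything except the legacy x87/SSE bits and reserved bit 63. */
#define DEMO_XSTATE_EXTEND_MASK	(~(DEMO_XSTATE_FPSSE | (1ULL << 63)))

/*
 * True when an XCR0 write toggles at least one extended component, which
 * is the case in which cpuid(0xD, 0).EBX has to be recomputed.
 */
bool demo_needs_cpuid_update(uint64_t old_xcr0, uint64_t new_xcr0)
{
	return ((old_xcr0 ^ new_xcr0) & DEMO_XSTATE_EXTEND_MASK) != 0;
}
```

The XOR isolates the bits that differ between the old and new values, and the mask discards changes to the legacy bits, which do not affect the size of the extended region.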
[PATCH v4 0/3] KVM: x86: enable Intel MPX for KVM
These patches are version 4 to enable Intel MPX for KVM.

Version 1:
  * Add some Intel MPX definitions
  * Fix a cpuid(0x0d, 0) exposing bug, dynamic per XCR0 features enable/disable
  * vmx and msr handling for MPX support in KVM
  * enable MPX feature for guest

Version 2:
  * remove the generic MPX definitions; Qiaowei's patch has added the
    definitions on the kernel side
  * add MSR_IA32_BNDCFGS to msrs_to_save

Version 3:
  * rebase on the latest kernel, which includes Qiaowei's common MPX
    definitions pulled from HPA's tree

Version 4:
  * Remove the xsave bugfix patch from this series and send it as a
    standalone patch
  * Add a new kvm_x86_ops member mpx_supported, to disable MPX whenever
    the two VMX controls are not available
  * minor rebase for VMX bit definition

Thanks,
Jinsong
[PATCH v4 1/3] KVM: x86: Intel MPX vmx and msr handle
From eb56f19c14d5603209b22b97cd53ef1716bf2804 Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Sat, 22 Feb 2014 07:53:32 +0800
Subject: [PATCH v4 1/3] KVM: x86: Intel MPX vmx and msr handle

This patch handle vmx and msr of Intel MPX feature.

Signed-off-by: Xudong Hao xudong@intel.com
Signed-off-by: Liu Jinsong jinsong@intel.com
---
 arch/x86/include/asm/kvm_host.h       |    1 +
 arch/x86/include/asm/vmx.h            |    4 ++++
 arch/x86/include/uapi/asm/msr-index.h |    1 +
 arch/x86/kvm/vmx.c                    |   18 ++++++++++++++++--
 4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fdf83af..e605b71 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -765,6 +765,7 @@ struct kvm_x86_ops {
 			       struct x86_instruction_info *info,
 			       enum x86_intercept_stage stage);
 	void (*handle_external_intr)(struct kvm_vcpu *vcpu);
+	int (*mpx_supported)(void);
 };

 struct kvm_arch_async_pf {
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 2067264..7004d21 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -85,6 +85,7 @@
 #define VM_EXIT_SAVE_IA32_EFER                  0x00100000
 #define VM_EXIT_LOAD_IA32_EFER                  0x00200000
 #define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER       0x00400000
+#define VM_EXIT_CLEAR_BNDCFGS                   0x00800000

 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff

@@ -95,6 +96,7 @@
 #define VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL     0x00002000
 #define VM_ENTRY_LOAD_IA32_PAT			0x00004000
 #define VM_ENTRY_LOAD_IA32_EFER                 0x00008000
+#define VM_ENTRY_LOAD_BNDCFGS                   0x00010000

 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR	0x000011ff

@@ -174,6 +176,8 @@ enum vmcs_field {
 	GUEST_PDPTR2_HIGH               = 0x0000280f,
 	GUEST_PDPTR3                    = 0x00002810,
 	GUEST_PDPTR3_HIGH               = 0x00002811,
+	GUEST_BNDCFGS                   = 0x00002812,
+	GUEST_BNDCFGS_HIGH              = 0x00002813,
 	HOST_IA32_PAT			= 0x00002c00,
 	HOST_IA32_PAT_HIGH		= 0x00002c01,
 	HOST_IA32_EFER			= 0x00002c02,
diff --git a/arch/x86/include/uapi/asm/msr-index.h b/arch/x86/include/uapi/asm/msr-index.h
index c19fc60..ed821ed 100644
--- a/arch/x86/include/uapi/asm/msr-index.h
+++ b/arch/x86/include/uapi/asm/msr-index.h
@@ -295,6 +295,7 @@
 #define MSR_SMI_COUNT			0x00000034
 #define MSR_IA32_FEATURE_CONTROL        0x0000003a
 #define MSR_IA32_TSC_ADJUST             0x0000003b
+#define MSR_IA32_BNDCFGS		0x00000d90

 #define FEATURE_CONTROL_LOCKED				(1<<0)
 #define FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX	(1<<1)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a06f101..35d285f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -441,6 +441,7 @@ struct vcpu_vmx {
 #endif
 		int           gs_ldt_reload_needed;
 		int           fs_reload_needed;
+		u64           msr_host_bndcfgs;
 	} host_state;
 	struct {
 		int vm86_active;
@@ -1710,6 +1711,8 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 	if (is_long_mode(&vmx->vcpu))
 		wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
 #endif
+	if (boot_cpu_has(X86_FEATURE_MPX))
+		rdmsrl(MSR_IA32_BNDCFGS, vmx->host_state.msr_host_bndcfgs);
 	for (i = 0; i < vmx->save_nmsrs; ++i)
 		kvm_set_shared_msr(vmx->guest_msrs[i].index,
 				   vmx->guest_msrs[i].data,
@@ -1747,6 +1750,8 @@ static void __vmx_load_host_state(struct vcpu_vmx *vmx)
 #ifdef CONFIG_X86_64
 	wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_host_kernel_gs_base);
 #endif
+	if (vmx->host_state.msr_host_bndcfgs)
+		wrmsrl(MSR_IA32_BNDCFGS, vmx->host_state.msr_host_bndcfgs);
 	/*
 	 * If the FPU is not active (through the host task or
 	 * the guest vcpu), then restore the cr0.TS bit.
@@ -2837,7 +2842,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 	min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
 #endif
 	opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT |
-		VM_EXIT_ACK_INTR_ON_EXIT;
+		VM_EXIT_ACK_INTR_ON_EXIT | VM_EXIT_CLEAR_BNDCFGS;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
 				&_vmexit_control) < 0)
 		return -EIO;
@@ -2854,7 +2859,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 		_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;

 	min = 0;
-	opt = VM_ENTRY_LOAD_IA32_PAT;
+	opt = VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS
[PATCH v4 2/3] KVM: x86: add MSR_IA32_BNDCFGS to msrs_to_save
From 7d1b41c3fdf71e4c73280e117948102f54f74be7 Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Sat, 22 Feb 2014 08:10:17 +0800
Subject: [PATCH v4 2/3] KVM: x86: add MSR_IA32_BNDCFGS to msrs_to_save

Add MSR_IA32_BNDCFGS to msrs_to_save, and corresponding logic
to kvm_get/set_msr().

Signed-off-by: Liu Jinsong jinsong@intel.com
---
 arch/x86/kvm/vmx.c |    6 ++++++
 arch/x86/kvm/x86.c |    2 +-
 2 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 35d285f..2839c2f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2484,6 +2484,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
 	case MSR_IA32_SYSENTER_ESP:
 		data = vmcs_readl(GUEST_SYSENTER_ESP);
 		break;
+	case MSR_IA32_BNDCFGS:
+		data = vmcs_readl(GUEST_BNDCFGS);
+		break;
 	case MSR_IA32_FEATURE_CONTROL:
 		if (!nested_vmx_allowed(vcpu))
 			return 1;
@@ -2552,6 +2555,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_SYSENTER_ESP:
 		vmcs_writel(GUEST_SYSENTER_ESP, data);
 		break;
+	case MSR_IA32_BNDCFGS:
+		vmcs_writel(GUEST_BNDCFGS, data);
+		break;
 	case MSR_IA32_TSC:
 		kvm_write_tsc(vcpu, msr_info);
 		break;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7c52acb..89e4e27 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -882,7 +882,7 @@ static u32 msrs_to_save[] = {
 	MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
 #endif
 	MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
-	MSR_IA32_FEATURE_CONTROL
+	MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS
 };

 static unsigned num_msrs_to_save;
--
1.7.1
[PATCH v4 3/3] KVM: x86: Enable Intel MPX for guest
From 8b3a3b1f08c166e0c2cdc6162e6fa95d9c7ad2ec Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Sat, 22 Feb 2014 08:53:27 +0800
Subject: [PATCH v4 3/3] KVM: x86: Enable Intel MPX for guest

This patch enable Intel MPX feature to guest.

Signed-off-by: Xudong Hao xudong@intel.com
Signed-off-by: Liu Jinsong jinsong@intel.com
---
 arch/x86/kvm/cpuid.c |    2 +-
 arch/x86/kvm/x86.c   |    6 ++++++
 arch/x86/kvm/x86.h   |    3 ++-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b241325..b377d83 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -303,7 +303,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	/* cpuid 7.0.ebx */
 	const u32 kvm_supported_word9_x86_features =
 		F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
-		F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | F(RDSEED) |
+		F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | F(MPX) | F(RDSEED) |
 		F(ADX);

 	/* all calls to cpuid_count() should be made on the same cpu */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 89e4e27..3570e71 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -599,6 +599,9 @@ int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 	u64 old_xcr0 = vcpu->arch.xcr0;
 	u64 valid_bits;

+	if (!kvm_x86_ops->mpx_supported || !kvm_x86_ops->mpx_supported())
+		xcr0 &= ~(XSTATE_BNDREGS | XSTATE_BNDCSR);
+
 	/* Only support XCR_XFEATURE_ENABLED_MASK(xcr0) now */
 	if (index != XCR_XFEATURE_ENABLED_MASK)
 		return 1;
@@ -616,6 +619,9 @@ int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 	if (xcr0 & ~valid_bits)
 		return 1;

+	if ((!(xcr0 & XSTATE_BNDREGS)) != (!(xcr0 & XSTATE_BNDCSR)))
+		return 1;
+
 	kvm_put_guest_xcr0(vcpu);
 	vcpu->arch.xcr0 = xcr0;

diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 8da5823..392ecbf 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -122,7 +122,8 @@ int kvm_write_guest_virt_system(struct x86_emulate_ctxt *ctxt,
 	gva_t addr, void *val, unsigned int bytes,
 	struct x86_exception *exception);

-#define KVM_SUPPORTED_XCR0	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+#define KVM_SUPPORTED_XCR0	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM \
+				| XSTATE_BNDREGS | XSTATE_BNDCSR)
 extern u64 host_xcr0;

 extern unsigned int min_timer_period_us;
--
1.7.1
RE: [PATCH v3 0/4] X86/KVM: enable Intel MPX for KVM
Paolo Bonzini wrote:
> On 22/01/2014 13:03, Paolo Bonzini wrote:
>> On 22/01/2014 06:29, Liu, Jinsong wrote:
>>> These patches are version 3 to enable Intel MPX for KVM.
>>>
>>> Version 1:
>>>   * Add some Intel MPX definitions
>>>   * Fix a cpuid(0x0d, 0) exposing bug, dynamic per XCR0 features
>>>     enable/disable
>>>   * vmx and msr handling for MPX support in KVM
>>>   * enable MPX feature for guest
>>>
>>> Version 2:
>>>   * remove the generic MPX definitions; Qiaowei's patch has added
>>>     the definitions on the kernel side
>>>   * add MSR_IA32_BNDCFGS to msrs_to_save
>>>
>>> Version 3:
>>>   * rebase on the latest kernel, which includes Qiaowei's common MPX
>>>     definitions pulled from HPA's tree
>>
>> I am afraid there is still some work to do on these patches, so they
>> need to be delayed to 3.15.
>>
>> Patch 1: this seems mostly separate from the rest of the MPX work.
>> I commented on the missing ULL suffix, but I would also like to
>> understand why you put this patch in this series.
>>
>> Patch 2: As remarked in the reply to this patch:
>> - the vmx_disable_intercept_for_msr has to be unconditional
>> - you need a new kvm_x86_ops member mpx_supported, to disable MPX
>>   whenever the two VMX controls are not available.
>>
>> Patch 3: this patch needs to be rebased. Apart from that it is fine,
>> but please move the VMX bits together with patch 2, and the other
>> bits together with patch 4.
>>
>> Patch 4: this patch needs to be rebased and to use the new
>> mpx_supported member
>>
>> If you also want to look at nested VMX support for MPX, that would be
>> nice. It should not be hard. Otherwise we can take care of that
>> later.
>>
>> Thanks for your work,
>>
>> Paolo
>
> Are you going to send v4?
>
> Paolo

Yes, I have just sent out the v4 patches per your comments.

Thanks,
Jinsong
RE: [PATCH v3 0/4] X86/KVM: enable Intel MPX for KVM
Paolo Bonzini wrote:
> On 22/01/2014 13:03, Paolo Bonzini wrote:
>> On 22/01/2014 06:29, Liu, Jinsong wrote:
>>> These patches are version 3 to enable Intel MPX for KVM.
>>>
>>> Version 1:
>>>   * Add some Intel MPX definitions
>>>   * Fix a cpuid(0x0d, 0) exposing bug, dynamic per XCR0 features
>>>     enable/disable
>>>   * vmx and msr handling for MPX support in KVM
>>>   * enable MPX feature for guest
>>>
>>> Version 2:
>>>   * remove the generic MPX definitions; Qiaowei's patch has added
>>>     the definitions on the kernel side
>>>   * add MSR_IA32_BNDCFGS to msrs_to_save
>>>
>>> Version 3:
>>>   * rebase on the latest kernel, which includes Qiaowei's common MPX
>>>     definitions pulled from HPA's tree
>>
>> I am afraid there is still some work to do on these patches, so they
>> need to be delayed to 3.15.
>>
>> Patch 1: this seems mostly separate from the rest of the MPX work.
>> I commented on the missing ULL suffix, but I would also like to
>> understand why you put this patch in this series.
>>
>> Patch 2: As remarked in the reply to this patch:
>> - the vmx_disable_intercept_for_msr has to be unconditional
>> - you need a new kvm_x86_ops member mpx_supported, to disable MPX
>>   whenever the two VMX controls are not available.
>>
>> Patch 3: this patch needs to be rebased. Apart from that it is fine,
>> but please move the VMX bits together with patch 2, and the other
>> bits together with patch 4.
>>
>> Patch 4: this patch needs to be rebased and to use the new
>> mpx_supported member
>>
>> If you also want to look at nested VMX support for MPX, that would be
>> nice. It should not be hard. Otherwise we can take care of that
>> later.
>>
>> Thanks for your work,
>>
>> Paolo
>
> Are you going to send v4?
>
> Paolo

Yes. I have just returned from the long Chinese Spring Festival
holiday; I will send v4 later.

Thanks,
Jinsong
RE: [PATCH v2 0/4] X86/KVM: enable Intel MPX for KVM
Paolo Bonzini wrote:
> On 12/12/2013 12:09, Liu, Jinsong wrote:
>> Paolo Bonzini wrote:
>>> On 12/12/2013 06:47, Liu, Jinsong wrote:
>>>> Paolo Bonzini wrote:
>>>>> On 11/12/2013 09:31, Liu, Jinsong wrote:
>>>>>> Paolo, comments for version 2?
>>>>>
>>>>> I think I commented that it's fine, I'm just waiting for a rebase on top of the generic patches.
>>>>>
>>>>> Paolo
>>>>
>>>> Thanks! The common MPX definition patches have been checked in to the tip tree (both Qiaowei and I use those definitions):
>>>> http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=191f57c137bcce0e3e9313acb77b2f114d15afbb
>>>> http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=e7d820a5e549b3eb6c3f9467507566565646a669
>>>
>>> Ok, can you rebase and resend?
>>>
>>> Paolo
>>
>> Sure, I have pulled and rebased on
>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
>> No conflicts; v3 patches will be sent out later.
>
> Didn't see this... You still have a couple of days to send it out and address the review remarks.
>
> Paolo

Hmm? I remember sending out the rebased v3 patches last month ... If you didn't receive them I'm OK to rebase and resend them.

BTW, what are the review remarks? I remember you commented that the patches are fine. If I have misunderstood anything, please point it out to me.

Thanks,
Jinsong
RE: [PATCH v2 0/4] X86/KVM: enable Intel MPX for KVM
Paolo Bonzini wrote:
> On 21/01/2014 16:25, Liu, Jinsong wrote:
>> Hmm? I remember sending out the rebased v3 patches last month. If you didn't receive them I'm OK to rebase and resend them.
>>
>> BTW, what are the review remarks? I remember you commented that the patches are fine. If I have misunderstood anything, please point it out to me.
>
> You sent v3 of QEMU, but not of KVM. I don't see any mail from you after December 12th on kvm@vger.kernel.org.
>
> I can see my comment that the patches were fine (apart from needing a rebase), but the threading was wrong and I cannot find anymore _which_ patches they were. I did find a comment that BNDCFGS must be added to msrs_to_save, but I don't know if that was v1 or v2 because you didn't add the version number when sending v2.
>
> Paolo

So I don't need to resend the qemu patches, just rebase and resend the KVM MPX patches, is that right?

Thanks,
Jinsong
[PATCH v3 1/4] KVM/X86: Fix xsave cpuid exposing bug
From 3155a190ce6ebb213e6c724240f4e6620ba67a9d Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Fri, 13 Dec 2013 02:32:03 +0800
Subject: [PATCH v3 1/4] KVM/X86: Fix xsave cpuid exposing bug

EBX of cpuid(0xD, 0) is dynamic per XCR0 features enable/disable.
Bit 63 of XCR0 is reserved for future expansion.

Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 arch/x86/include/asm/xsave.h |    2 ++
 arch/x86/kvm/cpuid.c         |    6 +++---
 arch/x86/kvm/x86.c           |    7 +++++--
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 5547389..f6c4e85 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -13,6 +13,8 @@
 #define XSTATE_BNDCSR	0x10
 
 #define XSTATE_FPSSE	(XSTATE_FP | XSTATE_SSE)
+/* Bit 63 of XCR0 is reserved for future expansion */
+#define XSTATE_EXTEND_MASK	(~(XSTATE_FPSSE | (1 << 63)))
 
 #define FXSAVE_SIZE	512
 
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index c697625..2d661e6 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -28,7 +28,7 @@ static u32 xstate_required_size(u64 xstate_bv)
 	int feature_bit = 0;
 	u32 ret = XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET;
 
-	xstate_bv &= ~XSTATE_FPSSE;
+	xstate_bv &= XSTATE_EXTEND_MASK;
 	while (xstate_bv) {
 		if (xstate_bv & 0x1) {
 			u32 eax, ebx, ecx, edx;
@@ -74,8 +74,8 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
 		vcpu->arch.guest_supported_xcr0 =
 			(best->eax | ((u64)best->edx << 32)) &
 			host_xcr0 & KVM_SUPPORTED_XCR0;
-		vcpu->arch.guest_xstate_size =
-			xstate_required_size(vcpu->arch.guest_supported_xcr0);
+		vcpu->arch.guest_xstate_size = best->ebx =
+			xstate_required_size(vcpu->arch.xcr0);
 	}
 
 	kvm_pmu_cpuid_update(vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 21ef1ba..1657ca2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -576,13 +576,13 @@ static void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu)
 
 int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 {
-	u64 xcr0;
+	u64 xcr0 = xcr;
+	u64 old_xcr0 = vcpu->arch.xcr0;
 	u64 valid_bits;
 
 	/* Only support XCR_XFEATURE_ENABLED_MASK(xcr0) now */
 	if (index != XCR_XFEATURE_ENABLED_MASK)
 		return 1;
-	xcr0 = xcr;
 	if (!(xcr0 & XSTATE_FP))
 		return 1;
 	if ((xcr0 & XSTATE_YMM) && !(xcr0 & XSTATE_SSE))
@@ -599,6 +599,9 @@ int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 
 	kvm_put_guest_xcr0(vcpu);
 	vcpu->arch.xcr0 = xcr0;
+
+	if ((xcr0 ^ old_xcr0) & XSTATE_EXTEND_MASK)
+		kvm_update_cpuid(vcpu);
 	return 0;
 }
 
--
1.7.1
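For readers following the size calculation in this patch, here is a minimal user-space sketch of the idea behind xstate_required_size(): mask off FP/SSE and the reserved bit 63, then take the largest offset + size among the enabled extended components. It is not the kernel code. The demo_* names and the lookup table are hypothetical stand-ins for the cpuid(0xD, n) query, and the offsets and sizes are the ones hardcoded in the QEMU ext_save_areas patches in this thread (0x40 for component 4, per the later bugfix). The sketch uses 1ULL for the bit-63 constant, which is what the "missing ULL suffix" review remark is about.

```c
#include <stdint.h>

#define XSTATE_FP        (1ULL << 0)
#define XSTATE_SSE       (1ULL << 1)
#define XSTATE_FPSSE     (XSTATE_FP | XSTATE_SSE)
/* 1ULL keeps the shift in 64-bit arithmetic; shifting a plain int by 63 is not valid C. */
#define XSTATE_EXTEND_MASK (~(XSTATE_FPSSE | (1ULL << 63)))

#define XSAVE_HDR_OFFSET 512u   /* legacy FXSAVE area */
#define XSAVE_HDR_SIZE   64u    /* XSAVE header */

struct demo_xsave_area {
    uint32_t offset;
    uint32_t size;
};

/* Hypothetical table standing in for cpuid(0xD, n): EBX = offset, EAX = size. */
static const struct demo_xsave_area demo_areas[64] = {
    [2] = { 0x240, 0x100 },  /* YMM high halves */
    [3] = { 0x3c0, 0x40 },   /* MPX bound registers */
    [4] = { 0x400, 0x40 },   /* MPX BNDCSR */
};

/* Return the XSAVE area size needed for the features set in xstate_bv. */
static uint32_t demo_xstate_required_size(uint64_t xstate_bv)
{
    uint32_t ret = XSAVE_HDR_OFFSET + XSAVE_HDR_SIZE;
    int bit = 0;

    xstate_bv &= XSTATE_EXTEND_MASK;
    while (xstate_bv) {
        if (xstate_bv & 1) {
            uint32_t end = demo_areas[bit].offset + demo_areas[bit].size;
            if (end > ret)
                ret = end;
        }
        xstate_bv >>= 1;
        bit++;
    }
    return ret;
}
```

Under these assumed table values, enabling only FP and SSE gives the 576-byte minimum, while enabling the two MPX components as well extends the area to 0x440 bytes. That is why the guest-visible EBX has to be recomputed whenever the guest changes XCR0.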
[PATCH v3 4/4] KVM/X86: Enable Intel MPX for guest
From c2b3b4347b4c8b0aa6b5e97c161fd4d34b0ef4d3 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Fri, 13 Dec 2013 02:34:48 +0800
Subject: [PATCH v3 4/4] KVM/X86: Enable Intel MPX for guest.

This patch enables the Intel MPX feature for the guest.

Signed-off-by: Xudong Hao <xudong@intel.com>
Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 arch/x86/kvm/cpuid.c |    2 +-
 arch/x86/kvm/x86.c   |    3 +++
 arch/x86/kvm/x86.h   |    3 ++-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 2d661e6..e30d4ce 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -303,7 +303,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	/* cpuid 7.0.ebx */
 	const u32 kvm_supported_word9_x86_features =
 		F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
-		F(BMI2) | F(ERMS) | f_invpcid | F(RTM);
+		F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | F(MPX);
 
 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8ca2269..410421a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -597,6 +597,9 @@ int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 	if (xcr0 & ~valid_bits)
 		return 1;
 
+	if ((!(xcr0 & XSTATE_BNDREGS)) != (!(xcr0 & XSTATE_BNDCSR)))
+		return 1;
+
 	kvm_put_guest_xcr0(vcpu);
 	vcpu->arch.xcr0 = xcr0;
 
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 587fb9e..985e40e 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -122,7 +122,8 @@ int kvm_write_guest_virt_system(struct x86_emulate_ctxt *ctxt,
 	gva_t addr, void *val, unsigned int bytes,
 	struct x86_exception *exception);
 
-#define KVM_SUPPORTED_XCR0	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+#define KVM_SUPPORTED_XCR0	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM \
+				| XSTATE_BNDREGS | XSTATE_BNDCSR)
 extern u64 host_xcr0;
 
 extern struct static_key kvm_no_apic_vcpu;
--
1.7.1
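The XCR0 checks that __kvm_set_xcr() ends up with after this patch can be summarised in a small stand-alone sketch. This is only an illustration of the rules visible in the hunks above (FP mandatory, YMM requires SSE, no bits outside the supported set, BNDREGS and BNDCSR set or cleared together). The demo_* names are hypothetical, and the valid-bits mask is passed in as a parameter rather than derived from the guest CPUID as KVM does.

```c
#include <stdbool.h>
#include <stdint.h>

#define XSTATE_FP      (1ULL << 0)
#define XSTATE_SSE     (1ULL << 1)
#define XSTATE_YMM     (1ULL << 2)
#define XSTATE_BNDREGS (1ULL << 3)
#define XSTATE_BNDCSR  (1ULL << 4)

/* Assumed supported set, mirroring KVM_SUPPORTED_XCR0 after this patch. */
#define DEMO_SUPPORTED_XCR0 \
    (XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_BNDREGS | XSTATE_BNDCSR)

/* Return true if the guest may load this value into XCR0. */
static bool demo_xcr0_is_valid(uint64_t xcr0, uint64_t valid_bits)
{
    if (!(xcr0 & XSTATE_FP))
        return false;                 /* x87 state is mandatory */
    if ((xcr0 & XSTATE_YMM) && !(xcr0 & XSTATE_SSE))
        return false;                 /* AVX state depends on SSE state */
    if (xcr0 & ~valid_bits)
        return false;                 /* unsupported feature requested */
    /* The two MPX components must be enabled or disabled as a pair. */
    if (!(xcr0 & XSTATE_BNDREGS) != !(xcr0 & XSTATE_BNDCSR))
        return false;
    return true;
}
```

The double negation normalises each masked value to 0 or 1 before comparing, so the test rejects exactly the two "half enabled" combinations.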
[PATCH v3 2/4] KVM/X86: Intel MPX vmx and msr handle
From 31e68d752ac395dc6b65e6adf45be5324e92cdc8 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Fri, 13 Dec 2013 02:32:43 +0800
Subject: [PATCH v3 2/4] KVM/X86: Intel MPX vmx and msr handle

This patch handles the VMX controls and the MSR of the Intel MPX feature.

Signed-off-by: Xudong Hao <xudong@intel.com>
Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 arch/x86/include/asm/vmx.h            |    2 ++
 arch/x86/include/uapi/asm/msr-index.h |    1 +
 arch/x86/kvm/vmx.c                    |   12 ++++++++++--
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 966502d..1bf4681 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -85,6 +85,7 @@
 #define VM_EXIT_SAVE_IA32_EFER                  0x00100000
 #define VM_EXIT_LOAD_IA32_EFER                  0x00200000
 #define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER       0x00400000
+#define VM_EXIT_CLEAR_BNDCFGS                   0x00800000
 
 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff
 
@@ -95,6 +96,7 @@
 #define VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL     0x00002000
 #define VM_ENTRY_LOAD_IA32_PAT			0x00004000
 #define VM_ENTRY_LOAD_IA32_EFER                 0x00008000
+#define VM_ENTRY_LOAD_BNDCFGS                   0x00010000
 
 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR	0x000011ff
 
diff --git a/arch/x86/include/uapi/asm/msr-index.h b/arch/x86/include/uapi/asm/msr-index.h
index 37813b5..2a418c4 100644
--- a/arch/x86/include/uapi/asm/msr-index.h
+++ b/arch/x86/include/uapi/asm/msr-index.h
@@ -294,6 +294,7 @@
 #define MSR_SMI_COUNT			0x00000034
 #define MSR_IA32_FEATURE_CONTROL        0x0000003a
 #define MSR_IA32_TSC_ADJUST             0x0000003b
+#define MSR_IA32_BNDCFGS		0x00000d90
 
 #define FEATURE_CONTROL_LOCKED				(1<<0)
 #define FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX	(1<<1)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b2fe1c2..6d7d9ad 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -439,6 +439,7 @@ struct vcpu_vmx {
 #endif
 		int           gs_ldt_reload_needed;
 		int           fs_reload_needed;
+		u64           msr_host_bndcfgs;
 	} host_state;
 	struct {
 		int vm86_active;
@@ -1647,6 +1648,8 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 	if (is_long_mode(&vmx->vcpu))
 		wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
 #endif
+	if (boot_cpu_has(X86_FEATURE_MPX))
+		rdmsrl(MSR_IA32_BNDCFGS, vmx->host_state.msr_host_bndcfgs);
 	for (i = 0; i < vmx->save_nmsrs; ++i)
 		kvm_set_shared_msr(vmx->guest_msrs[i].index,
 				   vmx->guest_msrs[i].data,
@@ -1684,6 +1687,8 @@ static void __vmx_load_host_state(struct vcpu_vmx *vmx)
 #ifdef CONFIG_X86_64
 	wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_host_kernel_gs_base);
 #endif
+	if (vmx->host_state.msr_host_bndcfgs)
+		wrmsrl(MSR_IA32_BNDCFGS, vmx->host_state.msr_host_bndcfgs);
 	/*
 	 * If the FPU is not active (through the host task or
 	 * the guest vcpu), then restore the cr0.TS bit.
@@ -2800,7 +2805,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 	min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
 #endif
 	opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT |
-		VM_EXIT_ACK_INTR_ON_EXIT;
+		VM_EXIT_ACK_INTR_ON_EXIT | VM_EXIT_CLEAR_BNDCFGS;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
 				&_vmexit_control) < 0)
 		return -EIO;
@@ -2817,7 +2822,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 		_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
 
 	min = 0;
-	opt = VM_ENTRY_LOAD_IA32_PAT;
+	opt = VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
 				&_vmentry_control) < 0)
 		return -EIO;
@@ -8636,6 +8641,9 @@ static int __init vmx_init(void)
 	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_CS, false);
 	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
 	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
+	if (boot_cpu_has(X86_FEATURE_MPX))
+		vmx_disable_intercept_for_msr(MSR_IA32_BNDCFGS, true);
+
 	memcpy(vmx_msr_bitmap_legacy_x2apic,
 			vmx_msr_bitmap_legacy, PAGE_SIZE);
 	memcpy(vmx_msr_bitmap_longmode_x2apic,
--
1.7.1
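Because the patch adds the two BNDCFGS controls to the "opt" masks rather than to "min", it may help to spell out how an optional control is negotiated against a VMX capability MSR. The sketch below is a user-space approximation of what a helper like adjust_vmx_controls() does, assuming the usual layout in which the low 32 bits of the capability MSR list the bits that must be 1 and the high 32 bits list the bits that may be 1. demo_adjust_controls is a hypothetical name, the capability value is passed in instead of being read with rdmsr, and the error return is a plain -1 where the patch context above uses -EIO.

```c
#include <stdint.h>

#define VM_ENTRY_LOAD_IA32_PAT 0x00004000u
#define VM_ENTRY_LOAD_BNDCFGS  0x00010000u

/*
 * Combine required (min) and optional (opt) control bits with what the
 * CPU reports in cap_msr. Returns 0 and stores the usable control word
 * in *result, or -1 if a required bit is not supported.
 */
static int demo_adjust_controls(uint32_t min, uint32_t opt,
                                uint64_t cap_msr, uint32_t *result)
{
    uint32_t must_be_one = (uint32_t)cap_msr;         /* low word */
    uint32_t may_be_one  = (uint32_t)(cap_msr >> 32); /* high word */
    uint32_t ctl = min | opt;

    ctl &= may_be_one;   /* silently drop optional bits the CPU lacks */
    ctl |= must_be_one;  /* force bits the CPU insists on */

    if (min & ~ctl)
        return -1;       /* a required control is unavailable */

    *result = ctl;
    return 0;
}
```

With this scheme an optional bit simply disappears from the result on hardware that lacks it, so callers have to test the resulting control word before exposing the feature. That is the situation the review remark about a new mpx_supported member addresses.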
[PATCH v3 3/4] KVM/X86: add MSR_IA32_BNDCFGS to msrs_to_save
From d1992769911f34cb319fe638d32ae604bd2a6ce8 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Fri, 13 Dec 2013 02:33:08 +0800
Subject: [PATCH v3 3/4] KVM/X86: add MSR_IA32_BNDCFGS to msrs_to_save

Add MSR_IA32_BNDCFGS to msrs_to_save, and corresponding logic
to kvm_get/set_msr().

Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 arch/x86/include/asm/vmx.h |    2 ++
 arch/x86/kvm/vmx.c         |    6 ++++++
 arch/x86/kvm/x86.c         |    2 +-
 3 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 1bf4681..47b4c88 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -175,6 +175,8 @@ enum vmcs_field {
 	GUEST_PDPTR2_HIGH               = 0x0000280f,
 	GUEST_PDPTR3                    = 0x00002810,
 	GUEST_PDPTR3_HIGH               = 0x00002811,
+	GUEST_BNDCFGS                   = 0x00002812,
+	GUEST_BNDCFGS_HIGH              = 0x00002813,
 	HOST_IA32_PAT			= 0x00002c00,
 	HOST_IA32_PAT_HIGH		= 0x00002c01,
 	HOST_IA32_EFER			= 0x00002c02,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6d7d9ad..a6a7556 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2465,6 +2465,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
 	case MSR_IA32_SYSENTER_ESP:
 		data = vmcs_readl(GUEST_SYSENTER_ESP);
 		break;
+	case MSR_IA32_BNDCFGS:
+		data = vmcs_readl(GUEST_BNDCFGS);
+		break;
 	case MSR_TSC_AUX:
 		if (!to_vmx(vcpu)->rdtscp_enabled)
 			return 1;
@@ -2524,6 +2527,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_SYSENTER_ESP:
 		vmcs_writel(GUEST_SYSENTER_ESP, data);
 		break;
+	case MSR_IA32_BNDCFGS:
+		vmcs_writel(GUEST_BNDCFGS, data);
+		break;
 	case MSR_IA32_TSC:
 		kvm_write_tsc(vcpu, msr_info);
 		break;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1657ca2..8ca2269 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -852,7 +852,7 @@ static u32 msrs_to_save[] = {
 	MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
 #endif
 	MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
-	MSR_IA32_FEATURE_CONTROL
+	MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS
 };
 
 static unsigned num_msrs_to_save;
--
1.7.1
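Adding an MSR to msrs_to_save does not by itself mean every host reports it to userspace, because the list is pruned at init time. As a rough illustration only, here is a stand-alone sketch of that in-place pruning, including the extra MPX condition that the later "Work around buggy MPX platform" patch in this thread introduces. The demo_* names, the probe callback, and the boolean mpx_supported parameter are hypothetical; the kernel probes each MSR with rdmsr_safe() and consults kvm_x86_ops instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_IA32_TSC     0x00000010u
#define MSR_IA32_CR_PAT  0x00000277u
#define MSR_IA32_BNDCFGS 0x00000d90u

/* Hypothetical probe: returns true if the host can read this MSR. */
typedef bool (*demo_msr_probe_fn)(uint32_t msr);

/*
 * Compact msrs[] in place, keeping only entries that are readable and,
 * for BNDCFGS, only when MPX is usable. Returns the new entry count.
 */
static size_t demo_filter_msrs(uint32_t *msrs, size_t n,
                               demo_msr_probe_fn readable,
                               bool mpx_supported)
{
    size_t i, j = 0;

    for (i = 0; i < n; i++) {
        if (!readable(msrs[i]))
            continue;
        if (msrs[i] == MSR_IA32_BNDCFGS && !mpx_supported)
            continue;
        if (j < i)
            msrs[j] = msrs[i];   /* slide the survivor down */
        j++;
    }
    return j;
}

/* Two toy probes for experimenting with the filter. */
static bool demo_all_readable(uint32_t msr)
{
    (void)msr;
    return true;
}

static bool demo_no_pat(uint32_t msr)
{
    return msr != MSR_IA32_CR_PAT;
}
```

On the QEMU side, has_msr_bndcfgs is set only if the MSR survives into the list that KVM reports, so the pruned list is what decides whether BNDCFGS is saved and restored for a guest.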
[PATCH v3 0/4] X86/KVM: enable Intel MPX for KVM
These patches are version 3 to enable Intel MPX for KVM.

Version 1:
  * Add some Intel MPX definitions
  * Fix a cpuid(0x0d, 0) exposing bug, dynamic per XCR0 features enable/disable
  * vmx and msr handling for MPX support at KVM
  * enable MPX feature for guest

Version 2:
  * remove generic MPX definitions; Qiaowei's patch has added the definitions at kernel side
  * add MSR_IA32_BNDCFGS to msrs_to_save

Version 3:
  * rebase on latest kernel, which includes Qiaowei's MPX common definitions pulled from HPA's tree

Thanks,
Jinsong
RE: [PATCH v2 0/4] X86/KVM: enable Intel MPX for KVM
Paolo Bonzini wrote:
> On 12/12/2013 06:47, Liu, Jinsong wrote:
>> Paolo Bonzini wrote:
>>> On 11/12/2013 09:31, Liu, Jinsong wrote:
>>>> Paolo, comments for version 2?
>>>
>>> I think I commented that it's fine, I'm just waiting for a rebase on top of the generic patches.
>>>
>>> Paolo
>>
>> Thanks! The common MPX definition patches have been checked in to the tip tree (both Qiaowei and I use those definitions):
>> http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=191f57c137bcce0e3e9313acb77b2f114d15afbb
>> http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=e7d820a5e549b3eb6c3f9467507566565646a669
>
> Ok, can you rebase and resend?
>
> Paolo

Sure, I have pulled and rebased on
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
No conflicts; v3 patches will be sent out later.

Thanks,
Jinsong
RE: [PATCH v2 0/4] X86/KVM: enable Intel MPX for KVM
Paolo, comments for version 2?

Thanks,
Jinsong

Liu, Jinsong wrote:
> These patches are version 2 to enable Intel MPX for KVM.
>
> Version 1:
>   * Add some Intel MPX definitions
>   * Fix a cpuid(0x0d, 0) exposing bug, dynamic per XCR0 features enable/disable
>   * vmx and msr handling for MPX support at KVM
>   * enable MPX feature for guest
>
> Version 2:
>   * remove generic MPX definitions; the kernel side has added the definitions
>   * add MSR_IA32_BNDCFGS to msrs_to_save
>
> Thanks,
> Jinsong
>
> Liu Jinsong (4):
>   KVM/X86: Fix xsave cpuid exposing bug
>   KVM/X86: Intel MPX vmx and msr handle
>   KVM/X86: add MSR_IA32_BNDCFGS to msrs_to_save
>   KVM/X86: Enable Intel MPX for guest.
>
>  arch/x86/include/asm/vmx.h            |    4 ++++
>  arch/x86/include/asm/xsave.h          |    2 ++
>  arch/x86/include/uapi/asm/msr-index.h |    1 +
>  arch/x86/kvm/cpuid.c                  |    8 ++++----
>  arch/x86/kvm/vmx.c                    |   18 ++++++++++++++++--
>  arch/x86/kvm/x86.c                    |   12 +++++++++---
>  arch/x86/kvm/x86.h                    |    3 ++-
>  7 files changed, 38 insertions(+), 10 deletions(-)
RE: [PATCH v2 0/4] X86/KVM: enable Intel MPX for KVM
Paolo Bonzini wrote:
> On 11/12/2013 09:31, Liu, Jinsong wrote:
>> Paolo, comments for version 2?
>
> I think I commented that it's fine, I'm just waiting for a rebase on top of the generic patches.
>
> Paolo

Thanks! The common MPX definition patches have been checked in to the tip tree (both Qiaowei and I use those definitions):
http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=191f57c137bcce0e3e9313acb77b2f114d15afbb
http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=e7d820a5e549b3eb6c3f9467507566565646a669

Jinsong

Liu, Jinsong wrote:
> These patches are version 2 to enable Intel MPX for KVM.
>
> Version 1:
>   * Add some Intel MPX definitions
>   * Fix a cpuid(0x0d, 0) exposing bug, dynamic per XCR0 features enable/disable
>   * vmx and msr handling for MPX support at KVM
>   * enable MPX feature for guest
>
> Version 2:
>   * remove generic MPX definitions; the kernel side has added the definitions
>   * add MSR_IA32_BNDCFGS to msrs_to_save
>
> Thanks,
> Jinsong
>
> Liu Jinsong (4):
>   KVM/X86: Fix xsave cpuid exposing bug
>   KVM/X86: Intel MPX vmx and msr handle
>   KVM/X86: add MSR_IA32_BNDCFGS to msrs_to_save
>   KVM/X86: Enable Intel MPX for guest.
>
>  arch/x86/include/asm/vmx.h            |    4 ++++
>  arch/x86/include/asm/xsave.h          |    2 ++
>  arch/x86/include/uapi/asm/msr-index.h |    1 +
>  arch/x86/kvm/cpuid.c                  |    8 ++++----
>  arch/x86/kvm/vmx.c                    |   18 ++++++++++++++++--
>  arch/x86/kvm/x86.c                    |   12 +++++++++---
>  arch/x86/kvm/x86.h                    |    3 ++-
>  7 files changed, 38 insertions(+), 10 deletions(-)
[PATCH v3 0/2] Intel MPX feature support at Qemu
Intel has released Memory Protection Extensions (MPX) recently. Please refer to
http://download-software.intel.com/sites/default/files/319433-015.pdf

These 2 patches are version 3 to support Intel MPX at qemu side.

Version 1:
  * Fix cpuid leaf 0x0d bug which incorrectly parsed eax and ebx;
  * Expose cpuid leaf (0xd, 3) and (0xd, 4) to guest;

Version 2:
  * Add comments to explain that the cpuid parse error (of current qemu) didn't generate a wrong result;
  * Add some MPX related definitions, and hardcode sizes and offsets of xsave features 3 and 4. It also adds the corresponding part to kvm_get/put_xsave.

Version 3:
  * patch v2 1/2 (bug fix) has been checked in to qemu;
  * add vmstate for migration;
  * add 1 new patch for bndcfgs msr;

Thanks,
Jinsong
[PATCH v3 1/2] target-i386: Intel MPX
From ee8b72df3b5503514b748035e6b1cb4d61f8e701 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Thu, 5 Dec 2013 08:32:12 +0800
Subject: [PATCH v3 1/2] target-i386: Intel MPX

Add some MPX related definitions, and hardcode sizes and offsets
of xsave features 3 and 4. It also adds the corresponding part
to kvm_get/put_xsave, and vmstate.

Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 target-i386/cpu.c     |    4 ++++
 target-i386/cpu.h     |   22 +++++++++++++++++++---
 target-i386/kvm.c     |   10 ++++++++++
 target-i386/machine.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index bb98f6d..5076a94 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -336,6 +336,10 @@ typedef struct ExtSaveArea {
 static const ExtSaveArea ext_save_areas[] = {
     [2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX,
             .offset = 0x240, .size = 0x100 },
+    [3] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
+            .offset = 0x3c0, .size = 0x40 },
+    [4] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
+            .offset = 0x400, .size = 0x10 },
 };
 
 const char *get_register_name_32(unsigned int reg)
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index ea373e8..c28b901 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -380,9 +380,12 @@
 
 #define MSR_VM_HSAVE_PA                 0xc0010117
 
-#define XSTATE_FP                       1
-#define XSTATE_SSE                      2
-#define XSTATE_YMM                      4
+#define XSTATE_FP                       (1ULL << 0)
+#define XSTATE_SSE                      (1ULL << 1)
+#define XSTATE_YMM                      (1ULL << 2)
+#define XSTATE_BNDREGS                  (1ULL << 3)
+#define XSTATE_BNDCSR                   (1ULL << 4)
+
 
 /* CPUID feature words */
 typedef enum FeatureWord {
@@ -545,6 +548,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_EBX_ERMS     (1 << 9)
 #define CPUID_7_0_EBX_INVPCID  (1 << 10)
 #define CPUID_7_0_EBX_RTM      (1 << 11)
+#define CPUID_7_0_EBX_MPX      (1 << 14)
 #define CPUID_7_0_EBX_RDSEED   (1 << 18)
 #define CPUID_7_0_EBX_ADX      (1 << 19)
 #define CPUID_7_0_EBX_SMAP     (1 << 20)
@@ -695,6 +699,16 @@ typedef union {
     uint64_t q;
 } MMXReg;
 
+typedef struct BNDReg {
+    uint64_t lb;
+    uint64_t ub;
+} BNDReg;
+
+typedef struct BNDCSReg {
+    uint64_t cfgu;
+    uint64_t sts;
+} BNDCSReg;
+
 #ifdef HOST_WORDS_BIGENDIAN
 #define XMM_B(n) _b[15 - (n)]
 #define XMM_W(n) _w[7 - (n)]
@@ -912,6 +926,8 @@ typedef struct CPUX86State {
 
     uint64_t xstate_bv;
     XMMReg ymmh_regs[CPU_NB_REGS];
+    BNDReg bnd_regs[4];
+    BNDCSReg bndcs_regs;
 
     uint64_t xcr0;
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 1188482..ff913ff 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -975,6 +975,8 @@ static int kvm_put_fpu(X86CPU *cpu)
 #define XSAVE_XMM_SPACE   40
 #define XSAVE_XSTATE_BV   128
 #define XSAVE_YMMH_SPACE  144
+#define XSAVE_BNDREGS     240
+#define XSAVE_BNDCSR      256
 
 static int kvm_put_xsave(X86CPU *cpu)
 {
@@ -1007,6 +1009,10 @@ static int kvm_put_xsave(X86CPU *cpu)
     *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV] = env->xstate_bv;
     memcpy(&xsave->region[XSAVE_YMMH_SPACE], env->ymmh_regs,
             sizeof env->ymmh_regs);
+    memcpy(&xsave->region[XSAVE_BNDREGS], env->bnd_regs,
+            sizeof env->bnd_regs);
+    memcpy(&xsave->region[XSAVE_BNDCSR], &env->bndcs_regs,
+            sizeof(env->bndcs_regs));
     r = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_XSAVE, xsave);
     return r;
 }
@@ -1289,6 +1295,10 @@ static int kvm_get_xsave(X86CPU *cpu)
     env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV];
     memcpy(env->ymmh_regs, &xsave->region[XSAVE_YMMH_SPACE],
             sizeof env->ymmh_regs);
+    memcpy(env->bnd_regs, &xsave->region[XSAVE_BNDREGS],
+            sizeof env->bnd_regs);
+    memcpy(&env->bndcs_regs, &xsave->region[XSAVE_BNDCSR],
+            sizeof(env->bndcs_regs));
     return 0;
 }
 
diff --git a/target-i386/machine.c b/target-i386/machine.c
index e568da2..ceab51b 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -63,6 +63,21 @@ static const VMStateDescription vmstate_ymmh_reg = {
 #define VMSTATE_YMMH_REGS_VARS(_field, _state, _n, _v)                         \
     VMSTATE_STRUCT_ARRAY(_field, _state, _n, _v, vmstate_ymmh_reg, XMMReg)
 
+static const VMStateDescription vmstate_bnd_regs = {
+    .name = "bnd_regs",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField[]) {
+        VMSTATE_UINT64(lb, BNDReg),
+        VMSTATE_UINT64(ub, BNDReg),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+#define VMSTATE_BND_REGS(_field, _state, _n)          \
+    VMSTATE_STRUCT_ARRAY(_field, _state, _n, 0, vmstate_bnd_regs, BNDReg)
+
 static const VMStateDescription vmstate_mtrr_var = {
     .name = "mtrr_var",
     .version_id = 1
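The XSAVE_BNDREGS and XSAVE_BNDCSR constants in kvm.c and the .offset values in ext_save_areas describe the same locations in two different units. A small sketch of the arithmetic, assuming (as in the KVM userspace API) that the xsave region is an array of 32-bit words so that an index times 4 is a byte offset; demo_region_byte_offset is a hypothetical helper, and the two structs are copied from the patch above.

```c
#include <stddef.h>
#include <stdint.h>

/* Struct layouts as added to target-i386/cpu.h by this patch. */
typedef struct BNDReg {
    uint64_t lb;
    uint64_t ub;
} BNDReg;

typedef struct BNDCSReg {
    uint64_t cfgu;
    uint64_t sts;
} BNDCSReg;

/* Indices into the 32-bit-word region[] array, as defined in kvm.c. */
#define XSAVE_YMMH_SPACE 144
#define XSAVE_BNDREGS    240
#define XSAVE_BNDCSR     256

/* Convert a region[] word index into a byte offset in the XSAVE area. */
static size_t demo_region_byte_offset(size_t index)
{
    return index * sizeof(uint32_t);
}
```

Worked through: 240 * 4 = 960 = 0x3c0 and 256 * 4 = 1024 = 0x400, matching the offsets of ext_save_areas entries 3 and 4, and four BNDReg entries occupy 4 * 16 = 0x40 bytes, matching the size of entry 3. The BNDCSReg struct itself is 0x10 bytes, which is the size this patch gives entry 4; the follow-up "bugfix of Intel MPX" patch changes that entry to 0x40 to match what CPUID reports for the component.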
[PATCH v3 2/2] target-i386: MSR_IA32_BNDCFGS handle
From 12fa3564b7342c4e034b13671dc922ff23ac4b1e Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Sat, 7 Dec 2013 05:18:35 +0800
Subject: [PATCH v3 2/2] target-i386: MSR_IA32_BNDCFGS handle

Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 target-i386/cpu.h     |    3 +++
 target-i386/kvm.c     |   14 ++++++++++++++
 target-i386/machine.c |    7 ++++++-
 3 files changed, 23 insertions(+), 1 deletions(-)

diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index c28b901..bbec228 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -380,6 +380,8 @@
 
 #define MSR_VM_HSAVE_PA                 0xc0010117
 
+#define MSR_IA32_BNDCFGS                0x00000d90
+
 #define XSTATE_FP                       (1ULL << 0)
 #define XSTATE_SSE                      (1ULL << 1)
 #define XSTATE_YMM                      (1ULL << 2)
@@ -928,6 +930,7 @@ typedef struct CPUX86State {
     XMMReg ymmh_regs[CPU_NB_REGS];
     BNDReg bnd_regs[4];
     BNDCSReg bndcs_regs;
+    uint64_t msr_bndcfgs;
 
     uint64_t xcr0;
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index ff913ff..01ebca2 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -69,6 +69,7 @@ static bool has_msr_feature_control;
 static bool has_msr_async_pf_en;
 static bool has_msr_pv_eoi_en;
 static bool has_msr_misc_enable;
+static bool has_msr_bndcfgs;
 static bool has_msr_kvm_steal_time;
 static int lm_capable_kernel;
 
@@ -772,6 +773,10 @@ static int kvm_get_supported_msrs(KVMState *s)
                     has_msr_misc_enable = true;
                     continue;
                 }
+                if (kvm_msr_list->indices[i] == MSR_IA32_BNDCFGS) {
+                    has_msr_bndcfgs = true;
+                    continue;
+                }
             }
         }
 
@@ -1214,6 +1219,9 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
             kvm_msr_entry_set(&msrs[n++], MSR_IA32_FEATURE_CONTROL,
                               env->msr_ia32_feature_control);
         }
+        if (has_msr_bndcfgs) {
+            kvm_msr_entry_set(&msrs[n++], MSR_IA32_BNDCFGS, env->msr_bndcfgs);
+        }
     }
     if (env->mcg_cap) {
         int i;
@@ -1445,6 +1453,9 @@ static int kvm_get_msrs(X86CPU *cpu)
     if (has_msr_feature_control) {
         msrs[n++].index = MSR_IA32_FEATURE_CONTROL;
     }
+    if (has_msr_bndcfgs) {
+        msrs[n++].index = MSR_IA32_BNDCFGS;
+    }
 
     if (!env->tsc_valid) {
         msrs[n++].index = MSR_IA32_TSC;
@@ -1560,6 +1571,9 @@ static int kvm_get_msrs(X86CPU *cpu)
         case MSR_IA32_FEATURE_CONTROL:
             env->msr_ia32_feature_control = msrs[i].data;
             break;
+        case MSR_IA32_BNDCFGS:
+            env->msr_bndcfgs = msrs[i].data;
+            break;
         default:
             if (msrs[i].index >= MSR_MC0_CTL &&
                 msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 4) {
diff --git a/target-i386/machine.c b/target-i386/machine.c
index ceab51b..2de1964 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -533,7 +533,11 @@ static bool mpx_needed(void *opaque)
         }
     }
 
-    return env->bndcs_regs.cfgu || env->bndcs_regs.sts;
+    if (env->bndcs_regs.cfgu || env->bndcs_regs.sts) {
+        return true;
+    }
+
+    return !!env->msr_bndcfgs;
 }
 
 static const VMStateDescription vmstate_mpx = {
@@ -545,6 +549,7 @@ static const VMStateDescription vmstate_mpx = {
         VMSTATE_BND_REGS(env.bnd_regs, X86CPU, 4),
         VMSTATE_UINT64(env.bndcs_regs.cfgu, X86CPU),
         VMSTATE_UINT64(env.bndcs_regs.sts, X86CPU),
+        VMSTATE_UINT64(env.msr_bndcfgs, X86CPU),
         VMSTATE_END_OF_LIST()
     }
 };
--
1.7.1

0002-target-i386-MSR_IA32_BNDCFGS-handle.patch
Description: 0002-target-i386-MSR_IA32_BNDCFGS-handle.patch
RE: [PATCH 3/4] KVM/X86: Intel MPX vmx and msr handle
Paolo Bonzini wrote:
> On 02/12/2013 17:46, Liu, Jinsong wrote:
>> From e9ba40b3d1820b8ab31431c73226ee3ed485edd1 Mon Sep 17 00:00:00 2001
>> From: Liu Jinsong <jinsong@intel.com>
>> Date: Tue, 3 Dec 2013 07:02:27 +0800
>> Subject: [PATCH 3/4] KVM/X86: Intel MPX vmx and msr handle
>>
>> Signed-off-by: Xudong Hao <xudong@intel.com>
>> Signed-off-by: Liu Jinsong <jinsong@intel.com>
>> ---
>>  arch/x86/include/asm/vmx.h            |    2 ++
>>  arch/x86/include/uapi/asm/msr-index.h |    1 +
>>  arch/x86/kvm/vmx.c                    |   12 ++++++++++--
>>  3 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
>> index 966502d..1bf4681 100644
>> --- a/arch/x86/include/asm/vmx.h
>> +++ b/arch/x86/include/asm/vmx.h
>> @@ -85,6 +85,7 @@
>>  #define VM_EXIT_SAVE_IA32_EFER                  0x00100000
>>  #define VM_EXIT_LOAD_IA32_EFER                  0x00200000
>>  #define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER       0x00400000
>> +#define VM_EXIT_CLEAR_BNDCFGS                   0x00800000
>>
>>  #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff
>>
>> @@ -95,6 +96,7 @@
>>  #define VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL     0x00002000
>>  #define VM_ENTRY_LOAD_IA32_PAT			0x00004000
>>  #define VM_ENTRY_LOAD_IA32_EFER                 0x00008000
>> +#define VM_ENTRY_LOAD_BNDCFGS                   0x00010000
>>
>>  #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR	0x000011ff
>>
>> diff --git a/arch/x86/include/uapi/asm/msr-index.h b/arch/x86/include/uapi/asm/msr-index.h
>> index 37813b5..2a418c4 100644
>> --- a/arch/x86/include/uapi/asm/msr-index.h
>> +++ b/arch/x86/include/uapi/asm/msr-index.h
>> @@ -294,6 +294,7 @@
>>  #define MSR_SMI_COUNT			0x00000034
>>  #define MSR_IA32_FEATURE_CONTROL        0x0000003a
>>  #define MSR_IA32_TSC_ADJUST             0x0000003b
>> +#define MSR_IA32_BNDCFGS                0x00000d90
>>
>>  #define FEATURE_CONTROL_LOCKED				(1<<0)
>>  #define FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX	(1<<1)
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index b2fe1c2..9a16e60 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -439,6 +439,7 @@ struct vcpu_vmx {
>>  #endif
>>  		int           gs_ldt_reload_needed;
>>  		int           fs_reload_needed;
>> +		u64           msr_host_bndcfgs;
>>  	} host_state;
>>  	struct {
>>  		int vm86_active;
>> @@ -1647,6 +1648,8 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
>>  	if (is_long_mode(&vmx->vcpu))
>>  		wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
>>  #endif
>> +	if (cpu_has_mpx)
>> +		rdmsrl(MSR_IA32_BNDCFGS, vmx->host_state.msr_host_bndcfgs);
>>  	for (i = 0; i < vmx->save_nmsrs; ++i)
>>  		kvm_set_shared_msr(vmx->guest_msrs[i].index,
>>  				   vmx->guest_msrs[i].data,
>> @@ -1684,6 +1687,8 @@ static void __vmx_load_host_state(struct vcpu_vmx *vmx)
>>  #ifdef CONFIG_X86_64
>>  	wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_host_kernel_gs_base);
>>  #endif
>> +	if (vmx->host_state.msr_host_bndcfgs)
>> +		wrmsrl(MSR_IA32_BNDCFGS, vmx->host_state.msr_host_bndcfgs);
>>  	/*
>>  	 * If the FPU is not active (through the host task or
>>  	 * the guest vcpu), then restore the cr0.TS bit.
>> @@ -2800,7 +2805,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
>>  	min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
>>  #endif
>>  	opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT |
>> -		VM_EXIT_ACK_INTR_ON_EXIT;
>> +		VM_EXIT_ACK_INTR_ON_EXIT | VM_EXIT_CLEAR_BNDCFGS;
>>  	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
>>  				&_vmexit_control) < 0)
>>  		return -EIO;
>> @@ -2817,7 +2822,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
>>  		_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
>>
>>  	min = 0;
>> -	opt = VM_ENTRY_LOAD_IA32_PAT;
>> +	opt = VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS;
>>  	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
>>  				&_vmentry_control) < 0)
>>  		return -EIO;
>> @@ -8636,6 +8641,9 @@ static int __init vmx_init(void)
>>  	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_CS, false);
>>  	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
>>  	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
>> +	if (cpu_has_mpx)
>> +		vmx_disable_intercept_for_msr(MSR_IA32_BNDCFGS, true);
>> +
>>  	memcpy(vmx_msr_bitmap_legacy_x2apic,
>>  			vmx_msr_bitmap_legacy, PAGE_SIZE);
>>  	memcpy(vmx_msr_bitmap_longmode_x2apic,
>
> This patch should also add BNDCFGS to msrs_to_save, in arch/x86/kvm/x86.c.
>
> Paolo

Thanks! Since it also involves vmx_get/set_msr and is distinct logic, I have added a new patch for it.

Jinsong
RE: [Qemu-devel] [PATCH v3 0/2] Intel MPX feature support at Qemu
Eric Blake wrote:
> On 12/06/2013 07:06 AM, Liu, Jinsong wrote:
>> Intel has released Memory Protection Extensions (MPX) recently. Please refer to
>> http://download-software.intel.com/sites/default/files/319433-015.pdf
>>
>> These 2 patches are version 3 to support Intel MPX at qemu side.
>
> You still aren't threading correctly, which makes it hard to track your series. Please review http://wiki.qemu.org/Contribute/SubmitAPatch and make sure your 'git send-email' settings allow for proper threading; a good way to test that is to first send the patch series to yourself to ensure your environment is set up correctly.

Thanks Blake! I will take care and learn to use git send-email when I send patches later (i.e. the kvm mpx patches).

Jinsong
RE: [PATCH v3 0/2] Intel MPX feature support at Qemu
Paolo Bonzini wrote:
> On 06/12/2013 15:06, Liu, Jinsong wrote:
>> Intel has released Memory Protection Extensions (MPX) recently. Please refer to
>> http://download-software.intel.com/sites/default/files/319433-015.pdf
>>
>> These 2 patches are version 3 to support Intel MPX at qemu side.
>>
>> Version 1:
>>   * Fix cpuid leaf 0x0d bug which incorrectly parsed eax and ebx;
>>   * Expose cpuid leaf (0xd, 3) and (0xd, 4) to guest;
>>
>> Version 2:
>>   * Add comments to explain that the cpuid parse error (of current qemu) didn't generate a wrong result;
>>   * Add some MPX related definitions, and hardcode sizes and offsets of xsave features 3 and 4. It also adds the corresponding part to kvm_get/put_xsave.
>>
>> Version 3:
>>   * patch v2 1/2 (bug fix) has been checked in to qemu;
>>   * add vmstate for migration;
>>   * add 1 new patch for bndcfgs msr;
>>
>> Thanks,
>> Jinsong
>
> I'm going to squash and apply the patches, but please please learn to thread the messages correctly. You're also sending two copies of the patch, one in the body and one in the attachment.
>
> Paolo

Yup, Blake also reminded me about threading. I will take care when sending the kvm side patches.

Thanks,
Jinsong
RE: [Qemu-devel] [PATCH v2 3/3] X86, mpx: Intel MPX xstate feature definition
Paolo Bonzini wrote:
> On 07/12/2013 01:20, Qiaowei Ren wrote:
>> This patch defines xstate feature and extends struct xsave_hdr_struct
>> to support Intel MPX.
>>
>> Signed-off-by: Qiaowei Ren <qiaowei@intel.com>
>> Signed-off-by: Xudong Hao <xudong@intel.com>
>> Signed-off-by: Liu Jinsong <jinsong@intel.com>
>> ---
>>  arch/x86/include/asm/processor.h |   12
>>  arch/x86/include/asm/xsave.h     |    5 -
>>  2 files changed, 16 insertions(+), 1 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
>> index 987c75e..2fe2e75 100644
>> --- a/arch/x86/include/asm/processor.h
>> +++ b/arch/x86/include/asm/processor.h
>> @@ -370,6 +370,15 @@ struct ymmh_struct {
>>  	u32 ymmh_space[64];
>>  };
>>
>> +struct bndregs_struct {
>> +	u64 bndregs[8];
>> +} __packed;
>> +
>> +struct bndcsr_struct {
>> +	u64 cfg_reg_u;
>> +	u64 status_reg;
>> +} __packed;
>> +
>>  struct xsave_hdr_struct {
>>  	u64 xstate_bv;
>>  	u64 reserved1[2];
>> @@ -380,6 +389,9 @@ struct xsave_struct {
>>  	struct i387_fxsave_struct i387;
>>  	struct xsave_hdr_struct xsave_hdr;
>>  	struct ymmh_struct ymmh;
>> +	u8 lwp_area[128];
>
> Sorry for the back-and-forth, but I think this and the removal of
> XSTATE_FLEXIBLE (perhaps XSTATE_LAZY?) makes your v2 worse than v1.
> Since Peter already said the same, please undo these changes.
>
> Also, how is XSTATE_EAGER used? Should MPX be disabled when xsaveopt is
> disabled on the kernel command line? (Liu, how would this affect the
> KVM patches, too?)
>
> Paolo

Currently it seems not, and if needed we can add a new patch on the KVM side accordingly once the native MPX patches are checked in.

Jinsong

>> +#define XSTATE_EAGER	(XSTATE_BNDREGS | XSTATE_BNDCSR)
>>  /*
>>   * These are the features that the OS can handle currently.
>>   */
>> -#define XCNTXT_MASK	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
>> +#define XCNTXT_MASK	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_EAGER)
>>
>>  #ifdef CONFIG_X86_64
>>  #define REX_PREFIX	"0x48, "
RE: [Qemu-devel] [PATCH v2 3/3] X86, mpx: Intel MPX xstate feature definition
H. Peter Anvin wrote:
> On 12/06/2013 12:05 PM, Liu, Jinsong wrote:
>>> Since Peter already said the same, please undo these changes.
>>>
>>> Also, how is XSTATE_EAGER used? Should MPX be disabled when xsaveopt
>>> is disabled on the kernel command line? (Liu, how would this affect
>>> the KVM patches, too?)
>>>
>>> Paolo
>>
>> Currently seems no, and if needed we can add a new patch at kvm side
>> accordingly when native mpx patches checked in.
>
> We need to either disable these features in lazy mode, or we need to
> force eager mode if these features are to be supported. The problem
> with the latter is that it means forcing eager mode regardless of if
> anything actually *uses* these features.
>
> A third option would be to require applications to use a prctl() or
> similar to enable eager-save features.
>
> Thoughts?
>
> 	-hpa

The third option seems better. How do the native MPX patches work: do they force eager mode?

Thanks,
Jinsong
RE: [PATCH v2 2/2] target-i386: Intel MPX
Paolo Bonzini wrote:
> On 04/12/2013 12:30, Liu, Jinsong wrote:
>
> Almost there. Migration (vmstate) is still missing.

Like this:

==
From faead85c0dbe62da896e0ed9e165d98e10216968 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Wed, 4 Dec 2013 16:56:49 +0800
Subject: [PATCH 2/2] target-i386: Intel MPX

Add some MPX related definitions, and hardcode sizes and offsets
of xsave features 3 and 4. It also adds the corresponding part
to kvm_get/put_xsave, and vmstate.

Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 target-i386/cpu.c     |    4
 target-i386/cpu.h     |   22 +++---
 target-i386/kvm.c     |   10 ++
 target-i386/machine.c |   32
 4 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 544b57f..52ca029 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -336,6 +336,10 @@ typedef struct ExtSaveArea {
 static const ExtSaveArea ext_save_areas[] = {
     [2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX,
             .offset = 0x240, .size = 0x100 },
+    [3] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
+            .offset = 0x3c0, .size = 0x40 },
+    [4] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
+            .offset = 0x400, .size = 0x10 },
 };

 const char *get_register_name_32(unsigned int reg)
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index ea373e8..5c1dd17 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -380,9 +380,12 @@

 #define MSR_VM_HSAVE_PA                 0xc0010117

-#define XSTATE_FP                       1
-#define XSTATE_SSE                      2
-#define XSTATE_YMM                      4
+#define XSTATE_FP                       (1ULL << 0)
+#define XSTATE_SSE                      (1ULL << 1)
+#define XSTATE_YMM                      (1ULL << 2)
+#define XSTATE_BNDREGS                  (1ULL << 3)
+#define XSTATE_BNDCSR                   (1ULL << 4)
+

 /* CPUID feature words */
 typedef enum FeatureWord {
@@ -545,6 +548,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_EBX_ERMS     (1 << 9)
 #define CPUID_7_0_EBX_INVPCID  (1 << 10)
 #define CPUID_7_0_EBX_RTM      (1 << 11)
+#define CPUID_7_0_EBX_MPX      (1 << 14)
 #define CPUID_7_0_EBX_RDSEED   (1 << 18)
 #define CPUID_7_0_EBX_ADX      (1 << 19)
 #define CPUID_7_0_EBX_SMAP     (1 << 20)
@@ -695,6 +699,16 @@ typedef union {
     uint64_t q;
 } MMXReg;

+typedef struct BNDReg {
+    uint64_t lb;
+    uint64_t ub;
+} BNDReg;
+
+typedef struct BNDCSReg {
+    uint64_t cfg;
+    uint64_t sts;
+} BNDCSReg;
+
 #ifdef HOST_WORDS_BIGENDIAN
 #define XMM_B(n) _b[15 - (n)]
 #define XMM_W(n) _w[7 - (n)]
@@ -912,6 +926,8 @@ typedef struct CPUX86State {

     uint64_t xstate_bv;
     XMMReg ymmh_regs[CPU_NB_REGS];
+    BNDReg bnd_regs[4];
+    BNDCSReg bndcs_regs;

     uint64_t xcr0;

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 749aa09..347d3d3 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -980,6 +980,8 @@ static int kvm_put_fpu(X86CPU *cpu)
 #define XSAVE_XMM_SPACE   40
 #define XSAVE_XSTATE_BV   128
 #define XSAVE_YMMH_SPACE  144
+#define XSAVE_BNDREGS     240
+#define XSAVE_BNDCSR      256

 static int kvm_put_xsave(X86CPU *cpu)
 {
@@ -1012,6 +1014,10 @@ static int kvm_put_xsave(X86CPU *cpu)
     *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV] = env->xstate_bv;
     memcpy(&xsave->region[XSAVE_YMMH_SPACE], env->ymmh_regs,
             sizeof env->ymmh_regs);
+    memcpy(&xsave->region[XSAVE_BNDREGS], env->bnd_regs,
+            sizeof env->bnd_regs);
+    memcpy(&xsave->region[XSAVE_BNDCSR], &env->bndcs_regs,
+            sizeof(env->bndcs_regs));
     r = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_XSAVE, xsave);
     return r;
 }
@@ -1294,6 +1300,10 @@ static int kvm_get_xsave(X86CPU *cpu)
     env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV];
     memcpy(env->ymmh_regs, &xsave->region[XSAVE_YMMH_SPACE],
             sizeof env->ymmh_regs);
+    memcpy(env->bnd_regs, &xsave->region[XSAVE_BNDREGS],
+            sizeof env->bnd_regs);
+    memcpy(&env->bndcs_regs, &xsave->region[XSAVE_BNDCSR],
+            sizeof(env->bndcs_regs));
     return 0;
 }

diff --git a/target-i386/machine.c b/target-i386/machine.c
index e568da2..ca8be7d 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -63,6 +63,36 @@ static const VMStateDescription vmstate_ymmh_reg = {
 #define VMSTATE_YMMH_REGS_VARS(_field, _state, _n, _v)                         \
     VMSTATE_STRUCT_ARRAY(_field, _state, _n, _v, vmstate_ymmh_reg, XMMReg)

+static const VMStateDescription vmstate_bnd_regs = {
+    .name = "bnd_regs",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields = (VMStateField []) {
+        VMSTATE_UINT64(lb, BNDReg),
+        VMSTATE_UINT64(ub, BNDReg [...]
RE: [PATCH v2 2/2] target-i386: Intel MPX
> Almost there. Migration (vmstate) is still missing.

Like this:

==
From faead85c0dbe62da896e0ed9e165d98e10216968 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Wed, 4 Dec 2013 16:56:49 +0800
Subject: [PATCH 2/2] target-i386: Intel MPX

Add some MPX related definitions, and hardcode sizes and offsets
of xsave features 3 and 4. It also adds the corresponding part
to kvm_get/put_xsave, and vmstate.

Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 target-i386/cpu.c     |    4
 target-i386/cpu.h     |   22 +++---
 target-i386/kvm.c     |   10 ++
 target-i386/machine.c |   32
 4 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 544b57f..52ca029 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -336,6 +336,10 @@ typedef struct ExtSaveArea {
 static const ExtSaveArea ext_save_areas[] = {
     [2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX,
             .offset = 0x240, .size = 0x100 },
+    [3] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
+            .offset = 0x3c0, .size = 0x40 },
+    [4] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
+            .offset = 0x400, .size = 0x10 },
 };

 const char *get_register_name_32(unsigned int reg)
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index ea373e8..5c1dd17 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -380,9 +380,12 @@

 #define MSR_VM_HSAVE_PA                 0xc0010117

-#define XSTATE_FP                       1
-#define XSTATE_SSE                      2
-#define XSTATE_YMM                      4
+#define XSTATE_FP                       (1ULL << 0)
+#define XSTATE_SSE                      (1ULL << 1)
+#define XSTATE_YMM                      (1ULL << 2)
+#define XSTATE_BNDREGS                  (1ULL << 3)
+#define XSTATE_BNDCSR                   (1ULL << 4)
+

 /* CPUID feature words */
 typedef enum FeatureWord {
@@ -545,6 +548,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_EBX_ERMS     (1 << 9)
 #define CPUID_7_0_EBX_INVPCID  (1 << 10)
 #define CPUID_7_0_EBX_RTM      (1 << 11)
+#define CPUID_7_0_EBX_MPX      (1 << 14)
 #define CPUID_7_0_EBX_RDSEED   (1 << 18)
 #define CPUID_7_0_EBX_ADX      (1 << 19)
 #define CPUID_7_0_EBX_SMAP     (1 << 20)
@@ -695,6 +699,16 @@ typedef union {
     uint64_t q;
 } MMXReg;

+typedef struct BNDReg {
+    uint64_t lb;
+    uint64_t ub;
+} BNDReg;
+
+typedef struct BNDCSReg {
+    uint64_t cfg;
+    uint64_t sts;
+} BNDCSReg;
+
 #ifdef HOST_WORDS_BIGENDIAN
 #define XMM_B(n) _b[15 - (n)]
 #define XMM_W(n) _w[7 - (n)]
@@ -912,6 +926,8 @@ typedef struct CPUX86State {

     uint64_t xstate_bv;
     XMMReg ymmh_regs[CPU_NB_REGS];
+    BNDReg bnd_regs[4];
+    BNDCSReg bndcs_regs;

     uint64_t xcr0;

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 749aa09..347d3d3 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -980,6 +980,8 @@ static int kvm_put_fpu(X86CPU *cpu)
 #define XSAVE_XMM_SPACE   40
 #define XSAVE_XSTATE_BV   128
 #define XSAVE_YMMH_SPACE  144
+#define XSAVE_BNDREGS     240
+#define XSAVE_BNDCSR      256

 static int kvm_put_xsave(X86CPU *cpu)
 {
@@ -1012,6 +1014,10 @@ static int kvm_put_xsave(X86CPU *cpu)
     *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV] = env->xstate_bv;
     memcpy(&xsave->region[XSAVE_YMMH_SPACE], env->ymmh_regs,
             sizeof env->ymmh_regs);
+    memcpy(&xsave->region[XSAVE_BNDREGS], env->bnd_regs,
+            sizeof env->bnd_regs);
+    memcpy(&xsave->region[XSAVE_BNDCSR], &env->bndcs_regs,
+            sizeof(env->bndcs_regs));
     r = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_XSAVE, xsave);
     return r;
 }
@@ -1294,6 +1300,10 @@ static int kvm_get_xsave(X86CPU *cpu)
     env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV];
     memcpy(env->ymmh_regs, &xsave->region[XSAVE_YMMH_SPACE],
             sizeof env->ymmh_regs);
+    memcpy(env->bnd_regs, &xsave->region[XSAVE_BNDREGS],
+            sizeof env->bnd_regs);
+    memcpy(&env->bndcs_regs, &xsave->region[XSAVE_BNDCSR],
+            sizeof(env->bndcs_regs));
     return 0;
 }

diff --git a/target-i386/machine.c b/target-i386/machine.c
index e568da2..ca8be7d 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -63,6 +63,36 @@ static const VMStateDescription vmstate_ymmh_reg = {
 #define VMSTATE_YMMH_REGS_VARS(_field, _state, _n, _v)                         \
     VMSTATE_STRUCT_ARRAY(_field, _state, _n, _v, vmstate_ymmh_reg, XMMReg)

+static const VMStateDescription vmstate_bnd_regs = {
+    .name = "bnd_regs",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields = (VMStateField []) {
+        VMSTATE_UINT64(lb, BNDReg),
+        VMSTATE_UINT64(ub, BNDReg),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+#define VMSTATE_BNDREG_VARS(_field, _state, _n, _v)                         \
+    VMSTATE_STRUCT_ARRAY(_field, _state, _n, _v, vmstate_bnd_regs, BNDReg)
+
+static const [...]
RE: [PATCH 2/2] target-i386: Intel MPX
Paolo Bonzini wrote:
> On 02/12/2013 17:42, Liu, Jinsong wrote:
>> From 1a199d68265ffeb0234530f29d92a00a5edeff75 Mon Sep 17 00:00:00 2001
>> From: Liu Jinsong <jinsong@intel.com>
>> Date: Tue, 3 Dec 2013 05:08:19 +0800
>> Subject: [PATCH 2/2] target-i386: Intel MPX
>>
>> Add some MPX related definiation, and hardcode sizes and offsets
>> of xsave features 3 and 4.
>>
>> Signed-off-by: Liu Jinsong <jinsong@intel.com>
>
> kvm_get/put_xsave support is still missing.

Thanks! I will add kvm_get/put_xsave support and send it out later.

Jinsong

>> ---
>>  target-i386/cpu.c |    4
>>  target-i386/cpu.h |   10 +++---
>>  2 files changed, 11 insertions(+), 3 deletions(-)
>>
>> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
>> index 544b57f..52ca029 100644
>> --- a/target-i386/cpu.c
>> +++ b/target-i386/cpu.c
>> @@ -336,6 +336,10 @@ typedef struct ExtSaveArea {
>>  static const ExtSaveArea ext_save_areas[] = {
>>      [2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX,
>>              .offset = 0x240, .size = 0x100 },
>> +    [3] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
>> +            .offset = 0x3c0, .size = 0x40 },
>> +    [4] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
>> +            .offset = 0x400, .size = 0x10 },
>>  };
>>
>>  const char *get_register_name_32(unsigned int reg)
>> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
>> index ea373e8..2975644 100644
>> --- a/target-i386/cpu.h
>> +++ b/target-i386/cpu.h
>> @@ -380,9 +380,12 @@
>>
>>  #define MSR_VM_HSAVE_PA                 0xc0010117
>>
>> -#define XSTATE_FP                       1
>> -#define XSTATE_SSE                      2
>> -#define XSTATE_YMM                      4
>> +#define XSTATE_FP                       (1ULL << 0)
>> +#define XSTATE_SSE                      (1ULL << 1)
>> +#define XSTATE_YMM                      (1ULL << 2)
>> +#define XSTATE_BNDREGS                  (1ULL << 3)
>> +#define XSTATE_BNDCSR                   (1ULL << 4)
>> +
>>
>>  /* CPUID feature words */
>>  typedef enum FeatureWord {
>> @@ -545,6 +548,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
>>  #define CPUID_7_0_EBX_ERMS     (1 << 9)
>>  #define CPUID_7_0_EBX_INVPCID  (1 << 10)
>>  #define CPUID_7_0_EBX_RTM      (1 << 11)
>> +#define CPUID_7_0_EBX_MPX      (1 << 14)
>>  #define CPUID_7_0_EBX_RDSEED   (1 << 18)
>>  #define CPUID_7_0_EBX_ADX      (1 << 19)
>>  #define CPUID_7_0_EBX_SMAP     (1 << 20)
[PATCH v2 0/2]
Intel has released Memory Protection Extensions (MPX) recently.
Please refer to
http://download-software.intel.com/sites/default/files/319433-015.pdf

These 2 patches are version 2 to support Intel MPX on the QEMU side.

Version 1:
 * Fix cpuid leaf 0x0d bug which incorrectly parsed eax and ebx;
 * Expose cpuid leaf (0xd, 3) and (0xd, 4) to guest;

Version 2:
 * Add comments to explain that the cpuid parse error (of current qemu)
   didn't generate a wrong result;
 * Add some MPX related definitions, and hardcode sizes and offsets of
   xsave features 3 and 4. It also adds the corresponding part to
   kvm_get/put_xsave.

Thanks,
Jinsong
[PATCH v2 1/2] target-i386: fix cpuid leaf 0x0d
From cb3b12dd9873929b3a03214e3aa0ee5297e75119 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Tue, 3 Dec 2013 04:17:50 +0800
Subject: [PATCH v2 1/2] target-i386: fix cpuid leaf 0x0d

Fix cpuid leaf 0x0d which incorrectly parsed eax and ebx.

However, before this patch the CPUID worked fine -- the .offset field
contained the size _and_ was stored in the register that is supposed
to hold the size (eax), and likewise the .size field contained the
offset _and_ was stored in the register that is supposed to hold the
offset (ebx).

Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 target-i386/cpu.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 864c80e..544b57f 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -335,7 +335,7 @@ typedef struct ExtSaveArea {

 static const ExtSaveArea ext_save_areas[] = {
     [2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX,
-            .offset = 0x100, .size = 0x240 },
+            .offset = 0x240, .size = 0x100 },
 };

 const char *get_register_name_32(unsigned int reg)
@@ -2225,8 +2225,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             const ExtSaveArea *esa = &ext_save_areas[count];
             if ((env->features[esa->feature] & esa->bits) == esa->bits &&
                 (kvm_mask & (1 << count)) != 0) {
-                *eax = esa->offset;
-                *ebx = esa->size;
+                *eax = esa->size;
+                *ebx = esa->offset;
             }
         }
         break;
--
1.7.1

0001-target-i386-fix-cpuid-leaf-0x0d.patch
Description: 0001-target-i386-fix-cpuid-leaf-0x0d.patch
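For readers following the fix above, here is a minimal sketch of the register convention it restores: for CPUID leaf 0xD with sub-leaf n >= 2, EAX carries the size of the state component and EBX its offset inside the XSAVE area. The SaveArea type and cpuid_0xd_subleaf() are hypothetical stand-ins for illustration only; they are not QEMU's ExtSaveArea or cpu_x86_cpuid(), and the feature and KVM mask checks from the patch are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for QEMU's ExtSaveArea. */
typedef struct {
    uint32_t offset; /* byte offset of the component in the XSAVE area */
    uint32_t size;   /* size of the component in bytes */
} SaveArea;

/* Indexed by XCR0 bit number; entry 2 (AVX/YMM) uses the patched values. */
static const SaveArea save_areas[] = {
    [2] = { .offset = 0x240, .size = 0x100 },
};

/*
 * Fill eax/ebx as CPUID.(EAX=0xD, ECX=count) reports them:
 * EAX = size, EBX = offset. Returns 0 for a known component,
 * -1 (with both registers zeroed) otherwise.
 */
static int cpuid_0xd_subleaf(uint32_t count, uint32_t *eax, uint32_t *ebx)
{
    const size_t n = sizeof(save_areas) / sizeof(save_areas[0]);

    if (count >= n || save_areas[count].size == 0) {
        *eax = 0;
        *ebx = 0;
        return -1;
    }
    *eax = save_areas[count].size;
    *ebx = save_areas[count].offset;
    return 0;
}
```

With the pre-patch table (.offset = 0x100, .size = 0x240) and the pre-patch assignments (eax = offset, ebx = size) the two swaps cancelled out, which is why the guest saw correct values before the fix.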
[PATCH v2 2/2] target-i386: Intel MPX
From 256484fd75d4eb4d248e5e0f493f16182da59dc2 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Wed, 4 Dec 2013 16:56:49 +0800
Subject: [PATCH v2 2/2] target-i386: Intel MPX

Add some MPX related definitions, and hardcode sizes and offsets
of xsave features 3 and 4. It also adds the corresponding part
to kvm_get/put_xsave.

Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 target-i386/cpu.c |    4
 target-i386/cpu.h |   24 +---
 target-i386/kvm.c |   10 ++
 3 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 544b57f..52ca029 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -336,6 +336,10 @@ typedef struct ExtSaveArea {
 static const ExtSaveArea ext_save_areas[] = {
     [2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX,
             .offset = 0x240, .size = 0x100 },
+    [3] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
+            .offset = 0x3c0, .size = 0x40 },
+    [4] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
+            .offset = 0x400, .size = 0x10 },
 };

 const char *get_register_name_32(unsigned int reg)
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index ea373e8..4020591 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -380,9 +380,12 @@

 #define MSR_VM_HSAVE_PA                 0xc0010117

-#define XSTATE_FP                       1
-#define XSTATE_SSE                      2
-#define XSTATE_YMM                      4
+#define XSTATE_FP                       (1ULL << 0)
+#define XSTATE_SSE                      (1ULL << 1)
+#define XSTATE_YMM                      (1ULL << 2)
+#define XSTATE_BNDREGS                  (1ULL << 3)
+#define XSTATE_BNDCSR                   (1ULL << 4)
+

 /* CPUID feature words */
 typedef enum FeatureWord {
@@ -545,6 +548,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_EBX_ERMS     (1 << 9)
 #define CPUID_7_0_EBX_INVPCID  (1 << 10)
 #define CPUID_7_0_EBX_RTM      (1 << 11)
+#define CPUID_7_0_EBX_MPX      (1 << 14)
 #define CPUID_7_0_EBX_RDSEED   (1 << 18)
 #define CPUID_7_0_EBX_ADX      (1 << 19)
 #define CPUID_7_0_EBX_SMAP     (1 << 20)
@@ -695,6 +699,18 @@ typedef union {
     uint64_t q;
 } MMXReg;

+typedef struct BNDReg {
+    uint64_t lb;
+    uint64_t ub;
+} BNDReg;
+
+typedef struct BNDCSReg {
+    uint64_t cfg;
+    uint64_t pad;
+    uint64_t sts_lo;
+    uint64_t sts_hi;
+} BNDCSReg;
+
 #ifdef HOST_WORDS_BIGENDIAN
 #define XMM_B(n) _b[15 - (n)]
 #define XMM_W(n) _w[7 - (n)]
@@ -912,6 +928,8 @@ typedef struct CPUX86State {

     uint64_t xstate_bv;
     XMMReg ymmh_regs[CPU_NB_REGS];
+    BNDReg bnd_regs[4];
+    BNDCSReg bndcs_regs;

     uint64_t xcr0;

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 749aa09..347d3d3 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -980,6 +980,8 @@ static int kvm_put_fpu(X86CPU *cpu)
 #define XSAVE_XMM_SPACE   40
 #define XSAVE_XSTATE_BV   128
 #define XSAVE_YMMH_SPACE  144
+#define XSAVE_BNDREGS     240
+#define XSAVE_BNDCSR      256

 static int kvm_put_xsave(X86CPU *cpu)
 {
@@ -1012,6 +1014,10 @@ static int kvm_put_xsave(X86CPU *cpu)
     *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV] = env->xstate_bv;
     memcpy(&xsave->region[XSAVE_YMMH_SPACE], env->ymmh_regs,
             sizeof env->ymmh_regs);
+    memcpy(&xsave->region[XSAVE_BNDREGS], env->bnd_regs,
+            sizeof env->bnd_regs);
+    memcpy(&xsave->region[XSAVE_BNDCSR], &env->bndcs_regs,
+            sizeof(env->bndcs_regs));
     r = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_XSAVE, xsave);
     return r;
 }
@@ -1294,6 +1300,10 @@ static int kvm_get_xsave(X86CPU *cpu)
     env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV];
     memcpy(env->ymmh_regs, &xsave->region[XSAVE_YMMH_SPACE],
             sizeof env->ymmh_regs);
+    memcpy(env->bnd_regs, &xsave->region[XSAVE_BNDREGS],
+            sizeof env->bnd_regs);
+    memcpy(&env->bndcs_regs, &xsave->region[XSAVE_BNDCSR],
+            sizeof(env->bndcs_regs));
     return 0;
 }

--
1.7.1

0002-target-i386-Intel-MPX.patch
Description: 0002-target-i386-Intel-MPX.patch
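A note on the XSAVE_* constants in the kvm.c hunks: they are indices into the region array of struct kvm_xsave, which holds 32-bit words, so each index has to be multiplied by 4 to compare it with the byte offsets in ext_save_areas[]. The helper below is a small sketch of that arithmetic; xsave_index_to_bytes() is a hypothetical name used only here.

```c
#include <assert.h>
#include <stdint.h>

/* Region indices as used in the patch (units of 32-bit words). */
#define XSAVE_XSTATE_BV   128
#define XSAVE_YMMH_SPACE  144
#define XSAVE_BNDREGS     240
#define XSAVE_BNDCSR      256

/* Convert a region[] index into a byte offset within the XSAVE area. */
static inline uint32_t xsave_index_to_bytes(uint32_t index)
{
    return index * (uint32_t)sizeof(uint32_t);
}
```

This gives 144 * 4 = 0x240 for the YMM area, 240 * 4 = 0x3c0 for BNDREGS and 256 * 4 = 0x400 for BNDCSR, matching the .offset values added to ext_save_areas[], while XSAVE_XSTATE_BV lands at byte 512, right after the legacy FXSAVE image.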
[PATCH 1/2] target-i386: fix cpuid leaf 0x0d
From 57751d87392d7ee9df5698bc83b356de654453ef Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Tue, 3 Dec 2013 04:17:50 +0800
Subject: [PATCH 1/2] target-i386: fix cpuid leaf 0x0d

Fix cpuid leaf 0x0d which incorrectly parsed eax and ebx.
However, before this patch the CPUID worked fine -- the sum of
offset and size is the same.

Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 target-i386/cpu.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 864c80e..544b57f 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -335,7 +335,7 @@ typedef struct ExtSaveArea {

 static const ExtSaveArea ext_save_areas[] = {
     [2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX,
-            .offset = 0x100, .size = 0x240 },
+            .offset = 0x240, .size = 0x100 },
 };

 const char *get_register_name_32(unsigned int reg)
@@ -2225,8 +2225,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             const ExtSaveArea *esa = &ext_save_areas[count];
             if ((env->features[esa->feature] & esa->bits) == esa->bits &&
                 (kvm_mask & (1 << count)) != 0) {
-                *eax = esa->offset;
-                *ebx = esa->size;
+                *eax = esa->size;
+                *ebx = esa->offset;
             }
         }
         break;
--
1.7.1

0001-target-i386-fix-cpuid-leaf-0x0d.patch
Description: 0001-target-i386-fix-cpuid-leaf-0x0d.patch
[PATCH 2/2] target-i386: Intel MPX
From 1a199d68265ffeb0234530f29d92a00a5edeff75 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Tue, 3 Dec 2013 05:08:19 +0800
Subject: [PATCH 2/2] target-i386: Intel MPX

Add some MPX related definitions, and hardcode sizes and offsets
of xsave features 3 and 4.

Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 target-i386/cpu.c |    4
 target-i386/cpu.h |   10 +++---
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 544b57f..52ca029 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -336,6 +336,10 @@ typedef struct ExtSaveArea {
 static const ExtSaveArea ext_save_areas[] = {
     [2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX,
             .offset = 0x240, .size = 0x100 },
+    [3] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
+            .offset = 0x3c0, .size = 0x40 },
+    [4] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
+            .offset = 0x400, .size = 0x10 },
 };

 const char *get_register_name_32(unsigned int reg)
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index ea373e8..2975644 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -380,9 +380,12 @@

 #define MSR_VM_HSAVE_PA                 0xc0010117

-#define XSTATE_FP                       1
-#define XSTATE_SSE                      2
-#define XSTATE_YMM                      4
+#define XSTATE_FP                       (1ULL << 0)
+#define XSTATE_SSE                      (1ULL << 1)
+#define XSTATE_YMM                      (1ULL << 2)
+#define XSTATE_BNDREGS                  (1ULL << 3)
+#define XSTATE_BNDCSR                   (1ULL << 4)
+

 /* CPUID feature words */
 typedef enum FeatureWord {
@@ -545,6 +548,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_EBX_ERMS     (1 << 9)
 #define CPUID_7_0_EBX_INVPCID  (1 << 10)
 #define CPUID_7_0_EBX_RTM      (1 << 11)
+#define CPUID_7_0_EBX_MPX      (1 << 14)
 #define CPUID_7_0_EBX_RDSEED   (1 << 18)
 #define CPUID_7_0_EBX_ADX      (1 << 19)
 #define CPUID_7_0_EBX_SMAP     (1 << 20)
--
1.7.1

0002-target-i386-Intel-MPX.patch
Description: 0002-target-i386-Intel-MPX.patch
[PATCH 1/4] X86: Intel MPX definiation
From fbfa537f690eca139a96c6b2636ab5130bf57716 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Fri, 29 Nov 2013 01:27:00 +0800
Subject: [PATCH 1/4] X86: Intel MPX definiation

Signed-off-by: Xudong Hao <xudong@intel.com>
Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 arch/x86/include/asm/cpufeature.h |    2 ++
 arch/x86/include/asm/xsave.h      |    5 -
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 89270b4..1b00b01 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -216,6 +216,7 @@
 #define X86_FEATURE_ERMS	(9*32+ 9) /* Enhanced REP MOVSB/STOSB */
 #define X86_FEATURE_INVPCID	(9*32+10) /* Invalidate Processor Context ID */
 #define X86_FEATURE_RTM		(9*32+11) /* Restricted Transactional Memory */
+#define X86_FEATURE_MPX		(9*32+14) /* Memory Protection Extension */
 #define X86_FEATURE_RDSEED	(9*32+18) /* The RDSEED instruction */
 #define X86_FEATURE_ADX		(9*32+19) /* The ADCX and ADOX instructions */
 #define X86_FEATURE_SMAP	(9*32+20) /* Supervisor Mode Access Prevention */
@@ -330,6 +331,7 @@ extern const char * const x86_power_flags[32];
 #define cpu_has_perfctr_l2	boot_cpu_has(X86_FEATURE_PERFCTR_L2)
 #define cpu_has_cx8		boot_cpu_has(X86_FEATURE_CX8)
 #define cpu_has_cx16		boot_cpu_has(X86_FEATURE_CX16)
+#define cpu_has_mpx		boot_cpu_has(X86_FEATURE_MPX)
 #define cpu_has_eager_fpu	boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoext		boot_cpu_has(X86_FEATURE_TOPOEXT)

diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 0415cda..d3e3ea5 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -9,6 +9,8 @@
 #define XSTATE_FP	0x1
 #define XSTATE_SSE	0x2
 #define XSTATE_YMM	0x4
+#define XSTATE_BNDREGS	0x8
+#define XSTATE_BNDCSR	0x10

 #define XSTATE_FPSSE	(XSTATE_FP | XSTATE_SSE)

@@ -23,7 +25,8 @@
 /*
  * These are the features that the OS can handle currently.
  */
-#define XCNTXT_MASK	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+#define XCNTXT_MASK	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM | \
+			 XSTATE_BNDREGS | XSTATE_BNDCSR)

 #ifdef CONFIG_X86_64
 #define REX_PREFIX	"0x48, "
--
1.7.1

0001-X86-Intel-MPX-definiation.patch
Description: 0001-X86-Intel-MPX-definiation.patch
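As a reading aid for the X86_FEATURE_MPX definition above: the kernel encodes a feature as word * 32 + bit, so (9*32+14) means bit 14 of capability word 9, which holds CPUID leaf 7 (sub-leaf 0) EBX. The sketch below decodes that encoding; feature_word(), feature_bit() and feature_present() are hypothetical helpers for illustration and not the kernel's boot_cpu_has() machinery.

```c
#include <assert.h>
#include <stdint.h>

#define X86_FEATURE_MPX (9*32+14) /* value taken from the patch */

/* Index of the 32-bit capability word that holds the feature. */
static inline unsigned feature_word(unsigned feature)
{
    return feature / 32;
}

/* Bit position of the feature inside its capability word. */
static inline unsigned feature_bit(unsigned feature)
{
    return feature % 32;
}

/* Test a feature against a caller-supplied array of capability words. */
static inline int feature_present(const uint32_t *words, unsigned feature)
{
    return (words[feature_word(feature)] >> feature_bit(feature)) & 1u;
}
```

The same bit position shows up on the QEMU side of this thread as CPUID_7_0_EBX_MPX, i.e. (1 << 14).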
[PATCH 2/4] KVM/X86: Fix xsave cpuid exposing bug
From 4a2eb0a8467b4f278e59d2df209a1bc03349d088 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Tue, 3 Dec 2013 06:28:20 +0800
Subject: [PATCH 2/4] KVM/X86: Fix xsave cpuid exposing bug

EBX of cpuid(0xD, 0) is dynamic per XCR0 features enable/disable.
Bit 63 of XCR0 is reserved for future expansion.

Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 arch/x86/include/asm/xsave.h |    2 ++
 arch/x86/kvm/cpuid.c         |    6 +++---
 arch/x86/kvm/x86.c           |    7 +--
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index d3e3ea5..6120e74 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -13,6 +13,8 @@
 #define XSTATE_BNDCSR	0x10

 #define XSTATE_FPSSE	(XSTATE_FP | XSTATE_SSE)
+/* Bit 63 of XCR0 is reserved for future expansion */
+#define XSTATE_EXTEND_MASK	(~(XSTATE_FPSSE | (1 << 63)))

 #define FXSAVE_SIZE	512

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index c697625..2d661e6 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -28,7 +28,7 @@ static u32 xstate_required_size(u64 xstate_bv)
 	int feature_bit = 0;
 	u32 ret = XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET;

-	xstate_bv &= ~XSTATE_FPSSE;
+	xstate_bv &= XSTATE_EXTEND_MASK;
 	while (xstate_bv) {
 		if (xstate_bv & 0x1) {
 			u32 eax, ebx, ecx, edx;
@@ -74,8 +74,8 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
 		vcpu->arch.guest_supported_xcr0 =
 			(best->eax | ((u64)best->edx << 32)) &
 			host_xcr0 & KVM_SUPPORTED_XCR0;
-		vcpu->arch.guest_xstate_size =
-			xstate_required_size(vcpu->arch.guest_supported_xcr0);
+		vcpu->arch.guest_xstate_size = best->ebx =
+			xstate_required_size(vcpu->arch.xcr0);
 	}

 	kvm_pmu_cpuid_update(vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 21ef1ba..1657ca2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -576,13 +576,13 @@ static void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu)

 int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 {
-	u64 xcr0;
+	u64 xcr0 = xcr;
+	u64 old_xcr0 = vcpu->arch.xcr0;
 	u64 valid_bits;

 	/* Only support XCR_XFEATURE_ENABLED_MASK(xcr0) now */
 	if (index != XCR_XFEATURE_ENABLED_MASK)
 		return 1;
-	xcr0 = xcr;
 	if (!(xcr0 & XSTATE_FP))
 		return 1;
 	if ((xcr0 & XSTATE_YMM) && !(xcr0 & XSTATE_SSE))
@@ -599,6 +599,9 @@ int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 	kvm_put_guest_xcr0(vcpu);
 	vcpu->arch.xcr0 = xcr0;
+
+	if ((xcr0 ^ old_xcr0) & XSTATE_EXTEND_MASK)
+		kvm_update_cpuid(vcpu);
 	return 0;
 }

--
1.7.1

0002-KVM-X86-Fix-xsave-cpuid-exposing-bug.patch
Description: 0002-KVM-X86-Fix-xsave-cpuid-exposing-bug.patch
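To illustrate why EBX of cpuid(0xD, 0) depends on the current XCR0, here is a rough sketch of the size computation, under stated assumptions: the hypothetical components[] table replaces the cpuid_count(0xD, bit, ...) calls made by the real xstate_required_size(), and its offsets and sizes are the ones used on the QEMU side of this thread (with 0x40 for BNDCSR, the value from the later size correction). The mask is written with a 64-bit constant, because shifting a plain int by 63 is undefined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XSTATE_FP     (UINT64_C(1) << 0)
#define XSTATE_SSE    (UINT64_C(1) << 1)
#define XSTATE_FPSSE  (XSTATE_FP | XSTATE_SSE)
/* Drop FP/SSE (legacy area) and the reserved bit 63. */
#define XSTATE_EXTEND_MASK (~(XSTATE_FPSSE | (UINT64_C(1) << 63)))

/* 512-byte legacy FXSAVE image plus the 64-byte XSAVE header. */
#define LEGACY_PLUS_HEADER 576u

typedef struct {
    uint32_t offset;
    uint32_t size;
} Component;

/* Hypothetical table standing in for CPUID leaf 0xD sub-leaves. */
static const Component components[] = {
    [2] = { 0x240, 0x100 }, /* YMM */
    [3] = { 0x3c0, 0x40 },  /* BNDREGS */
    [4] = { 0x400, 0x40 },  /* BNDCSR */
};

/* Smallest XSAVE area that holds every component enabled in xcr0. */
static uint32_t required_size(uint64_t xcr0)
{
    const size_t n = sizeof(components) / sizeof(components[0]);
    uint32_t ret = LEGACY_PLUS_HEADER;
    unsigned bit;

    xcr0 &= XSTATE_EXTEND_MASK;
    for (bit = 0; xcr0 != 0; bit++, xcr0 >>= 1) {
        if ((xcr0 & 1) && bit < n) {
            uint32_t end = components[bit].offset + components[bit].size;
            if (end > ret) {
                ret = end;
            }
        }
    }
    return ret;
}
```

Under these assumptions a guest that enables only FP, SSE and YMM would see 0x340, and enabling the two MPX components raises the value to 0x440, which is why the patch recomputes the CPUID entry whenever an extended XCR0 bit changes.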
[PATCH 3/4] KVM/X86: Intel MPX vmx and msr handle
From e9ba40b3d1820b8ab31431c73226ee3ed485edd1 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Tue, 3 Dec 2013 07:02:27 +0800
Subject: [PATCH 3/4] KVM/X86: Intel MPX vmx and msr handle

Signed-off-by: Xudong Hao <xudong@intel.com>
Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 arch/x86/include/asm/vmx.h            |    2 ++
 arch/x86/include/uapi/asm/msr-index.h |    1 +
 arch/x86/kvm/vmx.c                    |   12 ++--
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 966502d..1bf4681 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -85,6 +85,7 @@
 #define VM_EXIT_SAVE_IA32_EFER                  0x00100000
 #define VM_EXIT_LOAD_IA32_EFER                  0x00200000
 #define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER       0x00400000
+#define VM_EXIT_CLEAR_BNDCFGS                   0x00800000

 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff

@@ -95,6 +96,7 @@
 #define VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL     0x00002000
 #define VM_ENTRY_LOAD_IA32_PAT			0x00004000
 #define VM_ENTRY_LOAD_IA32_EFER                 0x00008000
+#define VM_ENTRY_LOAD_BNDCFGS                   0x00010000

 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR	0x000011ff

diff --git a/arch/x86/include/uapi/asm/msr-index.h b/arch/x86/include/uapi/asm/msr-index.h
index 37813b5..2a418c4 100644
--- a/arch/x86/include/uapi/asm/msr-index.h
+++ b/arch/x86/include/uapi/asm/msr-index.h
@@ -294,6 +294,7 @@
 #define MSR_SMI_COUNT			0x00000034
 #define MSR_IA32_FEATURE_CONTROL        0x0000003a
 #define MSR_IA32_TSC_ADJUST             0x0000003b
+#define MSR_IA32_BNDCFGS		0x00000d90

 #define FEATURE_CONTROL_LOCKED				(1<<0)
 #define FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX	(1<<1)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b2fe1c2..9a16e60 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -439,6 +439,7 @@ struct vcpu_vmx {
 #endif
 		int gs_ldt_reload_needed;
 		int fs_reload_needed;
+		u64 msr_host_bndcfgs;
 	} host_state;
 	struct {
 		int vm86_active;
@@ -1647,6 +1648,8 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 	if (is_long_mode(&vmx->vcpu))
 		wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
 #endif
+	if (cpu_has_mpx)
+		rdmsrl(MSR_IA32_BNDCFGS, vmx->host_state.msr_host_bndcfgs);
 	for (i = 0; i < vmx->save_nmsrs; ++i)
 		kvm_set_shared_msr(vmx->guest_msrs[i].index,
 				   vmx->guest_msrs[i].data,
@@ -1684,6 +1687,8 @@ static void __vmx_load_host_state(struct vcpu_vmx *vmx)
 #ifdef CONFIG_X86_64
 	wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_host_kernel_gs_base);
 #endif
+	if (vmx->host_state.msr_host_bndcfgs)
+		wrmsrl(MSR_IA32_BNDCFGS, vmx->host_state.msr_host_bndcfgs);
 	/*
 	 * If the FPU is not active (through the host task or
 	 * the guest vcpu), then restore the cr0.TS bit.
@@ -2800,7 +2805,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 	min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
 #endif
 	opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT |
-		VM_EXIT_ACK_INTR_ON_EXIT;
+		VM_EXIT_ACK_INTR_ON_EXIT | VM_EXIT_CLEAR_BNDCFGS;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
 				&_vmexit_control) < 0)
 		return -EIO;
@@ -2817,7 +2822,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 		_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;

 	min = 0;
-	opt = VM_ENTRY_LOAD_IA32_PAT;
+	opt = VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
 				&_vmentry_control) < 0)
 		return -EIO;
@@ -8636,6 +8641,9 @@ static int __init vmx_init(void)
 	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_CS, false);
 	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
 	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
+	if (cpu_has_mpx)
+		vmx_disable_intercept_for_msr(MSR_IA32_BNDCFGS, true);
+
 	memcpy(vmx_msr_bitmap_legacy_x2apic,
 			vmx_msr_bitmap_legacy, PAGE_SIZE);
 	memcpy(vmx_msr_bitmap_longmode_x2apic,
--
1.7.1

0003-KVM-X86-Intel-MPX-vmx-and-msr-handle.patch
Description: 0003-KVM-X86-Intel-MPX-vmx-and-msr-handle.patch
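The patch adds the BNDCFGS bits to the "opt" sets passed to adjust_vmx_controls(), so they are used only when the CPU advertises them. The sketch below shows that min/opt negotiation in isolation. It is an illustration and not the kernel function: adjust_controls() is a hypothetical name, and the two halves of the VMX capability MSR are passed in as parameters (allowed0 for bits that must be 1, allowed1 for bits that may be 1) instead of being read with rdmsr.

```c
#include <assert.h>
#include <stdint.h>

#define VM_ENTRY_LOAD_IA32_PAT 0x00004000u
#define VM_ENTRY_LOAD_BNDCFGS  0x00010000u

/*
 * Negotiate a VMX control word.
 *   min      - bits the caller cannot work without
 *   opt      - bits the caller would like if available
 *   allowed0 - low half of the capability MSR (bits forced to 1)
 *   allowed1 - high half of the capability MSR (bits allowed to be 1)
 * Returns 0 and stores the result, or -1 if a required bit is missing.
 */
static int adjust_controls(uint32_t min, uint32_t opt,
                           uint32_t allowed0, uint32_t allowed1,
                           uint32_t *result)
{
    uint32_t ctl = min | opt;

    ctl &= allowed1; /* clear everything the CPU does not support */
    ctl |= allowed0; /* set everything the CPU insists on */

    if (min & ~ctl) {
        return -1;   /* a mandatory control is unavailable */
    }
    *result = ctl;
    return 0;
}
```

Because the BNDCFGS controls are optional, a CPU without them still passes the negotiation and simply ends up with those bits clear in the resulting control word.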
[PATCH 4/4] KVM/X86: Enable Intel MPX for guest
From 62553aebb7b72f1203fefc59dd4d8969e4216ddb Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Tue, 3 Dec 2013 07:34:32 +0800
Subject: [PATCH 4/4] KVM/X86: Enable Intel MPX for guest

Signed-off-by: Xudong Hao <xudong@intel.com>
Signed-off-by: Liu Jinsong <jinsong@intel.com>
---
 arch/x86/kvm/cpuid.c |    2 +-
 arch/x86/kvm/x86.c   |    3 +++
 arch/x86/kvm/x86.h   |    3 ++-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 2d661e6..e30d4ce 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -303,7 +303,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	/* cpuid 7.0.ebx */
 	const u32 kvm_supported_word9_x86_features =
 		F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
-		F(BMI2) | F(ERMS) | f_invpcid | F(RTM);
+		F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | F(MPX);

 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1657ca2..950fdd1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -597,6 +597,9 @@ int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 	if (xcr0 & ~valid_bits)
 		return 1;

+	if ((!(xcr0 & XSTATE_BNDREGS)) != (!(xcr0 & XSTATE_BNDCSR)))
+		return 1;
+
 	kvm_put_guest_xcr0(vcpu);
 	vcpu->arch.xcr0 = xcr0;

diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 587fb9e..985e40e 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -122,7 +122,8 @@ int kvm_write_guest_virt_system(struct x86_emulate_ctxt *ctxt,
 	gva_t addr, void *val, unsigned int bytes,
 	struct x86_exception *exception);

-#define KVM_SUPPORTED_XCR0	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+#define KVM_SUPPORTED_XCR0	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM \
+				| XSTATE_BNDREGS | XSTATE_BNDCSR)
 extern u64 host_xcr0;

 extern struct static_key kvm_no_apic_vcpu;
--
1.7.1

0004-KVM-X86-Enable-Intel-MPX-for-guest.patch
Description: 0004-KVM-X86-Enable-Intel-MPX-for-guest.patch
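The XCR0 checks that __kvm_set_xcr() ends up with after this series can be summarised in one predicate. The version below is a sketch for illustration: xcr0_is_valid() is a hypothetical name, it returns a bool instead of the kernel's 0/1 convention, and the "supported" argument stands in for the valid_bits value the kernel derives per vcpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XSTATE_FP      (UINT64_C(1) << 0)
#define XSTATE_SSE     (UINT64_C(1) << 1)
#define XSTATE_YMM     (UINT64_C(1) << 2)
#define XSTATE_BNDREGS (UINT64_C(1) << 3)
#define XSTATE_BNDCSR  (UINT64_C(1) << 4)

/* Return true if a guest may load this value into XCR0. */
static bool xcr0_is_valid(uint64_t xcr0, uint64_t supported)
{
    if (!(xcr0 & XSTATE_FP)) {
        return false; /* x87 state must always be enabled */
    }
    if ((xcr0 & XSTATE_YMM) && !(xcr0 & XSTATE_SSE)) {
        return false; /* AVX state requires SSE state */
    }
    if (xcr0 & ~supported) {
        return false; /* bit not offered to this guest */
    }
    if (!(xcr0 & XSTATE_BNDREGS) != !(xcr0 & XSTATE_BNDCSR)) {
        return false; /* the two MPX components go together */
    }
    return true;
}
```

The last test is the one added by this patch: enabling only one of the two MPX state components is rejected.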
[PATCH 0/2] Intel MPX support at Qemu side
Intel has released a new version of the Intel Architecture Instruction Set Extensions Programming Reference, adding new features such as AVX-512 and MPX. Refer to
http://download-software.intel.com/sites/default/files/319433-015.pdf

These 2 patches are preparatory patches on the QEMU side to support the Intel MPX feature.

PATCH 1/2 fixes a minor bug in how cpuid leaf 0x0d is parsed;
PATCH 2/2 exposes cpuid leaf (0xd, 3) and (0xd, 4) to the guest, fixes ebx and re-calculates ecx of cpuid leaf (0xd, 0).

Thanks,
Jinsong
[PATCH 1/2] target-i386: fix cpuid leaf 0x0d
From e4b58c7bafc4d9f913a572a1b1cfee91c92f1637 Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Fri, 22 Nov 2013 00:24:16 +0800
Subject: [PATCH 1/2] target-i386: fix cpuid leaf 0x0d

Fix cpuid leaf 0x0d which incorrectly parsed eax and ebx.

Signed-off-by: Liu Jinsong jinsong@intel.com
---
 target-i386/cpu.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 864c80e..544b57f 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -335,7 +335,7 @@ typedef struct ExtSaveArea {
 
 static const ExtSaveArea ext_save_areas[] = {
     [2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX,
-            .offset = 0x100, .size = 0x240 },
+            .offset = 0x240, .size = 0x100 },
 };
 
 const char *get_register_name_32(unsigned int reg)
@@ -2225,8 +2225,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             const ExtSaveArea *esa = &ext_save_areas[count];
             if ((env->features[esa->feature] & esa->bits) == esa->bits &&
                 (kvm_mask & (1 << count)) != 0) {
-                *eax = esa->offset;
-                *ebx = esa->size;
+                *eax = esa->size;
+                *ebx = esa->offset;
             }
         }
         break;
-- 
1.7.1
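For CPUID leaf 0xD sub-leaves n >= 2, EAX reports the size of the state component and EBX its offset within the XSAVE area, which is the mapping this patch establishes. The following is a small sketch of that mapping using the AVX values from the patch (offset 0x240, size 0x100). The struct and helper names are illustrative stand-ins, not QEMU's own definitions.

```c
#include <stdint.h>

/* Illustrative stand-in for the offset/size fields of QEMU's ExtSaveArea. */
struct ext_save_area {
    uint32_t offset; /* byte offset of the component in the XSAVE area */
    uint32_t size;   /* size of the component in bytes */
};

/* AVX (YMM) state component, sub-leaf 2, values taken from the patch. */
static const struct ext_save_area avx_area = { .offset = 0x240, .size = 0x100 };

/*
 * Hypothetical helper: fill EAX/EBX for CPUID.(EAX=0xD, ECX=n), n >= 2.
 * EAX carries the size, EBX carries the offset.
 */
static void cpuid_0xd_subleaf(const struct ext_save_area *esa,
                              uint32_t *eax, uint32_t *ebx)
{
    *eax = esa->size;
    *ebx = esa->offset;
}
```

Because the patch swaps both the table values and the register assignment, the guest-visible result is the same before and after it; only the field names now match their meaning.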
[PATCH 2/2] target-i386: Intel MPX support
From aac033473bc88befe39a9add99820c0a7118ac90 Mon Sep 17 00:00:00 2001 From: root root@ljs.(none) Date: Fri, 22 Nov 2013 00:24:35 +0800 Subject: [PATCH 2/2] target-i386: Intel MPX support Expose cpuid leaf (0xd, 3) and (0xd, 4) to guest. Fix ebx and re-calculate ecx of cpuid leaf (0xd, 0). Signed-off-by: Liu Jinsong jinsong@intel.com --- target-i386/cpu.c | 34 ++ target-i386/cpu.h |1 + 2 files changed, 27 insertions(+), 8 deletions(-) diff --git a/target-i386/cpu.c b/target-i386/cpu.c index 544b57f..7d04f28 100644 --- a/target-i386/cpu.c +++ b/target-i386/cpu.c @@ -330,12 +330,12 @@ X86RegisterInfo32 x86_reg_info_32[CPU_NB_REGS32] = { typedef struct ExtSaveArea { uint32_t feature, bits; -uint32_t offset, size; } ExtSaveArea; static const ExtSaveArea ext_save_areas[] = { -[2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX, -.offset = 0x240, .size = 0x100 }, +[2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX }, +[3] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX }, +[4] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX }, }; const char *get_register_name_32(unsigned int reg) @@ -2204,9 +2204,11 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, ((uint64_t)kvm_arch_get_supported_cpuid(s, 0xd, 0, R_EDX) 32); if (count == 0) { -*ecx = 0x240; +*ebx = *ecx = 0x240; for (i = 2; i ARRAY_SIZE(ext_save_areas); i++) { +uint32_t offset, size; const ExtSaveArea *esa = ext_save_areas[i]; + if ((env-features[esa-feature] esa-bits) == esa-bits (kvm_mask (1 i)) != 0) { if (i 32) { @@ -2214,19 +2216,35 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, } else { *edx |= 1 (i - 32); } -*ecx = MAX(*ecx, esa-offset + esa-size); + +size = kvm_arch_get_supported_cpuid(s, 0xd, i, R_EAX); +offset = kvm_arch_get_supported_cpuid(s, 0xd, i, R_EBX); +*ecx = MAX(*ecx, offset + size); + +/* + * EBX here just in order to + * 1. keep compatible with old qemu version, take AVX + *into account; + * 2. keep compatible with old kernel version. 
Currently + *KVM has bug when expose cpuid 0xd to guest (include + *static value when guest booting and dynamic value + *when guest enables XCR0 features. EBX here can + *co-work with old buggy and new updated KVM, keep + *same value independent to CPU and kernel version. + */ +if (i == 2) +*ebx = MAX(*ebx, offset + size); } } *eax |= kvm_mask (XSTATE_FP | XSTATE_SSE); -*ebx = *ecx; } else if (count == 1) { *eax = kvm_arch_get_supported_cpuid(s, 0xd, 1, R_EAX); } else if (count ARRAY_SIZE(ext_save_areas)) { const ExtSaveArea *esa = ext_save_areas[count]; if ((env-features[esa-feature] esa-bits) == esa-bits (kvm_mask (1 count)) != 0) { -*eax = esa-size; -*ebx = esa-offset; +*eax = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EAX); +*ebx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EBX); } } break; diff --git a/target-i386/cpu.h b/target-i386/cpu.h index ea373e8..9a838d1 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -545,6 +545,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS]; #define CPUID_7_0_EBX_ERMS (1 9) #define CPUID_7_0_EBX_INVPCID (1 10) #define CPUID_7_0_EBX_RTM (1 11) +#define CPUID_7_0_EBX_MPX (1 14) #define CPUID_7_0_EBX_RDSEED (1 18) #define CPUID_7_0_EBX_ADX (1 19) #define CPUID_7_0_EBX_SMAP (1 20) -- 1.7.1-- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/4] Intel MPX support for KVM
Intel has released a new version of the Intel Architecture Instruction Set Extensions Programming Reference, adding new features such as AVX-512 and MPX. Refer to
http://download-software.intel.com/sites/default/files/319433-015.pdf

These patches add Intel MPX support to KVM.

PATCH 1/4 adds the MPX definitions;
PATCH 2/4 re-calculates cpuid(0xd,0) EBX;
PATCH 3/4 enables Intel Memory Protection Extension for the guest;
PATCH 4/4 adds the Intel MPX vmx and msr handling.

These patches are based on the work of my ex-colleague Xudong; I am now helping him push them.

Thanks,
Jinsong
[PATCH 1/4] X86: Intel MPX definiation
From 3a1a011100b38a275d8c95468c12c483e316bb15 Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Fri, 29 Nov 2013 01:27:00 +0800
Subject: [PATCH 1/4] X86: Intel MPX definiation

Signed-off-by: Xudong Hao xudong@intel.com
Reviewed-by: Liu Jinsong jinsong@intel.com
---
 arch/x86/include/asm/cpufeature.h |    2 ++
 arch/x86/include/asm/xsave.h      |    5 ++++-
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 89270b4..1b00b01 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -216,6 +216,7 @@
 #define X86_FEATURE_ERMS	(9*32+ 9) /* Enhanced REP MOVSB/STOSB */
 #define X86_FEATURE_INVPCID	(9*32+10) /* Invalidate Processor Context ID */
 #define X86_FEATURE_RTM		(9*32+11) /* Restricted Transactional Memory */
+#define X86_FEATURE_MPX		(9*32+14) /* Memory Protection Extension */
 #define X86_FEATURE_RDSEED	(9*32+18) /* The RDSEED instruction */
 #define X86_FEATURE_ADX		(9*32+19) /* The ADCX and ADOX instructions */
 #define X86_FEATURE_SMAP	(9*32+20) /* Supervisor Mode Access Prevention */
@@ -330,6 +331,7 @@ extern const char * const x86_power_flags[32];
 #define cpu_has_perfctr_l2	boot_cpu_has(X86_FEATURE_PERFCTR_L2)
 #define cpu_has_cx8		boot_cpu_has(X86_FEATURE_CX8)
 #define cpu_has_cx16		boot_cpu_has(X86_FEATURE_CX16)
+#define cpu_has_mpx		boot_cpu_has(X86_FEATURE_MPX)
 #define cpu_has_eager_fpu	boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoext		boot_cpu_has(X86_FEATURE_TOPOEXT)
 
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 0415cda..d3e3ea5 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -9,6 +9,8 @@
 #define XSTATE_FP	0x1
 #define XSTATE_SSE	0x2
 #define XSTATE_YMM	0x4
+#define XSTATE_BNDREGS	0x8
+#define XSTATE_BNDCSR	0x10
 
 #define XSTATE_FPSSE	(XSTATE_FP | XSTATE_SSE)
 
@@ -23,7 +25,8 @@
 /*
  * These are the features that the OS can handle currently.
  */
-#define XCNTXT_MASK	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+#define XCNTXT_MASK	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM | \
+			XSTATE_BNDREGS | XSTATE_BNDCSR)
 
 #ifdef CONFIG_X86_64
 #define REX_PREFIX	0x48,
-- 
1.7.1
[PATCH 2/4] KVM/X86: Fix xsave cpuid exposing bug
From b060be65e466291c91963e58c4880ec614d0b294 Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Fri, 29 Nov 2013 01:27:53 +0800
Subject: [PATCH 2/4] KVM/X86: Fix xsave cpuid exposing bug

EBX of cpuid(0xD, 0) is dynamic per XCR0 features enable/disable.
Bit 63 of XCR0 is reserved for future expansion.

Signed-off-by: Liu Jinsong jinsong@intel.com
---
 arch/x86/include/asm/xsave.h |    2 ++
 arch/x86/kvm/cpuid.c         |    4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index d3e3ea5..6120e74 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -13,6 +13,8 @@
 #define XSTATE_BNDCSR	0x10
 
 #define XSTATE_FPSSE	(XSTATE_FP | XSTATE_SSE)
+/* Bit 63 of XCR0 is reserved for future expansion */
+#define XSTATE_EXTEND_MASK	(~(XSTATE_FPSSE | (1 << 63)))
 
 #define FXSAVE_SIZE	512
 
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index c697625..a8ce117 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -28,7 +28,7 @@ static u32 xstate_required_size(u64 xstate_bv)
 	int feature_bit = 0;
 	u32 ret = XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET;
 
-	xstate_bv &= ~XSTATE_FPSSE;
+	xstate_bv &= XSTATE_EXTEND_MASK;
 	while (xstate_bv) {
 		if (xstate_bv & 0x1) {
 			u32 eax, ebx, ecx, edx;
@@ -74,7 +74,7 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
 		vcpu->arch.guest_supported_xcr0 =
 			(best->eax | ((u64)best->edx << 32)) &
 			host_xcr0 & KVM_SUPPORTED_XCR0;
-		vcpu->arch.guest_xstate_size =
+		vcpu->arch.guest_xstate_size = best->ebx =
 			xstate_required_size(vcpu->arch.guest_supported_xcr0);
 	}
 
-- 
1.7.1
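To illustrate why EBX of cpuid(0xD, 0) depends on the enabled XCR0 bits, here is a userspace sketch of the size computation: start from the legacy area plus header (0x240) and take the maximum of offset + size over each enabled extended component. It is only a sketch under stated assumptions. The table stands in for the cpuid_count(0xD, n) queries the kernel performs, its MPX entries use the offsets and the 0x40 sizes quoted in the later target-i386 bugfix in this thread, and the sketch writes the reserved bit as 1ULL << 63 so the shift stays within a 64-bit type. The names required_size and components are hypothetical.

```c
#include <stdint.h>

#define XSTATE_FP     0x1ULL
#define XSTATE_SSE    0x2ULL
#define XSTATE_FPSSE  (XSTATE_FP | XSTATE_SSE)
/* Bit 63 is reserved; 1ULL keeps the shift within a 64-bit type. */
#define XSTATE_EXTEND_MASK (~(XSTATE_FPSSE | (1ULL << 63)))

/* 512-byte legacy FXSAVE area plus the 64-byte XSAVE header. */
#define LEGACY_AND_HEADER_SIZE 0x240u

struct component {
    uint32_t offset;
    uint32_t size;
};

/* Stand-in for cpuid_count(0xD, n, ...): illustrative values only. */
static const struct component components[] = {
    [2] = { 0x240, 0x100 }, /* YMM */
    [3] = { 0x3c0, 0x40 },  /* BNDREGS */
    [4] = { 0x400, 0x40 },  /* BNDCSR */
};
#define NR_COMPONENTS (sizeof(components) / sizeof(components[0]))

/* Hypothetical userspace analogue of xstate_required_size(). */
static uint32_t required_size(uint64_t xstate_bv)
{
    uint32_t ret = LEGACY_AND_HEADER_SIZE;
    unsigned int bit = 0;

    /* FP, SSE and the reserved bit 63 never add to the size. */
    xstate_bv &= XSTATE_EXTEND_MASK;
    while (xstate_bv) {
        if ((xstate_bv & 1) && bit < NR_COMPONENTS) {
            uint32_t end = components[bit].offset + components[bit].size;

            if (end > ret)
                ret = end;
        }
        xstate_bv >>= 1;
        bit++;
    }
    return ret;
}
```

Under these assumed values, enabling only FP|SSE yields 0x240, adding YMM yields 0x340, and adding both MPX components yields 0x440.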
[PATCH 3/4] KVM/X86: Enable Intel MPX for guest
From 11ae33723027c7b8e53a8c109f127800d7f0ad6e Mon Sep 17 00:00:00 2001 From: Liu Jinsong jinsong@intel.com Date: Fri, 29 Nov 2013 01:28:19 +0800 Subject: [PATCH 3/4] KVM/X86: Enable Intel MPX for guest Enable Intel Memory Protection Extension for guest. Signed-off-by: Xudong Hao xudong@intel.com Reviewed-by: Liu Jinsong jinsong@intel.com --- arch/x86/kvm/cpuid.c |4 ++-- arch/x86/kvm/x86.c | 14 -- arch/x86/kvm/x86.h |3 ++- 3 files changed, 16 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index a8ce117..e30d4ce 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -75,7 +75,7 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu) (best-eax | ((u64)best-edx 32)) host_xcr0 KVM_SUPPORTED_XCR0; vcpu-arch.guest_xstate_size = best-ebx = - xstate_required_size(vcpu-arch.guest_supported_xcr0); + xstate_required_size(vcpu-arch.xcr0); } kvm_pmu_cpuid_update(vcpu); @@ -303,7 +303,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, /* cpuid 7.0.ebx */ const u32 kvm_supported_word9_x86_features = F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) | - F(BMI2) | F(ERMS) | f_invpcid | F(RTM); + F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | F(MPX); /* all calls to cpuid_count() should be made on the same cpu */ get_cpu(); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 21ef1ba..6e38698 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -576,13 +576,13 @@ static void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu) int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr) { - u64 xcr0; + u64 xcr0 = xcr; + u64 old_xcr0 = vcpu-arch.xcr0; u64 valid_bits; /* Only support XCR_XFEATURE_ENABLED_MASK(xcr0) now */ if (index != XCR_XFEATURE_ENABLED_MASK) return 1; - xcr0 = xcr; if (!(xcr0 XSTATE_FP)) return 1; if ((xcr0 XSTATE_YMM) !(xcr0 XSTATE_SSE)) @@ -597,8 +597,15 @@ int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr) if (xcr0 ~valid_bits) return 1; + if ((!(xcr0 XSTATE_BNDREGS)) != (!(xcr0 
XSTATE_BNDCSR))) + return 1; + kvm_put_guest_xcr0(vcpu); vcpu-arch.xcr0 = xcr0; + + if ((xcr0 ^ old_xcr0) XSTATE_EXTEND_MASK) + kvm_update_cpuid(vcpu); + return 0; } @@ -5960,6 +5967,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) preempt_disable(); kvm_x86_ops-prepare_guest_switch(vcpu); + if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE) + (vcpu-arch.xcr0 (u64)(XSTATE_BNDREGS | XSTATE_BNDCSR))) + kvm_x86_ops-fpu_activate(vcpu); if (vcpu-fpu_active) kvm_load_guest_fpu(vcpu); kvm_load_guest_xcr0(vcpu); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 587fb9e..985e40e 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -122,7 +122,8 @@ int kvm_write_guest_virt_system(struct x86_emulate_ctxt *ctxt, gva_t addr, void *val, unsigned int bytes, struct x86_exception *exception); -#define KVM_SUPPORTED_XCR0 (XSTATE_FP | XSTATE_SSE | XSTATE_YMM) +#define KVM_SUPPORTED_XCR0 (XSTATE_FP | XSTATE_SSE | XSTATE_YMM \ + | XSTATE_BNDREGS | XSTATE_BNDCSR) extern u64 host_xcr0; extern struct static_key kvm_no_apic_vcpu; -- 1.7.1 0003-KVM-X86-Enable-Intel-MPX-for-guest.patch Description: 0003-KVM-X86-Enable-Intel-MPX-for-guest.patch
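The new test in __kvm_set_xcr() re-runs kvm_update_cpuid() only when an extended state bit changes between the old and new XCR0. A minimal sketch of that condition follows, assuming the mask definition from PATCH 2/4 (written with 1ULL here so the shift is 64-bit). The helper name is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define XSTATE_FPSSE 0x3ULL
/* Everything except FP, SSE and the reserved bit 63. */
#define XSTATE_EXTEND_MASK (~(XSTATE_FPSSE | (1ULL << 63)))

/*
 * Hypothetical helper: true when at least one extended state bit
 * differs between the old and the new XCR0 value.
 */
static bool xcr0_change_needs_cpuid_update(uint64_t old_xcr0,
                                           uint64_t new_xcr0)
{
    return ((new_xcr0 ^ old_xcr0) & XSTATE_EXTEND_MASK) != 0;
}
```

Toggling only SSE (0x1 to 0x3) does not trigger an update, while enabling YMM or the MPX bits does.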
[PATCH 4/4] KVM/X86: Intel MPX vmx and msr handle
From 7532bdffe9f74db65f6eff733cb227a66bef932e Mon Sep 17 00:00:00 2001 From: Liu Jinsong jinsong@intel.com Date: Sat, 30 Nov 2013 00:27:02 +0800 Subject: [PATCH 4/4] KVM/X86: Intel MPX vmx and msr handle Signed-off-by: Xudong Hao xudong@intel.com Reviewed-by: Liu Jinsong jinsong@intel.com --- arch/x86/include/asm/vmx.h|2 ++ arch/x86/include/uapi/asm/msr-index.h |1 + arch/x86/kvm/vmx.c| 13 +++-- 3 files changed, 14 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index 966502d..1bf4681 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -85,6 +85,7 @@ #define VM_EXIT_SAVE_IA32_EFER 0x0010 #define VM_EXIT_LOAD_IA32_EFER 0x0020 #define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER 0x0040 +#define VM_EXIT_CLEAR_BNDCFGS 0x0080 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR 0x00036dff @@ -95,6 +96,7 @@ #define VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL 0x2000 #define VM_ENTRY_LOAD_IA32_PAT 0x4000 #define VM_ENTRY_LOAD_IA32_EFER 0x8000 +#define VM_ENTRY_LOAD_BNDCFGS 0x0001 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x11ff diff --git a/arch/x86/include/uapi/asm/msr-index.h b/arch/x86/include/uapi/asm/msr-index.h index 37813b5..2a418c4 100644 --- a/arch/x86/include/uapi/asm/msr-index.h +++ b/arch/x86/include/uapi/asm/msr-index.h @@ -294,6 +294,7 @@ #define MSR_SMI_COUNT 0x0034 #define MSR_IA32_FEATURE_CONTROL0x003a #define MSR_IA32_TSC_ADJUST 0x003b +#define MSR_IA32_BNDCFGS 0x0d90 #define FEATURE_CONTROL_LOCKED (10) #define FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX (11) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index b2fe1c2..aa23edf 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -439,6 +439,7 @@ struct vcpu_vmx { #endif int gs_ldt_reload_needed; int fs_reload_needed; + u64 msr_host_bndcfgs; } host_state; struct { int vm86_active; @@ -1647,6 +1648,8 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu) if (is_long_mode(vmx-vcpu)) wrmsrl(MSR_KERNEL_GS_BASE, vmx-msr_guest_kernel_gs_base); #endif + 
if (cpu_has_mpx) + rdmsrl(MSR_IA32_BNDCFGS, vmx-host_state.msr_host_bndcfgs); for (i = 0; i vmx-save_nmsrs; ++i) kvm_set_shared_msr(vmx-guest_msrs[i].index, vmx-guest_msrs[i].data, @@ -1684,6 +1687,8 @@ static void __vmx_load_host_state(struct vcpu_vmx *vmx) #ifdef CONFIG_X86_64 wrmsrl(MSR_KERNEL_GS_BASE, vmx-msr_host_kernel_gs_base); #endif + if (cpu_has_mpx) + wrmsrl(MSR_IA32_BNDCFGS, vmx-host_state.msr_host_bndcfgs); /* * If the FPU is not active (through the host task or * the guest vcpu), then restore the cr0.TS bit. @@ -2800,7 +2805,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf) min |= VM_EXIT_HOST_ADDR_SPACE_SIZE; #endif opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT | - VM_EXIT_ACK_INTR_ON_EXIT; + VM_EXIT_ACK_INTR_ON_EXIT | VM_EXIT_CLEAR_BNDCFGS; if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS, _vmexit_control) 0) return -EIO; @@ -2817,7 +2822,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf) _pin_based_exec_control = ~PIN_BASED_POSTED_INTR; min = 0; - opt = VM_ENTRY_LOAD_IA32_PAT; + opt = VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS; if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS, _vmentry_control) 0) return -EIO; @@ -8636,6 +8641,10 @@ static int __init vmx_init(void) vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_CS, false); vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false); vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false); + if ((vmcs_config.vmentry_ctrl VM_ENTRY_LOAD_BNDCFGS) + (vmcs_config.vmexit_ctrl VM_EXIT_CLEAR_BNDCFGS)) + vmx_disable_intercept_for_msr(MSR_IA32_BNDCFGS, true); + memcpy(vmx_msr_bitmap_legacy_x2apic, vmx_msr_bitmap_legacy, PAGE_SIZE); memcpy(vmx_msr_bitmap_longmode_x2apic, -- 1.7.1 0004-KVM-X86-Intel-MPX-vmx-and-msr-handle.patch Description: 0004-KVM-X86-Intel-MPX-vmx-and-msr-handle.patch
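The patch requests the BNDCFGS entry/exit controls as optional bits, and vmx_init() then disables the MSR intercept only if both were granted. The sketch below models the min/opt negotiation in userspace. It is loosely modeled on adjust_vmx_controls(), with the two halves of the capability MSR passed in as parameters instead of being read with rdmsr. The function name, parameter names and the -1 error value are assumptions of this sketch.

```c
#include <stdint.h>

/*
 * Userspace sketch of min/opt control negotiation.
 *   allowed0: bits the CPU requires to be 1 (low half of the capability MSR)
 *   allowed1: bits the CPU allows to be 1   (high half of the capability MSR)
 * Returns 0 and stores the resulting control word, or -1 if a required
 * (min) bit is not supported.
 */
static int adjust_controls(uint32_t ctl_min, uint32_t ctl_opt,
                           uint32_t allowed0, uint32_t allowed1,
                           uint32_t *result)
{
    uint32_t ctl = ctl_min | ctl_opt;

    ctl &= allowed1; /* drop bits the CPU does not allow to be 1 */
    ctl |= allowed0; /* force bits the CPU requires to be 1 */

    /* Every required bit must have survived the masking. */
    if (ctl_min & ~ctl)
        return -1;

    *result = ctl;
    return 0;
}
```

An unsupported optional bit is silently dropped from the result, which is why the caller has to test the final control word before relying on the feature.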
RE: [PATCH 1/2] target-i386: fix cpuid leaf 0x0d
Paolo Bonzini wrote:
> On 29/11/2013 14:15, Liu, Jinsong wrote:
>> From e4b58c7bafc4d9f913a572a1b1cfee91c92f1637 Mon Sep 17 00:00:00 2001
>> From: Liu Jinsong jinsong@intel.com
>> Date: Fri, 22 Nov 2013 00:24:16 +0800
>> Subject: [PATCH 1/2] target-i386: fix cpuid leaf 0x0d
>>
>> Fix cpuid leaf 0x0d which incorrectly parsed eax and ebx.
>
> There is no visible change, right (the two hunks cancel each other)?
> Since you will have to post a v2, please make this explicit in the
> commit message.

OK, I will add an explicit commit message, or drop this patch if needed.

Thanks,
Jinsong

>> Signed-off-by: Liu Jinsong jinsong@intel.com
>> ---
>>  target-i386/cpu.c |    6 +++---
>>  1 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
>> index 864c80e..544b57f 100644
>> --- a/target-i386/cpu.c
>> +++ b/target-i386/cpu.c
>> @@ -335,7 +335,7 @@ typedef struct ExtSaveArea {
>>
>>  static const ExtSaveArea ext_save_areas[] = {
>>      [2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX,
>> -            .offset = 0x100, .size = 0x240 },
>> +            .offset = 0x240, .size = 0x100 },
>>  };
>>
>>  const char *get_register_name_32(unsigned int reg)
>> @@ -2225,8 +2225,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>              const ExtSaveArea *esa = &ext_save_areas[count];
>>              if ((env->features[esa->feature] & esa->bits) == esa->bits &&
>>                  (kvm_mask & (1 << count)) != 0) {
>> -                *eax = esa->offset;
>> -                *ebx = esa->size;
>> +                *eax = esa->size;
>> +                *ebx = esa->offset;
>>              }
>>          }
>>          break;
>> --
>> 1.7.1
RE: [PATCH 2/2] target-i386: Intel MPX support
Paolo Bonzini wrote:
> On 29/11/2013 14:17, Liu, Jinsong wrote:
>> From aac033473bc88befe39a9add99820c0a7118ac90 Mon Sep 17 00:00:00 2001
>> From: root root@ljs.(none)
>> Date: Fri, 22 Nov 2013 00:24:35 +0800
>> Subject: [PATCH 2/2] target-i386: Intel MPX support
>>
>> Expose cpuid leaf (0xd, 3) and (0xd, 4) to guest.
>> Fix ebx and re-calculate ecx of cpuid leaf (0xd, 0).
>
> There is no reason to get the size and offset from the host. Peter
> Anvin confirmed that the sizes and offsets will never change (as
> should be the case for migration to work across different CPU
> versions). In fact, the size and offset is documented for every XSAVE
> feature except MPX in the copy I have of the Intel documentation.

If the sizes and offsets will never change, what is the harm in getting them from the host?

> Please get the size and offset from the documentation, if it has been
> updated, or from a real host, and hardcode them in QEMU.

Hmm, the problem is that what I was told does not match what a real test shows :(
For example, I was told XSTATE_BNDCSR_SIZE is 0x40, but a real test shows it is 0x10.
Maybe getting the values from real h/w is no worse than hardcoding them?

Thanks Jinsong Signed-off-by: Liu Jinsong jinsong@intel.com --- target-i386/cpu.c | 34 ++ target-i386/cpu.h |1 + 2 files changed, 27 insertions(+), 8 deletions(-) diff --git a/target-i386/cpu.c b/target-i386/cpu.c index 544b57f..7d04f28 100644 --- a/target-i386/cpu.c +++ b/target-i386/cpu.c @@ -330,12 +330,12 @@ X86RegisterInfo32 x86_reg_info_32[CPU_NB_REGS32] = { typedef struct ExtSaveArea { uint32_t feature, bits; -uint32_t offset, size; } ExtSaveArea; static const ExtSaveArea ext_save_areas[] = { -[2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX, -.offset = 0x240, .size = 0x100 }, +[2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX }, +[3] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX }, +[4] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX }, }; const char *get_register_name_32(unsigned int reg) @@ -2204,9 +2204,11 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, ((uint64_t)kvm_arch_get_supported_cpuid(s, 0xd, 0, R_EDX) 32); if (count == 0) { -*ecx = 0x240; +*ebx = *ecx = 0x240; for (i = 2; i ARRAY_SIZE(ext_save_areas); i++) { +uint32_t offset, size; const ExtSaveArea *esa = ext_save_areas[i]; + if ((env-features[esa-feature] esa-bits) == esa-bits (kvm_mask (1 i)) != 0) { if (i 32) { @@ -2214,19 +2216,35 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, } else { *edx |= 1 (i - 32); } -*ecx = MAX(*ecx, esa-offset + esa-size); + +size = kvm_arch_get_supported_cpuid(s, 0xd, i, R_EAX); +offset = kvm_arch_get_supported_cpuid(s, 0xd, i, R_EBX); + *ecx = MAX(*ecx, offset + size); + +/* + * EBX here just in order to + * 1. keep compatible with old qemu version, take AVX + *into account; + * 2. keep compatible with old kernel version. Currently + *KVM has bug when expose cpuid 0xd to guest (include + *static value when guest booting and dynamic value + *when guest enables XCR0 features. EBX here can + * co-work with old buggy and new updated KVM, keep + *same value independent to CPU and kernel version. 
+ */ +if (i == 2) +*ebx = MAX(*ebx, offset + size); } } *eax |= kvm_mask (XSTATE_FP | XSTATE_SSE); - *ebx = *ecx; } else if (count == 1) { *eax = kvm_arch_get_supported_cpuid(s, 0xd, 1, R_EAX); } else if (count ARRAY_SIZE(ext_save_areas)) { const ExtSaveArea *esa = ext_save_areas[count]; if ((env-features[esa-feature] esa-bits) == esa-bits (kvm_mask (1 count)) != 0) { -*eax = esa-size; -*ebx = esa-offset; +*eax = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EAX); +*ebx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EBX); } } break; diff --git a/target-i386/cpu.h b/target-i386/cpu.h index ea373e8..9a838d1 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -545,6 +545,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS]; #define CPUID_7_0_EBX_ERMS (1 9) #define CPUID_7_0_EBX_INVPCID (1 10) #define CPUID_7_0_EBX_RTM (1 11) +#define
RE: [PATCH 3/4] KVM/X86: Enable Intel MPX for guest
Paolo Bonzini wrote:
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index a8ce117..e30d4ce 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -75,7 +75,7 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
>>  			(best->eax | ((u64)best->edx << 32)) &
>>  			host_xcr0 & KVM_SUPPORTED_XCR0;
>>  		vcpu->arch.guest_xstate_size = best->ebx =
>> -			xstate_required_size(vcpu->arch.guest_supported_xcr0);
>> +			xstate_required_size(vcpu->arch.xcr0);
>>  	}
>>
>>  	kvm_pmu_cpuid_update(vcpu);
>
> ...
>
>>  	kvm_put_guest_xcr0(vcpu);
>>  	vcpu->arch.xcr0 = xcr0;
>> +
>> +	if ((xcr0 ^ old_xcr0) & XSTATE_EXTEND_MASK)
>> +		kvm_update_cpuid(vcpu);
>> +
>>  	return 0;
>>  }
>
> These hunks should be part of the previous patch.
>
>> @@ -5960,6 +5967,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>  	preempt_disable();
>>
>>  	kvm_x86_ops->prepare_guest_switch(vcpu);
>> +	if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE) &&
>
> Shouldn't be necessary, setting xcr0 fails unless OSXSAVE=1.
>
>> +	    (vcpu->arch.xcr0 & (u64)(XSTATE_BNDREGS | XSTATE_BNDCSR)))
>> +		kvm_x86_ops->fpu_activate(vcpu);
>
> Can you explain this?

No, in fact I am also wondering about it, but since it has been tested I did not change this code. I will double check and drop it if needed (or maybe Xudong can elaborate more?).

Thanks,
Jinsong

>>  	if (vcpu->fpu_active)
>>  		kvm_load_guest_fpu(vcpu);
>>  	kvm_load_guest_xcr0(vcpu);
>> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
>> index 587fb9e..985e40e 100644
>> --- a/arch/x86/kvm/x86.h
>> +++ b/arch/x86/kvm/x86.h
>> @@ -122,7 +122,8 @@ int kvm_write_guest_virt_system(struct x86_emulate_ctxt *ctxt,
>>  	gva_t addr, void *val, unsigned int bytes,
>>  	struct x86_exception *exception);
>>
>> -#define KVM_SUPPORTED_XCR0	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
>> +#define KVM_SUPPORTED_XCR0	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM \
>> +				| XSTATE_BNDREGS | XSTATE_BNDCSR)
>>  extern u64 host_xcr0;
>>
>>  extern struct static_key kvm_no_apic_vcpu;
>
> Otherwise looks straightforward.

Thanks, I will update per your comments.
RE: [Qemu-devel] [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL
Paolo Bonzini wrote:
> On 19/08/2013 16:59, Andreas Färber wrote:
>> qemu-kvm is no longer maintained since 1.3 so it should not be
>> occurring any more.
>>
>> Please use a prefix of "target-i386:" (the directory name) to signal
>> where you are changing code, i.e. x86 only. "bugfix" is not a very
>> telling description of what a patch is doing.
>>
>> (Up to Paolo and Gleb whether they'll fix it or whether they require
>> a resend.)
>
> No, not this time at least. :)
>
> Paolo

Thanks Paolo. Andreas's comments are also good, so I have updated the commit message and will send it out later.

Regards,
Jinsong
RE: [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL
Paolo Bonzini wrote: The patch looks good. Please repost it with checkpatch.pl failures fixed. Paolo Thanks Stefan and Paolo! Updated patch attached. Regards, Jinsong === From a0ddf948d40e42de862543157a5668a1c12faae6 Mon Sep 17 00:00:00 2001 From: Liu Jinsong jinsong@intel.com Date: Mon, 19 Aug 2013 09:33:30 +0800 Subject: [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL This patch is to fix the bug https://bugs.launchpad.net/qemu-kvm/+bug/1207623 IA32_FEATURE_CONTROL is pointless if not expose VMX or SMX bits to cpuid.1.ecx of vcpu. Current qemu-kvm will error return when kvm_put_msrs or kvm_get_msrs. Signed-off-by: Liu Jinsong jinsong@intel.com --- target-i386/kvm.c | 17 +++-- 1 files changed, 15 insertions(+), 2 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 84ac00a..5adeb03 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -65,6 +65,7 @@ static bool has_msr_star; static bool has_msr_hsave_pa; static bool has_msr_tsc_adjust; static bool has_msr_tsc_deadline; +static bool has_msr_feature_control; static bool has_msr_async_pf_en; static bool has_msr_pv_eoi_en; static bool has_msr_misc_enable; @@ -644,6 +645,12 @@ int kvm_arch_init_vcpu(CPUState *cs) qemu_add_vm_change_state_handler(cpu_update_state, env); +c = cpuid_find_entry(cpuid_data.cpuid, 1, 0); +if (c) { +has_msr_feature_control = !!(c-ecx CPUID_EXT_VMX) || + !!(c-ecx CPUID_EXT_SMX); +} + cpuid_data.cpuid.padding = 0; r = kvm_vcpu_ioctl(cs, KVM_SET_CPUID2, cpuid_data); if (r) { @@ -1121,7 +1128,10 @@ static int kvm_put_msrs(X86CPU *cpu, int level) if (hyperv_vapic_recommended()) { kvm_msr_entry_set(msrs[n++], HV_X64_MSR_APIC_ASSIST_PAGE, 0); } -kvm_msr_entry_set(msrs[n++], MSR_IA32_FEATURE_CONTROL, env-msr_ia32_feature_control); +if (has_msr_feature_control) { +kvm_msr_entry_set(msrs[n++], MSR_IA32_FEATURE_CONTROL, + env-msr_ia32_feature_control); +} } if (env-mcg_cap) { int i; @@ -1346,7 +1356,9 @@ static int kvm_get_msrs(X86CPU *cpu) if (has_msr_misc_enable) { 
msrs[n++].index = MSR_IA32_MISC_ENABLE; } -msrs[n++].index = MSR_IA32_FEATURE_CONTROL; +if (has_msr_feature_control) { +msrs[n++].index = MSR_IA32_FEATURE_CONTROL; +} if (!env-tsc_valid) { msrs[n++].index = MSR_IA32_TSC; @@ -1447,6 +1459,7 @@ static int kvm_get_msrs(X86CPU *cpu) break; case MSR_IA32_FEATURE_CONTROL: env-msr_ia32_feature_control = msrs[i].data; +break; default: if (msrs[i].index = MSR_MC0_CTL msrs[i].index MSR_MC0_CTL + (env-mcg_cap 0xff) * 4) { -- 1.7.1 0001-qemu-kvm-bugfix-for-IA32_FEATURE_CONTROL.patch Description: 0001-qemu-kvm-bugfix-for-IA32_FEATURE_CONTROL.patch
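The patch only gets or sets IA32_FEATURE_CONTROL when the guest's CPUID.1:ECX advertises VMX or SMX. A minimal sketch of that gate, assuming the architectural bit positions (VMX is bit 5, SMX is bit 6 of CPUID.1:ECX); the function name is illustrative and does not appear in QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.1:ECX feature bits: VMX is bit 5, SMX is bit 6. */
#define CPUID_EXT_VMX (1u << 5)
#define CPUID_EXT_SMX (1u << 6)

/*
 * Hypothetical helper: the MSR is only meaningful to the guest when
 * at least one of the two features is exposed in its CPUID.
 */
static bool guest_has_feature_control(uint32_t cpuid_1_ecx)
{
    return (cpuid_1_ecx & (CPUID_EXT_VMX | CPUID_EXT_SMX)) != 0;
}
```

In the patch the equivalent result is cached in has_msr_feature_control and checked in both kvm_put_msrs() and kvm_get_msrs().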
RE: [Qemu-devel] [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL
Andreas Färber wrote:
> On 19.08.2013 16:31, Liu, Jinsong wrote:
>> Paolo Bonzini wrote:
>>> The patch looks good. Please repost it with checkpatch.pl
>>> failures fixed.
>>>
>>> Paolo
>>
>> Thanks Stefan and Paolo!
>> Updated patch attached.
>>
>> Regards,
>> Jinsong
>>
>> ===
>>
>> From a0ddf948d40e42de862543157a5668a1c12faae6 Mon Sep 17 00:00:00 2001
>> From: Liu Jinsong jinsong@intel.com
>> Date: Mon, 19 Aug 2013 09:33:30 +0800
>> Subject: [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL
>>
>> This patch is to fix the bug
>> https://bugs.launchpad.net/qemu-kvm/+bug/1207623
>>
>> IA32_FEATURE_CONTROL is pointless if not expose VMX or SMX bits to
>> cpuid.1.ecx of vcpu. Current qemu-kvm will error return when
>> kvm_put_msrs or kvm_get_msrs.
>>
>> Signed-off-by: Liu Jinsong jinsong@intel.com
>
> Jinsong, if this is for upstream QEMU, then the commit message needs
> some small improvements:
>
> qemu-kvm is no longer maintained since 1.3 so it should not be
> occurring any more.

Thanks Andreas! This patch is for qemu-kvm.
As I understand it, some patches are first checked into the qemu-kvm uq/master branch.
This patch fixes c/s 0779caeb1a17f4d3ed14e2925b36ba09b084fb7b of the qemu-kvm uq/master branch (which is there to co-work with kvm IA32_FEATURE_CONTROL, and is currently not yet in upstream qemu).

This patch fixes the bug introduced by 0779caeb1a17f4d3ed14e2925b36ba09b084fb7b of the qemu-kvm uq/master branch. The bug is reported as
https://bugs.launchpad.net/qemu-kvm/+bug/1207623
https://bugs.launchpad.net/qemu/+bug/1213797

Is there anything I misunderstand about upstream qemu and qemu-kvm?

> Please use a prefix of "target-i386:" (the directory name) to signal
> where you are changing code, i.e. x86 only. "bugfix" is not a very
> telling description of what a patch is doing.
>
> (Up to Paolo and Gleb whether they'll fix it or whether they require
> a resend.)
>
> Also please use git-send-email to submit patches and use "PATCH v2"
> etc. for submission as top-level patch:
> http://wiki.qemu.org/Contribute/SubmitAPatch

Thanks, I will update per your comments :)

> One question inline...

---
 target-i386/kvm.c |   17 +++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 84ac00a..5adeb03 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -65,6 +65,7 @@ static bool has_msr_star;
 static bool has_msr_hsave_pa;
 static bool has_msr_tsc_adjust;
 static bool has_msr_tsc_deadline;
+static bool has_msr_feature_control;
 static bool has_msr_async_pf_en;
 static bool has_msr_pv_eoi_en;
 static bool has_msr_misc_enable;
@@ -644,6 +645,12 @@ int kvm_arch_init_vcpu(CPUState *cs)
     qemu_add_vm_change_state_handler(cpu_update_state, env);

+    c = cpuid_find_entry(&cpuid_data.cpuid, 1, 0);
+    if (c) {
+        has_msr_feature_control = !!(c->ecx & CPUID_EXT_VMX) ||
+                                  !!(c->ecx & CPUID_EXT_SMX);
+    }
+
     cpuid_data.cpuid.padding = 0;
     r = kvm_vcpu_ioctl(cs, KVM_SET_CPUID2, &cpuid_data);
     if (r) {
@@ -1121,7 +1128,10 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
         if (hyperv_vapic_recommended()) {
             kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_APIC_ASSIST_PAGE, 0);
         }
-        kvm_msr_entry_set(&msrs[n++], MSR_IA32_FEATURE_CONTROL, env->msr_ia32_feature_control);
+        if (has_msr_feature_control) {
+            kvm_msr_entry_set(&msrs[n++], MSR_IA32_FEATURE_CONTROL,
+                              env->msr_ia32_feature_control);
+        }
     }
     if (env->mcg_cap) {
         int i;
@@ -1346,7 +1356,9 @@ static int kvm_get_msrs(X86CPU *cpu)
     if (has_msr_misc_enable) {
         msrs[n++].index = MSR_IA32_MISC_ENABLE;
     }
-    msrs[n++].index = MSR_IA32_FEATURE_CONTROL;
+    if (has_msr_feature_control) {
+        msrs[n++].index = MSR_IA32_FEATURE_CONTROL;
+    }

     if (!env->tsc_valid) {
         msrs[n++].index = MSR_IA32_TSC;
@@ -1447,6 +1459,7 @@ static int kvm_get_msrs(X86CPU *cpu)
             break;
         case MSR_IA32_FEATURE_CONTROL:
             env->msr_ia32_feature_control = msrs[i].data;
+            break;

Was the fallthrough previously intended? Or is this a second, unmentioned bugfix?

Hmm, it just adds the 'break' that I think patch 0779caeb1a17f4d3ed14e2925b36ba09b084fb7b forgot.

Thanks,
Jinsong

Regards,
Andreas

         default:
             if (msrs[i].index >= MSR_MC0_CTL &&
                 msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 4) {
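The fallthrough question above can be illustrated with a minimal, standalone sketch. It is not QEMU code: `struct demo_env`, the `DEMO_*` constants, and both function names are hypothetical, and the `default` arm only counts how often it runs, so the effect of the missing `break` is visible.

```c
#include <stdint.h>

/* Hypothetical MSR indices and state, for illustration only. */
enum { DEMO_MSR_FEATURE_CONTROL = 0x3a, DEMO_MSR_OTHER = 0x100 };

struct demo_env {
    uint64_t feature_control;
    int default_hits;           /* how often the default arm ran */
};

/* Variant without the break: the FEATURE_CONTROL case also runs default. */
static void store_msr_fallthrough(struct demo_env *env, uint32_t index,
                                  uint64_t data)
{
    switch (index) {
    case DEMO_MSR_FEATURE_CONTROL:
        env->feature_control = data;
        /* fall through - the missing break is the point of this variant */
    default:
        env->default_hits++;
        break;
    }
}

/* Variant with the break: the FEATURE_CONTROL case stops after the store. */
static void store_msr_fixed(struct demo_env *env, uint32_t index,
                            uint64_t data)
{
    switch (index) {
    case DEMO_MSR_FEATURE_CONTROL:
        env->feature_control = data;
        break;
    default:
        env->default_hits++;
        break;
    }
}
```

Both variants store the value; only the first one also executes the `default` arm for the feature-control index, which is the kind of unintended side path the added `break` closes off.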
[PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL
From 1273f8b2e5464ec987facf9942fd3ccc0b69087e Mon Sep 17 00:00:00 2001
From: Liu Jinsong jinsong@intel.com
Date: Mon, 19 Aug 2013 09:33:30 +0800
Subject: [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL

This patch is to fix the bug https://bugs.launchpad.net/qemu-kvm/+bug/1207623

IA32_FEATURE_CONTROL is pointless if not expose VMX or SMX bits to cpuid.1.ecx of vcpu. Current qemu-kvm will error return when kvm_put_msrs or kvm_get_msrs.

Signed-off-by: Liu Jinsong jinsong@intel.com
---
 target-i386/kvm.c |   16 ++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 84ac00a..7facbfe 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -65,6 +65,7 @@ static bool has_msr_star;
 static bool has_msr_hsave_pa;
 static bool has_msr_tsc_adjust;
 static bool has_msr_tsc_deadline;
+static bool has_msr_feature_control;
 static bool has_msr_async_pf_en;
 static bool has_msr_pv_eoi_en;
 static bool has_msr_misc_enable;
@@ -644,6 +645,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
     qemu_add_vm_change_state_handler(cpu_update_state, env);

+    c = cpuid_find_entry(&cpuid_data.cpuid, 1, 0);
+    if (c)
+        has_msr_feature_control = !!(c->ecx & CPUID_EXT_VMX) |
+                                  !!(c->ecx & CPUID_EXT_SMX);
+
     cpuid_data.cpuid.padding = 0;
     r = kvm_vcpu_ioctl(cs, KVM_SET_CPUID2, &cpuid_data);
     if (r) {
@@ -1121,7 +1127,10 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
         if (hyperv_vapic_recommended()) {
             kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_APIC_ASSIST_PAGE, 0);
         }
-        kvm_msr_entry_set(&msrs[n++], MSR_IA32_FEATURE_CONTROL, env->msr_ia32_feature_control);
+        if (has_msr_feature_control) {
+            kvm_msr_entry_set(&msrs[n++], MSR_IA32_FEATURE_CONTROL,
+                              env->msr_ia32_feature_control);
+        }
     }
     if (env->mcg_cap) {
         int i;
@@ -1346,7 +1355,9 @@ static int kvm_get_msrs(X86CPU *cpu)
     if (has_msr_misc_enable) {
         msrs[n++].index = MSR_IA32_MISC_ENABLE;
     }
-    msrs[n++].index = MSR_IA32_FEATURE_CONTROL;
+    if (has_msr_feature_control) {
+        msrs[n++].index = MSR_IA32_FEATURE_CONTROL;
+    }

     if (!env->tsc_valid) {
         msrs[n++].index = MSR_IA32_TSC;
@@ -1447,6 +1458,7 @@ static int kvm_get_msrs(X86CPU *cpu)
             break;
         case MSR_IA32_FEATURE_CONTROL:
             env->msr_ia32_feature_control = msrs[i].data;
+            break;
         default:
             if (msrs[i].index >= MSR_MC0_CTL &&
                 msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 4) {
--
1.7.1

0001-qemu-kvm-bugfix-for-IA32_FEATURE_CONTROL.patch
Description: 0001-qemu-kvm-bugfix-for-IA32_FEATURE_CONTROL.patch
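The gating idea in this patch can be sketched outside QEMU as follows. This is a simplified illustration, not the patch itself: `guest_has_feature_control` and `build_msr_list` are hypothetical helpers, and the list holds only two MSRs. The bit positions (VMX is CPUID.1:ECX bit 5, SMX is bit 6) and the MSR indices are the architectural values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CPUID_EXT_VMX            (1u << 5)   /* CPUID.1:ECX bit 5 */
#define CPUID_EXT_SMX            (1u << 6)   /* CPUID.1:ECX bit 6 */
#define MSR_IA32_FEATURE_CONTROL 0x3a
#define MSR_IA32_MISC_ENABLE     0x1a0

/* The MSR is only meaningful when the guest CPUID exposes VMX or SMX. */
static bool guest_has_feature_control(uint32_t cpuid_1_ecx)
{
    return (cpuid_1_ecx & (CPUID_EXT_VMX | CPUID_EXT_SMX)) != 0;
}

/*
 * Hypothetical helper: fill 'out' with the MSR indices to synchronise and
 * return how many were written. IA32_FEATURE_CONTROL is added only when
 * the guest CPUID makes it relevant.
 */
static size_t build_msr_list(uint32_t cpuid_1_ecx, uint32_t *out, size_t cap)
{
    size_t n = 0;

    if (n < cap) {
        out[n++] = MSR_IA32_MISC_ENABLE;
    }
    if (guest_has_feature_control(cpuid_1_ecx) && n < cap) {
        out[n++] = MSR_IA32_FEATURE_CONTROL;
    }
    return n;
}
```

With neither bit set, the list never names IA32_FEATURE_CONTROL, so a get or put of the list cannot fail on that MSR.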
qemu-kvm log
Hi,

I've recently been debugging a qemu-kvm issue. I added some print code like 'fprintf(stderr, ...)', but I don't see any of its output on stdio. Can anyone tell me where the qemu-kvm log file is, or what I need to do to capture my fprintf output?

Thanks,
Jinsong
RE: [Qemu-devel] qemu-kvm log
Avi Kivity wrote:
On 09/10/2012 01:44 PM, Liu, Jinsong wrote:
Hi, I've recently been debugging a qemu-kvm issue. I added some print code like 'fprintf(stderr, ...)', but I don't see any of its output on stdio. Can anyone tell me where the qemu-kvm log file is, or what I need to do to capture my fprintf output?

If you're running via libvirt, the log is in /var/log/libvirt/qemu. If you're running from the command line, printf()s should end up on your terminal.

Thanks! I run qemu-kvm from the command line, but because SDL could not be initialized, I run the qemu command via VNC, like this:

1). test kvm-unit-tests:
qemu-system-x86_64 -device testdev,chardev=testlog -chardev file,id=testlog,path=apic.out -serial stdio -kernel ./x86/apic.flat -cpu host

2). test kvm guest:
qemu-system-x86_64 -smp 2 -m 512 -hda test.qcow.img -cpu host

For case 1, some printf output from kvm-unit-tests appears on the terminal, but no qemu-kvm printf output. For case 2, no qemu-kvm printf output appears on the terminal.

Best Regards,
Jinsong
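For this kind of ad-hoc debugging, one option is a small helper that writes each message to stderr and also appends it to a file, so the output survives even when the terminal is not where stderr ends up. The sketch below is generic C, not part of QEMU; `debug_log` and the path argument are hypothetical, and it does not explain why the output went missing in the setup described above.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical debug helper: print a formatted message to stderr and, if
 * 'path' is non-NULL, append the same message to that file.
 * Returns 0 on success, -1 if the file could not be opened.
 */
static int debug_log(const char *path, const char *fmt, ...)
{
    va_list ap;
    FILE *f;

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fflush(stderr);

    if (path == NULL) {
        return 0;
    }

    f = fopen(path, "a");       /* append, so earlier messages are kept */
    if (f == NULL) {
        return -1;
    }
    va_start(ap, fmt);
    vfprintf(f, fmt, ap);
    va_end(ap);
    fclose(f);                  /* closing flushes the message to the file */
    return 0;
}
```

A call such as `debug_log("/tmp/qemu-debug.log", "put_msrs: n=%d\n", n)` (path chosen arbitrarily) leaves a trace that can be inspected after the run.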
RE: [PATCH] KVM: tsc deadline timer works only when hrtimer high resolution configured
Avi Kivity wrote:
On 09/07/2012 03:07 PM, Liu, Jinsong wrote:
Avi Kivity wrote:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 148ed66..0e64997 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2210,7 +2210,11 @@ int kvm_dev_ioctl_check_extension(long ext)
 		r = kvm_has_tsc_control;
 		break;
 	case KVM_CAP_TSC_DEADLINE_TIMER:
+#ifdef CONFIG_HIGH_RES_TIMERS
 		r = boot_cpu_has(X86_FEATURE_TSC_DEADLINE_TIMER);
+#else
+		r = 0;
+#endif
 		break;

I prefer a patch making kvm for x86 depend on hrtimers. kvm already provides a high resolution timer to the guest in the local apic, backing it with the jiffies event source will likely cause some guests to malfunction.

Yep, I did a draft test of the kvm lapic timer; it also fails when CONFIG_HIGH_RES_TIMERS is disabled. Attached is the updated patch.

Thanks,
Jinsong

From 64d0458ec50a7d6917adf1e9735ba6e6ae6024ad Mon Sep 17 00:00:00 2001
From: Liu, Jinsong jinsong@intel.com
Date: Sat, 8 Sep 2012 03:32:31 +0800
Subject: [PATCH] KVM: select HIGH_RES_TIMERS when KVM enabled

This is for 2 reasons:
1. it's pointless for kvm lapic timer and tsc deadline timer when kernel hrtimer not configured as high resolution, since that would be not accurate based on wheel;
2. kvm lapic timer and tsc deadline timer based on hrtimer, setting a leftmost node to rb tree and then do hrtimer reprogram. If hrtimer not configured as high resolution, hrtimer_enqueue_reprogram do nothing and then make kvm lapic timer and tsc deadline timer fail.

Signed-off-by: Liu, Jinsong jinsong@intel.com
---
 arch/x86/kvm/Kconfig |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index a28f338..5f861ca 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -24,6 +24,8 @@ config KVM
 	depends on PCI
 	# for TASKSTATS/TASK_DELAY_ACCT:
 	depends on NET
+	# for HIGH_RES_TIMERS
+	depends on !ARCH_USES_GETTIMEOFFSET
 	select PREEMPT_NOTIFIERS
 	select MMU_NOTIFIER
 	select ANON_INODES
@@ -37,6 +39,8 @@ config KVM
 	select TASK_DELAY_ACCT
 	select PERF_EVENTS
 	select HAVE_KVM_MSI
+	select GENERIC_CLOCKEVENTS
+	select HIGH_RES_TIMERS

hrtimers is an intrusive feature, I don't think we should force-enable it. Please change it to a depends on.

Hmm, if it is changed to

config KVM
	depends on HIGH_RES_TIMERS

then the item 'Kernel-based Virtual Machine (KVM) support (NEW)' doesn't even appear to the user in make menuconfig (when HIGH_RES_TIMERS is disabled). Is that good? I just have a little concern here :)

Thanks,
Jinsong
RE: [PATCH] KVM: tsc deadline timer works only when hrtimer high resolution configured
Avi Kivity wrote:
On 09/09/2012 05:54 PM, Liu, Jinsong wrote:
hrtimers is an intrusive feature, I don't think we should force-enable it. Please change it to a depends on.

Hmm, if it is changed to

config KVM
	depends on HIGH_RES_TIMERS

then the item 'Kernel-based Virtual Machine (KVM) support (NEW)' doesn't even appear to the user in make menuconfig (when HIGH_RES_TIMERS is disabled). Is that good? I just have a little concern here :)

It's not good, but that's what we have. It's okay to force-enable low-impact features (like preempt notifiers). hrtimers, on the other hand, change kernel behaviour quite deeply.

Maybe over time someone will fix the config tools to unhide features that can be enabled by turning on a dependency.

OK, updated as attached.

Thanks,
Jinsong

===
From e6c2a80d3111cc6fb992d78b242619706d99bc6b Mon Sep 17 00:00:00 2001
From: Liu, Jinsong jinsong@intel.com
Date: Mon, 10 Sep 2012 06:55:39 +0800
Subject: [PATCH] KVM: KVM enable depends on HIGH_RES_TIMERS

KVM lapic timer and tsc deadline timer based on hrtimer, setting a leftmost node to rb tree and then do hrtimer reprogram. If hrtimer not configured as high resolution, hrtimer_enqueue_reprogram do nothing and then make kvm lapic timer and tsc deadline timer fail.

Signed-off-by: Liu, Jinsong jinsong@intel.com
---
 arch/x86/kvm/Kconfig |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index a28f338..65657ec 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -20,6 +20,7 @@ if VIRTUALIZATION
 config KVM
 	tristate Kernel-based Virtual Machine (KVM) support
 	depends on HAVE_KVM
+	depends on HIGH_RES_TIMERS
 	# for device assignment:
 	depends on PCI
 	# for TASKSTATS/TASK_DELAY_ACCT:
--
1.7.1

0001-KVM-KVM-enable-depends-on-HIGH_RES_TIMERS.patch
Description: 0001-KVM-KVM-enable-depends-on-HIGH_RES_TIMERS.patch
RE: [PATCH] KVM: tsc deadline timer works only when hrtimer high resolution configured
Avi Kivity wrote:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 148ed66..0e64997 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2210,7 +2210,11 @@ int kvm_dev_ioctl_check_extension(long ext)
 		r = kvm_has_tsc_control;
 		break;
 	case KVM_CAP_TSC_DEADLINE_TIMER:
+#ifdef CONFIG_HIGH_RES_TIMERS
 		r = boot_cpu_has(X86_FEATURE_TSC_DEADLINE_TIMER);
+#else
+		r = 0;
+#endif
 		break;

I prefer a patch making kvm for x86 depend on hrtimers. kvm already provides a high resolution timer to the guest in the local apic, backing it with the jiffies event source will likely cause some guests to malfunction.

Yep, I did a draft test of the kvm lapic timer; it also fails when CONFIG_HIGH_RES_TIMERS is disabled. Attached is the updated patch.

Thanks,
Jinsong

From 64d0458ec50a7d6917adf1e9735ba6e6ae6024ad Mon Sep 17 00:00:00 2001
From: Liu, Jinsong jinsong@intel.com
Date: Sat, 8 Sep 2012 03:32:31 +0800
Subject: [PATCH] KVM: select HIGH_RES_TIMERS when KVM enabled

This is for 2 reasons:
1. it's pointless for kvm lapic timer and tsc deadline timer when kernel hrtimer not configured as high resolution, since that would be not accurate based on wheel;
2. kvm lapic timer and tsc deadline timer based on hrtimer, setting a leftmost node to rb tree and then do hrtimer reprogram. If hrtimer not configured as high resolution, hrtimer_enqueue_reprogram do nothing and then make kvm lapic timer and tsc deadline timer fail.

Signed-off-by: Liu, Jinsong jinsong@intel.com
---
 arch/x86/kvm/Kconfig |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index a28f338..5f861ca 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -24,6 +24,8 @@ config KVM
 	depends on PCI
 	# for TASKSTATS/TASK_DELAY_ACCT:
 	depends on NET
+	# for HIGH_RES_TIMERS
+	depends on !ARCH_USES_GETTIMEOFFSET
 	select PREEMPT_NOTIFIERS
 	select MMU_NOTIFIER
 	select ANON_INODES
@@ -37,6 +39,8 @@ config KVM
 	select TASK_DELAY_ACCT
 	select PERF_EVENTS
 	select HAVE_KVM_MSI
+	select GENERIC_CLOCKEVENTS
+	select HIGH_RES_TIMERS
 	---help---
 	  Support hosting fully virtualized guest machines using hardware
 	  virtualization extensions. You will need a fairly recent
--
1.7.1

0001-KVM-select-HIGH_RES_TIMERS-when-KVM-enabled.patch
Description: 0001-KVM-select-HIGH_RES_TIMERS-when-KVM-enabled.patch
[PATCH] KVM: tsc deadline timer works only when hrtimer high resolution configured
From 728a17e2de591b557c3c8ba31076b4bf2ca5ab42 Mon Sep 17 00:00:00 2001
From: Liu, Jinsong jinsong@intel.com
Date: Wed, 5 Sep 2012 03:18:15 +0800
Subject: [PATCH] KVM: tsc deadline timer works only when hrtimer high resolution configured

This is for 2 reasons:
1. it's pointless to enable tsc deadline timer to guest when kernel hrtimer not configured as high resolution, since that would be un-precise based on wheel;
2. tsc deadline timer based on hrtimer, setting a leftmost node to rb tree and then do hrtimer reprogram. If hrtimer not configured as high resolution, hrtimer_enqueue_reprogram do nothing and would make tsc deadline timer fail.

Signed-off-by: Liu, Jinsong jinsong@intel.com
---
 arch/x86/kvm/x86.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 148ed66..0e64997 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2210,7 +2210,11 @@ int kvm_dev_ioctl_check_extension(long ext)
 		r = kvm_has_tsc_control;
 		break;
 	case KVM_CAP_TSC_DEADLINE_TIMER:
+#ifdef CONFIG_HIGH_RES_TIMERS
 		r = boot_cpu_has(X86_FEATURE_TSC_DEADLINE_TIMER);
+#else
+		r = 0;
+#endif
 		break;
 	default:
 		r = 0;
--
1.7.1

0001-KVM-tsc-deadline-timer-works-only-when-hrtimer-high-.patch
Description: 0001-KVM-tsc-deadline-timer-works-only-when-hrtimer-high-.patch
[RFC PATCH] Expose tsc deadline timer feature to guest
Eduardo, Jan, Andreas

As we agreed 3 months ago, I waited until qemu1.1 was done before re-writing the patch on top of it. Now it's time to do that.

Attached is an RFC patch for exposing the tsc deadline timer to the guest. I have checked the current qemu1.1 code and read some of the recent emails about cpuid exposing. However, I may have overlooked something (so many discussions :-), so if you think anything is wrong, please point it out to me.

Thanks,
Jinsong

From 8b5b003f6f8834d2d5d71e18bb47b7f089bc4928 Mon Sep 17 00:00:00 2001
From: Liu, Jinsong jinsong@intel.com
Date: Tue, 3 Jul 2012 02:35:10 +0800
Subject: [PATCH] Expose tsc deadline timer feature to guest

This patch exposes tsc deadline timer feature to guest if
1). in-kernel irqchip is used, and
2). kvm has emulated tsc deadline timer, and
3). user authorize the feature exposing via -cpu or +/- tsc-deadline

Signed-off-by: Liu, Jinsong jinsong@intel.com
---
 target-i386/cpu.h |    1 +
 target-i386/kvm.c |    5 +++++
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 79cc640..d1a4a04 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -400,6 +400,7 @@
 #define CPUID_EXT_X2APIC   (1 << 21)
 #define CPUID_EXT_MOVBE    (1 << 22)
 #define CPUID_EXT_POPCNT   (1 << 23)
+#define CPUID_EXT_TSC_DEADLINE_TIMER (1 << 24)
 #define CPUID_EXT_XSAVE    (1 << 26)
 #define CPUID_EXT_OSXSAVE  (1 << 27)
 #define CPUID_EXT_HYPERVISOR  (1 << 31)
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 0d0d8f6..52b577f 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -361,8 +361,13 @@ int kvm_arch_init_vcpu(CPUX86State *env)
     env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);

     i = env->cpuid_ext_features & CPUID_EXT_HYPERVISOR;
+    j = env->cpuid_ext_features & CPUID_EXT_TSC_DEADLINE_TIMER;
     env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX);
     env->cpuid_ext_features |= i;
+    if (j && kvm_irqchip_in_kernel() &&
+        kvm_check_extension(s, KVM_CAP_TSC_DEADLINE_TIMER)) {
+        env->cpuid_ext_features |= CPUID_EXT_TSC_DEADLINE_TIMER;
+    }

     env->cpuid_ext2_features &= kvm_arch_get_supported_cpuid(s, 0x80000001,
                                                              0, R_EDX);
--
1.7.1

0001-Expose-tsc-deadline-timer-feature-to-guest.patch
Description: 0001-Expose-tsc-deadline-timer-feature-to-guest.patch
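The three conditions in the commit message can be sketched as a pure function. This is a simplified model, not the QEMU code: `filter_ext_features` is a hypothetical name, and the two booleans stand in for the results of `kvm_irqchip_in_kernel()` and the `KVM_CAP_TSC_DEADLINE_TIMER` check.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_EXT_TSC_DEADLINE_TIMER (1u << 24)  /* CPUID.1:ECX bit 24 */
#define CPUID_EXT_HYPERVISOR         (1u << 31)  /* CPUID.1:ECX bit 31 */

/*
 * Hypothetical model of the filtering step: start from the features the
 * user requested, keep only what KVM reports as supported, then re-add
 * the bits that are decided separately.
 */
static uint32_t filter_ext_features(uint32_t requested,
                                    uint32_t kvm_supported,
                                    bool irqchip_in_kernel,
                                    bool kvm_has_tsc_deadline_cap)
{
    uint32_t keep_hypervisor = requested & CPUID_EXT_HYPERVISOR;
    uint32_t want_tsc_deadline = requested & CPUID_EXT_TSC_DEADLINE_TIMER;
    uint32_t result = requested & kvm_supported;

    result |= keep_hypervisor;

    /* All three conditions must hold: user asked, irqchip in kernel, KVM cap. */
    if (want_tsc_deadline && irqchip_in_kernel && kvm_has_tsc_deadline_cap) {
        result |= CPUID_EXT_TSC_DEADLINE_TIMER;
    }
    return result;
}
```

The bit is never added on the guest's behalf: if the user did not request it, it stays clear even when both KVM-side conditions hold.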
RE: [Qemu-devel] [PATCH 2/2] Expose tsc deadline timer cpuid to guest
Eduardo, Jan I will update tsc deadline timer patch (at qemu-kvm side) recently. Have you made a final agreement of the issue 'KVM_CAP_TSC_DEADLINE_TIMER' vs. 'GET_SUPPORTED_CPUID'? Thanks, Jinsong Eduardo Habkost wrote: (CCing Andre Przywara, in case he can help to clarify what's the expected meaning of -cpu host) On Tue, Apr 24, 2012 at 06:06:55PM +0200, Jan Kiszka wrote: On 2012-04-23 22:02, Eduardo Habkost wrote: On Mon, Apr 23, 2012 at 06:31:25PM +0200, Jan Kiszka wrote: However, that was how I interpreted this GET_SUPPORTED_CPUID. In fact, it is used as kernel or hardware does not _prevent_ already. And in that sense, it's ok to enable even features that are not in kernel/hardware hands. We should point out this fact in the documentation. I see GET_SUPPORTED_CPUID as just a what userspace can enable because the kernel and the hardware support it (= don't prevent it), as long as userspace has the required support (meaning A+B). It's a bit like KVM_CHECK_EXTENSION, but with the nice feature that that the capabilities map directly to CPUID bits. So, it's not clear to me: now you are OK with adding TSC_DEADLINE to GET_SUPPORTED_CPUID? But we still have the issue of -cpu host not knowing what can be safely enabled (without userspace feature-specific setup code), or not. Do you have any suggestion for that? Avi, do you have any suggestion? First of all, I bet this was already broken with the introduction of x2apic. So TSC deadline won't make it worse. I guess we need to address this in userspace, first by masking those features out, later by actually emulating them. I am not sure I understand what you are proposing. Let me explain the use case I am thinking about: - Feature FOO is of type (A) (e.g. just a new instruction set that doesn't require additional userspace support) - User has a Qemu vesion that doesn't know anything about feature FOO - User gets a new CPU that supports feature FOO - User gets a new kernel that supports feature FOO (i.e. 
has FOO in GET_SUPPORTED_CPUID) - User does _not_ upgrade Qemu. - User expects to get feature FOO enabled if using -cpu host, without upgrading Qemu. The problem here is: to support the above use-case, userspace need a probing mechanism that can differentiate _new_ (previously unknown) features that are in group (A) (safe to blindly enable) from features that are in group (B) (that can't be enabled without an userspace upgrade). In short, it becomes a problem if we consider the following case: - Feature BAR is of type (B) (it can't be enabled without extra userspace support) - User has a Qemu version that doesn't know anything about feature BAR - User gets a new CPU that supports feature BAR - User gets a new kernel that supports feature BAR (i.e. has BAR in GET_SUPPORTED_CPUID) - User does _not_ upgrade Qemu. - User simply shouldn't get feature BAR enabled, even if using -cpu host, otherwise Qemu would break. If userspace always limited itself to features it knows about, it would be really easy to implement the feature without any new probing mechanism from the kernel. But that's not how I think users expect -cpu host to work. Maybe I am wrong, I don't know. I am CCing Andre, who introduced the -cpu host feature, in case he can explain what's the expected semantics on the cases above. And I still don't know the answer to: - How to precisely define the groups (A) and (B)? - requires additional code only if migration is required qualifies as (B) or (A)? Re: documentation, isn't the following paragraph (already present on api.txt) sufficient? The entries returned are the host cpuid as returned by the cpuid instruction, with unknown or unsupported features masked out. Some features (for example, x2apic), may not be present in the host cpu, but are exposed by kvm if it can emulate them efficiently. That suggests such features are always emulated - which is not true. They are either emulated, or nothing _prevents_ their emulation by user space. Well... 
it's a bit more complicated than that: the current semantics are a bit more than doesn't prevent, as in theory every single feature can be emulated by userspace, without any help from the kernel. So, if doesn't prevent were the only criteria, the kernel would set every single feature bit on GET_SUPPORTED_CPUID, making it not very useful. At least in the case of x2apic, the kernel is using GET_SUPPORTED_CPUID to expose a _capability_ too: when x2apic is present on GET_SUPPORTED_CPUID, userspace knows that in addition to not preventing the feature from being enabled, the kernel is now able to emulate x2apic (if proper setup is made by userspace). A kernel that can't emulate x2apic (even if userspace was allowed to emulate it completely in userspace) would never have x2apic enabled on GET_SUPPORTED_CPUID. Like I said previously, in the end GET_SUPPORTED_CPUID is
RE: [Qemu-devel] [PATCH 2/2] Expose tsc deadline timer cpuid to guest
Eduardo Habkost wrote: On Thu, Jun 14, 2012 at 07:02:03PM +, Liu, Jinsong wrote: Eduardo, Jan I will update tsc deadline timer patch (at qemu-kvm side) recently. Have you made a final agreement of the issue 'KVM_CAP_TSC_DEADLINE_TIMER' vs. 'GET_SUPPORTED_CPUID'? I don't think there's a final agreement, but I was convinced later that it's probably better to _not_ have TSC-deadline on GET_SUPPORTED_CPUID, at least not by default. Even if this is changed in the future, it's a good idea to make qemu support the KVM_CAP_TSC_DEADLINE_TIMER method if running on older kernels, anyway. OK, so I will coding based on current KVM_CAP_TSC_DEADLINE_TIMER method. Thanks for clarifying! Eduardo Habkost wrote: (CCing Andre Przywara, in case he can help to clarify what's the expected meaning of -cpu host) On Tue, Apr 24, 2012 at 06:06:55PM +0200, Jan Kiszka wrote: On 2012-04-23 22:02, Eduardo Habkost wrote: On Mon, Apr 23, 2012 at 06:31:25PM +0200, Jan Kiszka wrote: However, that was how I interpreted this GET_SUPPORTED_CPUID. In fact, it is used as kernel or hardware does not _prevent_ already. And in that sense, it's ok to enable even features that are not in kernel/hardware hands. We should point out this fact in the documentation. I see GET_SUPPORTED_CPUID as just a what userspace can enable because the kernel and the hardware support it (= don't prevent it), as long as userspace has the required support (meaning A+B). It's a bit like KVM_CHECK_EXTENSION, but with the nice feature that that the capabilities map directly to CPUID bits. So, it's not clear to me: now you are OK with adding TSC_DEADLINE to GET_SUPPORTED_CPUID? But we still have the issue of -cpu host not knowing what can be safely enabled (without userspace feature-specific setup code), or not. Do you have any suggestion for that? Avi, do you have any suggestion? First of all, I bet this was already broken with the introduction of x2apic. So TSC deadline won't make it worse. 
I guess we need to address this in userspace, first by masking those features out, later by actually emulating them. I am not sure I understand what you are proposing. Let me explain the use case I am thinking about: - Feature FOO is of type (A) (e.g. just a new instruction set that doesn't require additional userspace support) - User has a Qemu vesion that doesn't know anything about feature FOO - User gets a new CPU that supports feature FOO - User gets a new kernel that supports feature FOO (i.e. has FOO in GET_SUPPORTED_CPUID) - User does _not_ upgrade Qemu. - User expects to get feature FOO enabled if using -cpu host, without upgrading Qemu. The problem here is: to support the above use-case, userspace need a probing mechanism that can differentiate _new_ (previously unknown) features that are in group (A) (safe to blindly enable) from features that are in group (B) (that can't be enabled without an userspace upgrade). In short, it becomes a problem if we consider the following case: - Feature BAR is of type (B) (it can't be enabled without extra userspace support) - User has a Qemu version that doesn't know anything about feature BAR - User gets a new CPU that supports feature BAR - User gets a new kernel that supports feature BAR (i.e. has BAR in GET_SUPPORTED_CPUID) - User does _not_ upgrade Qemu. - User simply shouldn't get feature BAR enabled, even if using -cpu host, otherwise Qemu would break. If userspace always limited itself to features it knows about, it would be really easy to implement the feature without any new probing mechanism from the kernel. But that's not how I think users expect -cpu host to work. Maybe I am wrong, I don't know. I am CCing Andre, who introduced the -cpu host feature, in case he can explain what's the expected semantics on the cases above. And I still don't know the answer to: - How to precisely define the groups (A) and (B)? - requires additional code only if migration is required qualifies as (B) or (A)? 
Re: documentation, isn't the following paragraph (already present on api.txt) sufficient? The entries returned are the host cpuid as returned by the cpuid instruction, with unknown or unsupported features masked out. Some features (for example, x2apic), may not be present in the host cpu, but are exposed by kvm if it can emulate them efficiently. That suggests such features are always emulated - which is not true. They are either emulated, or nothing _prevents_ their emulation by user space. Well... it's a bit more complicated than that: the current semantics are a bit more than doesn't prevent, as in theory every single feature can be emulated by userspace, without any help from the kernel. So, if doesn't prevent were the only criteria, the kernel would set every single feature bit on GET_SUPPORTED_CPUID, making it not very useful. At least in the case of x2apic
RE: [Qemu-devel] [PATCH 2/2] Expose tsc deadline timer cpuid to guest
Eduardo Habkost wrote: On Fri, Mar 09, 2012 at 09:52:29PM +0100, Jan Kiszka wrote: On 2012-03-09 20:09, Liu, Jinsong wrote: Jan Kiszka wrote: On 2012-03-09 19:27, Liu, Jinsong wrote: Jan Kiszka wrote: On 2012-03-06 08:49, Liu, Jinsong wrote: Jan, Any comments? I feel some confused about your point 'disable cpuid feature for older machine types by default': are you planning a common approach for this common issue, or, you just ask me a specific solution for the tsc deadline timer case? I think a generic solution for this can be as simple as passing a feature exclusion mask to cpu_init. You could simple append a string of -feature1,-feature2 to the cpu model that is specified on creation. And that string could be defined in the compat machine descriptions. Does this make sense? Jan, to prevent misunderstanding, I elaborate my understanding of your points below (if any misunderstanding please point out to me): = Your target is, to migrate from A(old qemu) to B(new qemu) by 1. at A: qemu-version-A [-cpu whatever] // currently the default machine type is pc-A 2. at B: qemu-version-B -machine pc-A [-cpu whatever] -feature1 -feature2 B run new qemu-version-B (w/ new features 'feature1' and 'feature2'), but when B runs w/ compat '-machine pc-A', vm should not see 'feature1' and 'feature2', so commandline append string to cpu model '-cpu whatever -feature1 -feature2' to hidden new feature1 and feature2 to vm, hence vm can see same cpuid features (at B) as those at A (which means, no feature1, no feature2) = If my understanding of your thoughts is right, I think currently qemu has satisfied your target, code refer to pc_cpus_init(cpu_model) .. cpu_init(cpu_model) -- cpu_x86_register(*env, cpu_model) -- cpu_x86_find_by_name(*def, cpu_model) // parse '+/- features', generate feature masks plus_features... // and minus_features...(this is feature exclusion masks you want) I think your point 'define in the compat machine description' is unnecessary. 
The user would have to specify the new feature as exclusions *manually* on the command line if -machine pc-A doesn't inject them *automatically*. So it is necessary to enhance qemu in this regard. ... You suggest 'append a string of -feature1,-feature2 to the cpu model that is specified on creation' at your last email. Could you tell me other way user exclude features? I only know qemu command line :-( I was thinking of something like diff --git a/hw/boards.h b/hw/boards.h index 667177d..2bae071 100644 --- a/hw/boards.h +++ b/hw/boards.h @@ -28,6 +28,7 @@ typedef struct QEMUMachine { int is_default; const char *default_machine_opts; GlobalProperty *compat_props; +const char *compat_cpu_features; struct QEMUMachine *next; } QEMUMachine; diff --git a/hw/pc.c b/hw/pc.c index bb9867b..4d11559 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -949,8 +949,9 @@ static CPUState *pc_new_cpu(const char *cpu_model) return env; } -void pc_cpus_init(const char *cpu_model) +void pc_cpus_init(const char *cpu_model, const char *append_features) { +char *model_and_features; int i; /* init CPUs */ @@ -961,10 +962,13 @@ void pc_cpus_init(const char *cpu_model) cpu_model = qemu32; #endif } +model_and_features = g_strconcat(cpu_model, ,, append_features, NULL); for(i = 0; i smp_cpus; i++) { -pc_new_cpu(cpu_model); +pc_new_cpu(model_and_features); } + +g_free(model_and_features); } void pc_memory_init(MemoryRegion *system_memory, However, getting machine.compat_cpu_features to pc_cpus_init is rather ugly. And we will have CPU devices with real properties soon. Then the compat feature string could be passed that way, without changing any machine init function. What if one cpudef had the wrong flags set but another cpudef was correct, and we had to fix it on Qemu 1.1 for only one model? What if the user _really_ wanted to edit the config file to add or remove a given flag? I think the best approach would be: - Having versioned CPU model names; - Specifying per-machine-type aliases. 
See also the "[libvirt] Modern CPU models cannot be used with libvirt" thread for related discussion.

The config file would look like:

[cpudef]
  name = Westmere-1.0
  features = [...] # no tsc-deadline
  [...]

[cpudef]
  name = Westmere-1.1
  # so we don't have to copy everything from Westmere-1.0 manually:
  base_cpudef = Westmere-1.0
  # we could simply copy & extend:
  features = [...] tsc-deadline
  # or, even better, if we had an append mechanism. e.g.:
  #features_append = tsc-deadline

Then, on the machine-type table:

- Machine-types pc-1.0 and older would have:
  .cpudef_aliases = { {Westmere, Westmere-1.0}, [...] }
- Machine-type pc-1.1 would have:
  .cpudef_aliases = { {Westmere, Westmere-1.1
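The per-machine-type alias idea proposed above could be modelled roughly as below. This is only an illustration of the proposal, not existing QEMU code: the struct names, the table contents, and `resolve_cpu_model` are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias entry: unversioned name -> versioned cpudef name. */
struct cpudef_alias {
    const char *alias;
    const char *model;
};

/* Hypothetical machine type carrying its own alias table. */
struct machine_type {
    const char *name;
    const struct cpudef_alias *aliases;
    size_t n_aliases;
};

static const struct cpudef_alias pc_1_0_aliases[] = {
    { "Westmere", "Westmere-1.0" },     /* no tsc-deadline */
};

static const struct cpudef_alias pc_1_1_aliases[] = {
    { "Westmere", "Westmere-1.1" },     /* adds tsc-deadline */
};

static const struct machine_type machine_types[] = {
    { "pc-1.0", pc_1_0_aliases, 1 },
    { "pc-1.1", pc_1_1_aliases, 1 },
};

/*
 * Resolve a CPU model name for a given machine type. Names without an
 * alias (including already-versioned names) are returned unchanged.
 */
static const char *resolve_cpu_model(const char *machine, const char *cpu_model)
{
    size_t m, a;

    for (m = 0; m < sizeof(machine_types) / sizeof(machine_types[0]); m++) {
        if (strcmp(machine_types[m].name, machine) != 0) {
            continue;
        }
        for (a = 0; a < machine_types[m].n_aliases; a++) {
            if (strcmp(machine_types[m].aliases[a].alias, cpu_model) == 0) {
                return machine_types[m].aliases[a].model;
            }
        }
    }
    return cpu_model;
}
```

Under this model, `-cpu Westmere` keeps meaning the old feature set on an old machine type, while a newer machine type picks up the versioned definition that includes tsc-deadline.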
RE: [Qemu-devel] [PATCH 2/2] Expose tsc deadline timer cpuid to guest
Eduardo Habkost wrote:
On Fri, Mar 23, 2012 at 03:49:27AM +0000, Liu, Jinsong wrote:
Eduardo Habkost wrote:

[1] From Documentation/virtual/kvm/api.txt:

KVM_GET_SUPPORTED_CPUID [...] This ioctl returns x86 cpuid features which are supported by both the hardware and kvm. Userspace can use the information returned by this ioctl to construct cpuid information (for KVM_SET_CPUID2) that is consistent with hardware, kernel, and userspace capabilities, and with user requirements (for example, the user may wish to constrain cpuid to emulate older hardware, or for feature consistency across a cluster).

The fixbug patch is implemented by Jan and Avi, I reply per my understanding.

No problem. I hope Jan or Avi can clarify this.

I think for tsc deadline timer feature, KVM_CAP_TSC_DEADLINE_TIMER is slightly better than KVM_GET_SUPPORTED_CPUID. If use KVM_GET_SUPPORTED_CPUID, it means tsc deadline features bind to host cpuid, while in fact it could be pure software emulated by kvm (though currently it implemented as bound to hardware). For the sake of extension, it choose KVM_CAP_TSC_DEADLINE_TIMER.

There's no requirement for GET_SUPPORTED_CPUID to be a subset of the host CPU features. If KVM can completely emulate the feature by software, then it can return the feature on GET_SUPPORTED_CPUID even if the host CPU doesn't have the feature. That's the case for x2apic, for example (see commit 0d1de2d901f4ba0972a3886496a44fb1d3300dbd).

Jan/Avi,

Could you elaborate more your thought?

Thanks,
Jinsong
RE: [Qemu-devel] [PATCH 2/2] Expose tsc deadline timer cpuid to guest
Eduardo Habkost wrote:
On Tue, Mar 20, 2012 at 12:53:57PM +0000, Liu, Jinsong wrote:
Rik van Riel wrote:
On 03/09/2012 01:27 PM, Liu, Jinsong wrote:

As for 'tsc deadline' feature exposing, my patch (as attached) just obey qemu general cpuid exposing method, and also satisfied your target I think.

One question. Why is TSC_DEADLINE not exposed in the cpuid allowed feature bits in do_cpuid_ent() in arch/x86/kvm/x86.c ?

    /* cpuid 1.ecx */
    const u32 kvm_supported_word4_x86_features =
        F(XMM3) | F(PCLMULQDQ) | 0 /* DTES64, MONITOR */ |
        0 /* DS-CPL, VMX, SMX, EST */ |
        0 /* TM2 */ | F(SSSE3) | 0 /* CNXT-ID */ | 0 /* Reserved */ |
        F(FMA) | F(CX16) | 0 /* xTPR Update, PDCM */ |
        0 /* Reserved, DCA */ | F(XMM4_1) |
        F(XMM4_2) | F(X2APIC) | F(MOVBE) | F(POPCNT) |
        0 /* Reserved*/ | F(AES) | F(XSAVE) | 0 /* OSXSAVE */ | F(AVX) |
        F(F16C) | F(RDRAND);

Would it make sense to expose F(TSC_DEADLINE) above? Or is there something truly special about tsc deadline that means it should be different from everything else?

Because the feature depends on KVM_CREATE_IRQCHIP, which we cannot guarantee will be called, we expose it via a KVM_CAP_TSC_DEADLINE_TIMER and not KVM_GET_SUPPORTED_CPUID.

We have many other features that depend on proper support from userspace otherwise they wouldn't work, but are listed on GET_SUPPORTED_CPUID, don't we? Why is TSC-deadline special? GET_SUPPORTED_CPUID just means KVM supports it as long as userspace supports it too and enables it, it doesn't mean CPUID bit that will be enabled by default[1].

Refer changeset 4d25a066b69fb749a39d0d4c610689dd765a0b0e.

That changeset was necessary because the kernel was doing this on update_cpu

    if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
        best->function == 0x1) {
        best->ecx |= bit(X86_FEATURE_TSC_DEADLINE_TIMER);

And that was really wrong, because it enabled the bit unconditionally.
But I don't understand why KVM_CAP_TSC_DEADLINE_TIMER was created if we already have KVM_GET_SUPPORTED_CPUID to tell userspace which bits are supported by KVM.

Yes, exactly. That's why we need this patch.

[1] From Documentation/virtual/kvm/api.txt:

KVM_GET_SUPPORTED_CPUID [...] This ioctl returns x86 cpuid features which are supported by both the hardware and kvm. Userspace can use the information returned by this ioctl to construct cpuid information (for KVM_SET_CPUID2) that is consistent with hardware, kernel, and userspace capabilities, and with user requirements (for example, the user may wish to constrain cpuid to emulate older hardware, or for feature consistency across a cluster).

The fixbug patch is implemented by Jan and Avi, I reply per my understanding.

I think for tsc deadline timer feature, KVM_CAP_TSC_DEADLINE_TIMER is slightly better than KVM_GET_SUPPORTED_CPUID. If use KVM_GET_SUPPORTED_CPUID, it means tsc deadline features bind to host cpuid, while in fact it could be pure software emulated by kvm (though currently it implemented as bound to hardware). For the sake of extension, it choose KVM_CAP_TSC_DEADLINE_TIMER.

Thanks,
Jinsong
RE: [Qemu-devel] [PATCH 2/2] Expose tsc deadline timer cpuid to guest
Jan Kiszka wrote: On 2012-03-09 20:09, Liu, Jinsong wrote: Jan Kiszka wrote: On 2012-03-09 19:27, Liu, Jinsong wrote: Jan Kiszka wrote: On 2012-03-06 08:49, Liu, Jinsong wrote: Jan, Any comments? I feel some confused about your point 'disable cpuid feature for older machine types by default': are you planning a common approach for this common issue, or, you just ask me a specific solution for the tsc deadline timer case? I think a generic solution for this can be as simple as passing a feature exclusion mask to cpu_init. You could simple append a string of -feature1,-feature2 to the cpu model that is specified on creation. And that string could be defined in the compat machine descriptions. Does this make sense? Jan, to prevent misunderstanding, I elaborate my understanding of your points below (if any misunderstanding please point out to me): = Your target is, to migrate from A(old qemu) to B(new qemu) by 1. at A: qemu-version-A [-cpu whatever] // currently the default machine type is pc-A 2. at B: qemu-version-B -machine pc-A [-cpu whatever] -feature1 -feature2 B run new qemu-version-B (w/ new features 'feature1' and 'feature2'), but when B runs w/ compat '-machine pc-A', vm should not see 'feature1' and 'feature2', so commandline append string to cpu model '-cpu whatever -feature1 -feature2' to hidden new feature1 and feature2 to vm, hence vm can see same cpuid features (at B) as those at A (which means, no feature1, no feature2) = If my understanding of your thoughts is right, I think currently qemu has satisfied your target, code refer to pc_cpus_init(cpu_model) .. cpu_init(cpu_model) -- cpu_x86_register(*env, cpu_model) -- cpu_x86_find_by_name(*def, cpu_model) // parse '+/- features', generate feature masks plus_features... // and minus_features...(this is feature exclusion masks you want) I think your point 'define in the compat machine description' is unnecessary. 
The user would have to specify the new feature as exclusions *manually* on the command line if -machine pc-A doesn't inject them *automatically*. So it is necessary to enhance qemu in this regard.

...

You suggest 'append a string of -feature1,-feature2 to the cpu model that is specified on creation' at your last email. Could you tell me other way user exclude features? I only know qemu command line :-(

I was thinking of something like

diff --git a/hw/boards.h b/hw/boards.h
index 667177d..2bae071 100644
--- a/hw/boards.h
+++ b/hw/boards.h
@@ -28,6 +28,7 @@ typedef struct QEMUMachine {
     int is_default;
     const char *default_machine_opts;
     GlobalProperty *compat_props;
+    const char *compat_cpu_features;
     struct QEMUMachine *next;
 } QEMUMachine;

diff --git a/hw/pc.c b/hw/pc.c
index bb9867b..4d11559 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -949,8 +949,9 @@ static CPUState *pc_new_cpu(const char *cpu_model)
     return env;
 }

-void pc_cpus_init(const char *cpu_model)
+void pc_cpus_init(const char *cpu_model, const char *append_features)
 {
+    char *model_and_features;
     int i;

     /* init CPUs */
@@ -961,10 +962,13 @@ void pc_cpus_init(const char *cpu_model)
         cpu_model = "qemu32";
 #endif
     }
+    model_and_features = g_strconcat(cpu_model, ",", append_features, NULL);

     for(i = 0; i < smp_cpus; i++) {
-        pc_new_cpu(cpu_model);
+        pc_new_cpu(model_and_features);
     }
+
+    g_free(model_and_features);
 }

 void pc_memory_init(MemoryRegion *system_memory,

However, getting machine.compat_cpu_features to pc_cpus_init is rather ugly. And we will have CPU devices with real properties soon. Then the compat feature string could be passed that way, without changing any machine init function.

Andreas, do you expect CPU devices to be ready for qemu 1.1? We would need them to pass a feature exclusion mask from machine.compat_props to the (x86) CPU init code.

cpu devices is just another format of current cpu_model. It helps nothing to our problem.
Again, the point is: by what method can the feature exclusion mask be generated if the user does not give a hint manually?

Thanks,
Jinsong

Well, given that introducing some intermediate solution for this would be complex and hacky and that there is a way to configure tsc_deadline for old machines away, though only an explicit one, I could live with postponing the feature mask after the CPU device conversion. But the last word will have the maintainers.

Jan
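As a rough illustration of the "append a compat feature string to the cpu model" idea from this message, here is a standalone sketch. It is not QEMU code: `build_cpu_model` is a hypothetical helper, it uses plain libc instead of glib, and it assumes that a missing or empty compat string should leave the model name untouched. (With the g_strconcat call in the diff, a NULL append_features would end the argument list early and yield a trailing comma.)

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of QEMU): returns a newly allocated
 * "model,features" string, or a plain copy of the model when there is
 * nothing to append. The caller releases the result with free().
 * Returns NULL if allocation fails.
 */
static char *build_cpu_model(const char *cpu_model, const char *append_features)
{
    size_t len;
    char *out;

    assert(cpu_model != NULL);

    if (append_features == NULL || append_features[0] == '\0') {
        out = malloc(strlen(cpu_model) + 1);
        if (out != NULL) {
            strcpy(out, cpu_model);
        }
        return out;
    }

    /* +2: one byte for the ',' separator, one for the terminating NUL */
    len = strlen(cpu_model) + strlen(append_features) + 2;
    out = malloc(len);
    if (out != NULL) {
        snprintf(out, len, "%s,%s", cpu_model, append_features);
    }
    return out;
}
```

A compat machine description could then carry a string such as "-tsc_deadline", so that "-machine pc-A -cpu kvm64" would be handed to the cpu init code as "kvm64,-tsc_deadline" without the user typing the exclusion.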
RE: [Qemu-devel] [PATCH 2/2] Expose tsc deadline timer cpuid to guest
Jan Kiszka wrote: On 2012-03-06 08:49, Liu, Jinsong wrote: Jan, Any comments? I feel some confused about your point 'disable cpuid feature for older machine types by default': are you planning a common approach for this common issue, or, you just ask me a specific solution for the tsc deadline timer case? I think a generic solution for this can be as simple as passing a feature exclusion mask to cpu_init. You could simple append a string of -feature1,-feature2 to the cpu model that is specified on creation. And that string could be defined in the compat machine descriptions. Does this make sense? Jan, to prevent misunderstanding, I elaborate my understanding of your points below (if any misunderstanding please point out to me): = Your target is, to migrate from A(old qemu) to B(new qemu) by 1. at A: qemu-version-A [-cpu whatever] // currently the default machine type is pc-A 2. at B: qemu-version-B -machine pc-A [-cpu whatever] -feature1 -feature2 B run new qemu-version-B (w/ new features 'feature1' and 'feature2'), but when B runs w/ compat '-machine pc-A', vm should not see 'feature1' and 'feature2', so commandline append string to cpu model '-cpu whatever -feature1 -feature2' to hidden new feature1 and feature2 to vm, hence vm can see same cpuid features (at B) as those at A (which means, no feature1, no feature2) = If my understanding of your thoughts is right, I think currently qemu has satisfied your target, code refer to pc_cpus_init(cpu_model) .. cpu_init(cpu_model) -- cpu_x86_register(*env, cpu_model) -- cpu_x86_find_by_name(*def, cpu_model) // parse '+/- features', generate feature masks plus_features... // and minus_features...(this is feature exclusion masks you want) I think your point 'define in the compat machine description' is unnecessary. As for 'tsc deadline' feature exposing, my patch (as attached) just obey qemu general cpuid exposing method, and also satisfied your target I think. 
Thanks,
Jinsong

Attachment: 0001-Expose-tsc-deadline-timer-feature-to-guest.patch
RE: [Qemu-devel] [PATCH 2/2] Expose tsc deadline timer cpuid to guest
Jan Kiszka wrote: On 2012-03-09 19:27, Liu, Jinsong wrote: Jan Kiszka wrote: On 2012-03-06 08:49, Liu, Jinsong wrote: Jan, Any comments? I feel some confused about your point 'disable cpuid feature for older machine types by default': are you planning a common approach for this common issue, or, you just ask me a specific solution for the tsc deadline timer case? I think a generic solution for this can be as simple as passing a feature exclusion mask to cpu_init. You could simple append a string of -feature1,-feature2 to the cpu model that is specified on creation. And that string could be defined in the compat machine descriptions. Does this make sense? Jan, to prevent misunderstanding, I elaborate my understanding of your points below (if any misunderstanding please point out to me): = Your target is, to migrate from A(old qemu) to B(new qemu) by 1. at A: qemu-version-A [-cpu whatever] // currently the default machine type is pc-A 2. at B: qemu-version-B -machine pc-A [-cpu whatever] -feature1 -feature2 B run new qemu-version-B (w/ new features 'feature1' and 'feature2'), but when B runs w/ compat '-machine pc-A', vm should not see 'feature1' and 'feature2', so commandline append string to cpu model '-cpu whatever -feature1 -feature2' to hidden new feature1 and feature2 to vm, hence vm can see same cpuid features (at B) as those at A (which means, no feature1, no feature2) = If my understanding of your thoughts is right, I think currently qemu has satisfied your target, code refer to pc_cpus_init(cpu_model) .. cpu_init(cpu_model) -- cpu_x86_register(*env, cpu_model) -- cpu_x86_find_by_name(*def, cpu_model) // parse '+/- features', generate feature masks plus_features... // and minus_features...(this is feature exclusion masks you want) I think your point 'define in the compat machine description' is unnecessary. The user would have to specify the new feature as exclusions *manually* on the command line if -machine pc-A doesn't inject them *automatically*. 
So it is necessary to enhance qemu in this regard.

...

You suggest 'append a string of -feature1,-feature2 to the cpu model that is specified on creation' at your last email. Could you tell me other way user exclude features? I only know qemu command line :-(

Thanks,
Jinsong
RE: [Qemu-devel] [PATCH 2/2] Expose tsc deadline timer cpuid to guest
Liu, Jinsong wrote: Jan Kiszka wrote: On 2012-03-06 08:49, Liu, Jinsong wrote: Jan, Any comments? I feel some confused about your point 'disable cpuid feature for older machine types by default': are you planning a common approach for this common issue, or, you just ask me a specific solution for the tsc deadline timer case? I think a generic solution for this can be as simple as passing a feature exclusion mask to cpu_init. You could simple append a string of -feature1,-feature2 to the cpu model that is specified on creation. And that string could be defined in the compat machine descriptions. Does this make sense? Jan, to prevent misunderstanding, I elaborate my understanding of your points below (if any misunderstanding please point out to me): = Your target is, to migrate from A(old qemu) to B(new qemu) by 1. at A: qemu-version-A [-cpu whatever] // currently the default machine type is pc-A 2. at B: qemu-version-B -machine pc-A [-cpu whatever] -feature1 -feature2 B run new qemu-version-B (w/ new features 'feature1' and 'feature2'), but when B runs w/ compat '-machine pc-A', vm should not see 'feature1' and 'feature2', so commandline append string to cpu model '-cpu whatever -feature1 -feature2' to hidden new feature1 and feature2 to vm, hence vm can see same cpuid features (at B) as those at A (which means, no feature1, no feature2) = BTW, any misunderstanding or something wrong about my understanding of your target? please help me confirm. I want to make sure we are talking same thing. Thanks, Jinsong If my understanding of your thoughts is right, I think currently qemu has satisfied your target, code refer to pc_cpus_init(cpu_model) .. cpu_init(cpu_model) -- cpu_x86_register(*env, cpu_model) -- cpu_x86_find_by_name(*def, cpu_model) // parse '+/- features', generate feature masks plus_features... // and minus_features...(this is feature exclusion masks you want) I think your point 'define in the compat machine description' is unnecessary. 
As for 'tsc deadline' feature exposing, my patch (as attached) just obey qemu general cpuid exposing method, and also satisfied your target I think.

Thanks,
Jinsong
RE: [PATCH] KVM: expose Intel cpu new features to guest
Avi,

Any comments?

Thanks,
Jinsong

Liu, Jinsong wrote:

From ecd8be962f69393c183f941bfdbd7a7d3876d442 Mon Sep 17 00:00:00 2001
From: Liu, Jinsong jinsong@intel.com
Date: Mon, 27 Feb 2012 05:19:32 +0800
Subject: [PATCH] KVM: expose Intel cpu new features to guest

Intel recently released 2 new features, HLE and RTM.
Refer to http://software.intel.com/file/41417.
This patch exposes them to the guest.

Signed-off-by: Liu, Jinsong jinsong@intel.com
---
 arch/x86/include/asm/cpufeature.h |    2 ++
 arch/x86/kvm/cpuid.c              |    3 ++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 17c5d4b..e8d12a8 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -198,10 +198,12 @@
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */
 #define X86_FEATURE_FSGSBASE (9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/
 #define X86_FEATURE_BMI1     (9*32+ 3) /* 1st group bit manipulation extensions */
+#define X86_FEATURE_HLE      (9*32+ 4) /* Hardware Lock Elision */
 #define X86_FEATURE_AVX2     (9*32+ 5) /* AVX2 instructions */
 #define X86_FEATURE_SMEP     (9*32+ 7) /* Supervisor Mode Execution Protection */
 #define X86_FEATURE_BMI2     (9*32+ 8) /* 2nd group bit manipulation extensions */
 #define X86_FEATURE_ERMS     (9*32+ 9) /* Enhanced REP MOVSB/STOSB */
+#define X86_FEATURE_RTM      (9*32+11) /* Restricted Transactional Memory */

 #if defined(__KERNEL__) && !defined(__ASSEMBLY__)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 9fed5be..c2134b8 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -247,7 +247,8 @@ static int do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,

     /* cpuid 7.0.ebx */
     const u32 kvm_supported_word9_x86_features =
-        F(FSGSBASE) | F(BMI1) | F(AVX2) | F(SMEP) | F(BMI2) | F(ERMS);
+        F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
+        F(BMI2) | F(ERMS) | F(RTM);

     /* all calls to cpuid_count() should be made on the same cpu */
     get_cpu();
RE: [Qemu-devel] [PATCH 2/2] Expose tsc deadline timer cpuid to guest
Jan, Any comments? I feel some confused about your point 'disable cpuid feature for older machine types by default': are you planning a common approach for this common issue, or, you just ask me a specific solution for the tsc deadline timer case? Thanks, Jinsong Liu, Jinsong wrote: My point is that qemu-version-A [-cpu whatever] should provide the same VM as qemu-version-B -machine pc-A [-cpu whatever] specifically if you leave out the cpu specification. So the compat machine could establish a feature mask (e.g. append some -tsc_deadline in this case). But, indeed, we need a new channel for this. Yes, if such requirement need to be satisfied, I agree we need a new channel to solve this kind of common issue. As for tsc deadline timer feature exposing, I write an updated patch as attached. 1). It exposes tsc deadline timer feature to guest if in-kernel irqchip is used and kvm has emulated tsc deadline timer; 2). It also authorizes user to control the feature exposing via a cpu feature flag; Thanks, Jinsong From 5b7d5f459b621686e78e437010ce34748bcb9e8e Mon Sep 17 00:00:00 2001 From: Liu, Jinsong jinsong@intel.com Date: Wed, 29 Feb 2012 01:53:15 +0800 Subject: [PATCH] Expose tsc deadline timer feature to guest It exposes tsc deadline timer feature to guest if in-kernel irqchip is used and kvm has emulated tsc deadline timer. It also authorizes user to control the feature exposing via a cpu feature flag. 
Signed-off-by: Liu, Jinsong jinsong@intel.com
---
 target-i386/cpu.h   |    1 +
 target-i386/cpuid.c |    2 +-
 target-i386/kvm.c   |    4 ++++
 3 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index d92be5d..3409afe 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -399,6 +399,7 @@
 #define CPUID_EXT_X2APIC   (1 << 21)
 #define CPUID_EXT_MOVBE    (1 << 22)
 #define CPUID_EXT_POPCNT   (1 << 23)
+#define CPUID_EXT_TSC_DEADLINE_TIMER (1 << 24)
 #define CPUID_EXT_XSAVE    (1 << 26)
 #define CPUID_EXT_OSXSAVE  (1 << 27)
 #define CPUID_EXT_HYPERVISOR  (1 << 31)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index b9bfeaf..ac4b79c 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -50,7 +50,7 @@ static const char *ext_feature_name[] = {
     "fma", "cx16", "xtpr", "pdcm",
     NULL, NULL, "dca", "sse4.1|sse4_1",
     "sse4.2|sse4_2", "x2apic", "movbe", "popcnt",
-    NULL, "aes", "xsave", "osxsave",
+    "tsc_deadline", "aes", "xsave", "osxsave",
     "avx", NULL, NULL, "hypervisor",
 };
 static const char *ext2_feature_name[] = {
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 7079e87..2639699 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -370,6 +370,10 @@ int kvm_arch_init_vcpu(CPUState *env)
     i = env->cpuid_ext_features & CPUID_EXT_HYPERVISOR;
     env->cpuid_ext_features = kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX);
     env->cpuid_ext_features |= i;
+    if (!kvm_irqchip_in_kernel() ||
+        !kvm_check_extension(s, KVM_CAP_TSC_DEADLINE_TIMER)) {
+        env->cpuid_ext_features &= ~CPUID_EXT_TSC_DEADLINE_TIMER;
+    }

     env->cpuid_ext2_features = kvm_arch_get_supported_cpuid(s, 0x80000001, 0, R_EDX);
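The rule in the kvm.c hunk of this patch can be restated as a small pure function, which may make the two conditions easier to check. This is only a sketch: `filter_ext_features` and its boolean parameters are made-up stand-ins for kvm_irqchip_in_kernel() and kvm_check_extension(s, KVM_CAP_TSC_DEADLINE_TIMER), and the bit position is the one given by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 24, as defined in the patch */
#define CPUID_EXT_TSC_DEADLINE_TIMER (UINT32_C(1) << 24)

/*
 * Hypothetical restatement of the hunk: keep the TSC deadline timer bit
 * only when the in-kernel irqchip is in use AND the kernel reports the
 * capability. All other feature bits pass through unchanged.
 */
static uint32_t filter_ext_features(uint32_t ext_features,
                                    bool irqchip_in_kernel,
                                    bool has_tsc_deadline_cap)
{
    if (!irqchip_in_kernel || !has_tsc_deadline_cap) {
        ext_features &= ~CPUID_EXT_TSC_DEADLINE_TIMER;
    }
    return ext_features;
}
```

Note that the function never sets the bit; it only clears it. Whether the bit is present to begin with still depends on the cpu model and on any +/-tsc_deadline flag given by the user.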
RE: [PATCH 2/2] Expose tsc deadline timer cpuid to guest
My point is that qemu-version-A [-cpu whatever] should provide the same VM as qemu-version-B -machine pc-A [-cpu whatever] specifically if you leave out the cpu specification. So the compat machine could establish a feature mask (e.g. append some -tsc_deadline in this case). But, indeed, we need a new channel for this. Yes, if such requirement need to be satisfied, I agree we need a new channel to solve this kind of common issue. As for tsc deadline timer feature exposing, I write an updated patch as attached. 1). It exposes tsc deadline timer feature to guest if in-kernel irqchip is used and kvm has emulated tsc deadline timer; 2). It also authorizes user to control the feature exposing via a cpu feature flag; Thanks, Jinsong From 5b7d5f459b621686e78e437010ce34748bcb9e8e Mon Sep 17 00:00:00 2001 From: Liu, Jinsong jinsong@intel.com Date: Wed, 29 Feb 2012 01:53:15 +0800 Subject: [PATCH] Expose tsc deadline timer feature to guest It exposes tsc deadline timer feature to guest if in-kernel irqchip is used and kvm has emulated tsc deadline timer. It also authorizes user to control the feature exposing via a cpu feature flag. 
Signed-off-by: Liu, Jinsong jinsong@intel.com
---
 target-i386/cpu.h   |    1 +
 target-i386/cpuid.c |    2 +-
 target-i386/kvm.c   |    4 ++++
 3 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index d92be5d..3409afe 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -399,6 +399,7 @@
 #define CPUID_EXT_X2APIC   (1 << 21)
 #define CPUID_EXT_MOVBE    (1 << 22)
 #define CPUID_EXT_POPCNT   (1 << 23)
+#define CPUID_EXT_TSC_DEADLINE_TIMER (1 << 24)
 #define CPUID_EXT_XSAVE    (1 << 26)
 #define CPUID_EXT_OSXSAVE  (1 << 27)
 #define CPUID_EXT_HYPERVISOR  (1 << 31)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index b9bfeaf..ac4b79c 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -50,7 +50,7 @@ static const char *ext_feature_name[] = {
     "fma", "cx16", "xtpr", "pdcm",
     NULL, NULL, "dca", "sse4.1|sse4_1",
     "sse4.2|sse4_2", "x2apic", "movbe", "popcnt",
-    NULL, "aes", "xsave", "osxsave",
+    "tsc_deadline", "aes", "xsave", "osxsave",
     "avx", NULL, NULL, "hypervisor",
 };
 static const char *ext2_feature_name[] = {
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 7079e87..2639699 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -370,6 +370,10 @@ int kvm_arch_init_vcpu(CPUState *env)
     i = env->cpuid_ext_features & CPUID_EXT_HYPERVISOR;
     env->cpuid_ext_features = kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX);
     env->cpuid_ext_features |= i;
+    if (!kvm_irqchip_in_kernel() ||
+        !kvm_check_extension(s, KVM_CAP_TSC_DEADLINE_TIMER)) {
+        env->cpuid_ext_features &= ~CPUID_EXT_TSC_DEADLINE_TIMER;
+    }

     env->cpuid_ext2_features = kvm_arch_get_supported_cpuid(s, 0x80000001, 0, R_EDX);
--
1.7.1

Attachment: 0001-Expose-tsc-deadline-timer-feature-to-guest.patch
RE: [PATCH 2/2] Expose tsc deadline timer cpuid to guest
Jan Kiszka wrote: On 2012-01-07 19:23, Liu, Jinsong wrote: Jan Kiszka wrote: On 2012-01-05 18:07, Liu, Jinsong wrote: Sorry, it remains bogus to expose the tsc deadline timer feature on machines pc-1.1. That's just like we introduced kvmclock only to pc-0.14 onward. The reason is that guest OSes so far running on qemu-1.0 or older without deadline timer support must not find that feature when being migrated to a host with qemu-1.1 in pc-1.0 compat mode. Yes, the user can explicitly disable it, but that is not the idea of legacy machine models. They should provide the very same environment that older qemu versions offered. Not quite clear about this point. Per my understanding, if a kvm guest running on an older qemu without tsc deadline timer support, then after migrate, the guest would still cannot find tsc deadline feature, no matter older or newer host/qemu/pc-xx it migrate to. What should prevent this? The feature flags are not part of the vmstate. They are part of the vm configuration which is not migrated but defined by starting qemu on the target host. Thanks! understand this point (They are part of the vm configuration which is not migrated but defined by starting qemu on the target host). But kvmclock example still cannot satisfy the purpose guest running on old qemu/pc-0.13 without kvmclock support must not find kvmclock feature when being migrated to a host with new qemu/pc-0.13 compat mode. After migration, guest can possibily find kvmclock feature CPUID.0x4001.KVM_FEATURE_CLOCKSOURCE: pc_init1(..., kvmclock_enabled) { pc_cpus_init(cpu_model);// the point to decide and expose cpuid features to guest if (kvmclock_enabled) {// the difference point between pc-0.13 vs. pc-0.14, related nothing to cpuid features. kvmclock_create(); } } Right, not a perfect example: the cpuid feature is not influenced by this mechanism, only the fact if a kvmclock device (for save/restore) should be created. 
I guess we ignored this back then, only focusing on the more obvious issue of the additional device.

Seems currently there is no good way to satisfy guest running on old qemu/pc-xx without feature A support must not find feature A when being migrated to a host with new qemu/pc-xx compat mode, i.e. considering

* if running with '-cpu host' then migrate;
* each time we add a new cpuid feature it need add one or more new machine model? is it necessary to bind pc-xx with cpuid feature?
* logically cpuid features should better be controlled by cpu model, not by machine model.

The compatibility machines define the possible cpu models. If I select

How does machine define possible cpu models? cpu model defined by qemu option '-cpu ...', while machine model defined by '-machine ...'

pc-0.14, e.g. -cpu kvm64 should not give me features that 0.14 was not exposing.

In such a case, it's '-cpu kvm64' that takes effect to decide what cpuid features would be exposed to guest, not '-machine pc-0.14'.

IMO, what our patch needs to do is to expose a cpuid feature to guest (CPUID.01H:ECX.TSC_Deadline[bit 24]); it is decided by cpu model, not machine model:

    pc_init1(..., cpu_model, ...)
    {
        pc_cpus_init(cpu_model); // this is the whole logic exposing cpuid features to guest
        ...
    }

Am I misunderstanding something?

Thanks,
Jinsong
[PATCH] KVM: expose Intel cpu new features to guest
From ecd8be962f69393c183f941bfdbd7a7d3876d442 Mon Sep 17 00:00:00 2001
From: Liu, Jinsong jinsong@intel.com
Date: Mon, 27 Feb 2012 05:19:32 +0800
Subject: [PATCH] KVM: expose Intel cpu new features to guest

Intel recently released 2 new features, HLE and RTM.
Refer to http://software.intel.com/file/41417.
This patch exposes them to the guest.

Signed-off-by: Liu, Jinsong jinsong@intel.com
---
 arch/x86/include/asm/cpufeature.h |    2 ++
 arch/x86/kvm/cpuid.c              |    3 ++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 17c5d4b..e8d12a8 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -198,10 +198,12 @@
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */
 #define X86_FEATURE_FSGSBASE (9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/
 #define X86_FEATURE_BMI1     (9*32+ 3) /* 1st group bit manipulation extensions */
+#define X86_FEATURE_HLE      (9*32+ 4) /* Hardware Lock Elision */
 #define X86_FEATURE_AVX2     (9*32+ 5) /* AVX2 instructions */
 #define X86_FEATURE_SMEP     (9*32+ 7) /* Supervisor Mode Execution Protection */
 #define X86_FEATURE_BMI2     (9*32+ 8) /* 2nd group bit manipulation extensions */
 #define X86_FEATURE_ERMS     (9*32+ 9) /* Enhanced REP MOVSB/STOSB */
+#define X86_FEATURE_RTM      (9*32+11) /* Restricted Transactional Memory */

 #if defined(__KERNEL__) && !defined(__ASSEMBLY__)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 9fed5be..c2134b8 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -247,7 +247,8 @@ static int do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,

     /* cpuid 7.0.ebx */
     const u32 kvm_supported_word9_x86_features =
-        F(FSGSBASE) | F(BMI1) | F(AVX2) | F(SMEP) | F(BMI2) | F(ERMS);
+        F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
+        F(BMI2) | F(ERMS) | F(RTM);

     /* all calls to cpuid_count() should be made on the same cpu */
     get_cpu();
--
1.7.1

Attachment: 0001-KVM-expose-Intel-cpu-new-features-to-guest.patch
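For readers unfamiliar with the (9*32+N) notation in this patch: each feature number encodes a 32-bit capability word index and a bit position within that word. The sketch below shows the decoding under that assumption. The helper names (`feature_word`, `feature_mask`, `filter_supported`) are invented for illustration and are a simplified model, not the kernel's actual F() macro.

```c
#include <assert.h>
#include <stdint.h>

/* Feature numbers as defined in the patch: word 9 is CPUID.(EAX=7,ECX=0):EBX */
#define X86_FEATURE_HLE (9*32 + 4)
#define X86_FEATURE_RTM (9*32 + 11)

/* Index of the 32-bit capability word that holds the feature. */
static unsigned feature_word(unsigned feature)
{
    return feature / 32;
}

/* Single-bit mask of the feature inside its capability word. */
static uint32_t feature_mask(unsigned feature)
{
    return UINT32_C(1) << (feature % 32);
}

/*
 * Hypothetical filter: a guest only sees bits that the host CPU reports
 * and that the hypervisor lists as supported.
 */
static uint32_t filter_supported(uint32_t host_bits, uint32_t supported_bits)
{
    return host_bits & supported_bits;
}
```

Under this model, adding HLE and RTM to the supported word simply widens the mask, so the two bits survive the filter when the host CPU reports them.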
RE: [PATCH 2/2] Expose tsc deadline timer cpuid to guest
Jan Kiszka wrote:
> On 2012-01-05 18:07, Liu, Jinsong wrote:
>>> Sorry, it remains bogus to expose the tsc deadline timer feature on machines < pc-1.1. That's just like we introduced kvmclock only to pc-0.14 onward. The reason is that guest OSes so far running on qemu-1.0 or older without deadline timer support must not find that feature when being migrated to a host with qemu-1.1 in pc-1.0 compat mode. Yes, the user can explicitly disable it, but that is not the idea of legacy machine models. They should provide the very same environment that older qemu versions offered.
>>
>> Not quite clear about this point. Per my understanding, if a kvm guest is running on an older qemu without tsc deadline timer support, then after migration the guest still cannot find the tsc deadline feature, no matter which older or newer host/qemu/pc-xx it migrates to.
>
> What should prevent this? The feature flags are not part of the vmstate. They are part of the vm configuration which is not migrated but defined by starting qemu on the target host.

Thanks! I understand this point now (the feature flags are part of the vm configuration, which is not migrated but defined by starting qemu on the target host).

But the kvmclock example still cannot satisfy the requirement that a "guest running on old qemu/pc-0.13 without kvmclock support must not find the kvmclock feature when being migrated to a host with new qemu in pc-0.13 compat mode". After migration, the guest can possibly find the kvmclock feature CPUID.0x40000001.KVM_FEATURE_CLOCKSOURCE:

    pc_init1(..., kvmclock_enabled)
    {
        pc_cpus_init(cpu_model);  // the point where cpuid features are decided and exposed to the guest
        if (kvmclock_enabled) {   // the only difference between pc-0.13 and pc-0.14; unrelated to cpuid features
            kvmclock_create();
        }
    }

It seems there is currently no good way to guarantee that a "guest running on old qemu/pc-xx without feature A support must not find feature A when being migrated to a host with new qemu in pc-xx compat mode", considering:
* what happens if the guest runs with '-cpu host' and is then migrated;
* each time we add a new cpuid feature, do we need to add one or more new machine models? Is it necessary to bind pc-xx to cpuid features?
* logically, cpuid features are better controlled by the cpu model, not by the machine model.

Regards,
Jinsong
RE: [PATCH 2/2] Expose tsc deadline timer cpuid to guest
This requires some logic change and then rewording: - enable TSC deadline timer support by default if in-kernel irqchip is used - disable it on user request via a cpu feature flag Yes, the logic has been implemented by the former patch as: +if (env-tsc_deadline_timer_enabled) { // user control, default is to authorize tsc deadline timer feature +if (kvm_irqchip_in_kernel() // in-kerenl irqchip is used +kvm_check_extension(s, KVM_CAP_TSC_DEADLINE_TIMER)) { +env-cpuid_ext_features |= CPUID_EXT_TSC_DEADLINE_TIMER; +} +} - disable it for older machine types (see below) by default TSC deadline timer emulation in user space is a different story to be told once we have a patch for it. Signed-off-by: Liu, Jinsong jinsong@intel.com --- target-i386/cpu.h |2 ++ target-i386/cpuid.c |7 ++- target-i386/kvm.c | 13 + 3 files changed, 21 insertions(+), 1 deletions(-) diff --git a/target-i386/cpu.h b/target-i386/cpu.h index 177d8aa..f2d0ad5 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -399,6 +399,7 @@ #define CPUID_EXT_X2APIC (1 21) #define CPUID_EXT_MOVBE(1 22) #define CPUID_EXT_POPCNT (1 23) +#define CPUID_EXT_TSC_DEADLINE_TIMER (1 24) #define CPUID_EXT_XSAVE(1 26) #define CPUID_EXT_OSXSAVE (1 27) #define CPUID_EXT_HYPERVISOR (1 31) @@ -693,6 +694,7 @@ typedef struct CPUX86State { uint64_t tsc; uint64_t tsc_deadline; +bool tsc_deadline_timer_enabled; uint64_t mcg_status; uint64_t msr_ia32_misc_enable; diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c index 0b3af90..fe749e0 100644 --- a/target-i386/cpuid.c +++ b/target-i386/cpuid.c @@ -48,7 +48,7 @@ static const char *ext_feature_name[] = { fma, cx16, xtpr, pdcm, NULL, NULL, dca, sse4.1|sse4_1, sse4.2|sse4_2, x2apic, movbe, popcnt, -NULL, aes, xsave, osxsave, +tsc_deadline, aes, xsave, osxsave, avx, NULL, NULL, hypervisor, }; static const char *ext2_feature_name[] = { @@ -225,6 +225,7 @@ typedef struct x86_def_t { int model; int stepping; int tsc_khz; +bool tsc_deadline_timer_enabled; uint32_t features, 
ext_features, ext2_features, ext3_features; uint32_t kvm_features, svm_features; uint32_t xlevel; @@ -742,6 +743,9 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model) x86_cpu_def-ext3_features = ~minus_ext3_features; x86_cpu_def-kvm_features = ~minus_kvm_features; x86_cpu_def-svm_features = ~minus_svm_features; +/* Defaultly user don't against tsc_deadline_timer */ +x86_cpu_def-tsc_deadline_timer_enabled = +!(minus_ext_features CPUID_EXT_TSC_DEADLINE_TIMER); if (check_cpuid) { if (check_features_against_host(x86_cpu_def) enforce_cpuid) goto error; @@ -885,6 +889,7 @@ int cpu_x86_register (CPUX86State *env, const char *cpu_model) env-cpuid_ext4_features = def-ext4_features; env-cpuid_xlevel2 = def-xlevel2; env-tsc_khz = def-tsc_khz; + env-tsc_deadline_timer_enabled = def-tsc_deadline_timer_enabled; if (!kvm_enabled()) { env-cpuid_features = TCG_FEATURES; env-cpuid_ext_features = TCG_EXT_FEATURES; diff --git a/target-i386/kvm.c b/target-i386/kvm.c index d50de90..79baf0b 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -370,6 +370,19 @@ int kvm_arch_init_vcpu(CPUState *env) i = env-cpuid_ext_features CPUID_EXT_HYPERVISOR; env-cpuid_ext_features = kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX); env-cpuid_ext_features |= i; +/* + * 1. Considering live migration, user enable/disable tsc deadline timer; + * 2. If guest use kvm apic and kvm emulate tsc deadline timer, expose it; + * 3. If in the future qemu support tsc deadline timer emulation, + *and guest use qemu apic, add cpuid exposing case then. + */ See above. Also, I don't think this comment applies very well to this function. Yes, the comment is indeed ambiguous. Would elaborate more clear. +env-cpuid_ext_features = ~CPUID_EXT_TSC_DEADLINE_TIMER; Can that feature possibly be set in cpuid_ext_features? I thought the kernel now refrains from this. Yes, it's possible. Kernel didn't refrain it, just let qemu to make decision. 
+if (env-tsc_deadline_timer_enabled) { +if (kvm_irqchip_in_kernel() +kvm_check_extension(s, KVM_CAP_TSC_DEADLINE_TIMER)) { +env-cpuid_ext_features |= CPUID_EXT_TSC_DEADLINE_TIMER; +} +} env-cpuid_ext2_features = kvm_arch_get_supported_cpuid(s, 0x8001, 0, R_EDX); Sorry, it remains bogus to expose the tsc deadline timer feature on machines pc-1.1. That's just like we introduced kvmclock only to pc-0.14 onward. The reason is that guest OSes so far
RE: [PATCH] Expose tsc deadline timer cpuid to guest
Jan Kiszka wrote:
> On 2011-12-28 18:35, Liu, Jinsong wrote:
>>>> diff --git a/qemu-kvm.h b/qemu-kvm.h
>>>> index 2bd5602..8c6c2ea 100644
>>>> --- a/qemu-kvm.h
>>>> +++ b/qemu-kvm.h
>>>> @@ -260,6 +260,7 @@ extern int kvm_irqchip;
>>>>  extern int kvm_pit;
>>>>  extern int kvm_pit_reinject;
>>>>  extern unsigned int kvm_shadow_memory;
>>>> +extern int tsc_deadline_timer;
>>>>
>>>>  int kvm_handle_tpr_access(CPUState *env);
>>>>  void kvm_tpr_enable_vapic(CPUState *env);
>>>> diff --git a/qemu-options.hx b/qemu-options.hx
>>>> index f6df6b9..eff6644 100644
>>>> --- a/qemu-options.hx
>>>> +++ b/qemu-options.hx
>>>> @@ -2619,6 +2619,9 @@ DEF("no-kvm-pit-reinjection", 0, QEMU_OPTION_no_kvm_pit_reinjection,
>>>>      "-no-kvm-pit-reinjection\n"
>>>>      " disable KVM kernel mode PIT interrupt reinjection\n",
>>>>      QEMU_ARCH_I386)
>>>> +DEF("no-tsc-deadline-timer", 0, QEMU_OPTION_no_tsc_deadline_timer,
>>>> +    "-no-tsc-deadline-timer disable tsc deadline timer\n",
>>>> +    QEMU_ARCH_I386)
>>>
>>> Hmm, I would really prefer to stop adding switches like this. They won't make it upstream anyway.
>>
>> OK, I will try to write a patch w/ better user control cpuid method, i.e. by plus_features and minus_features.
>
> Yep, that would be better.

Thanks, the patch has been updated according to the above idea and sent out.

>>> Can't this control be attached to legacy qemu machine models, ie. here anything <= pc-1.0? See how we handle kvmclock.
>>
>> You mean, by adding input para like pc_init1(..., kvmclock_enabled, tscdeadline_enabled)? I think that's not a good way.
>
> I think it is mandatory as older qemu versions won't expose tscdeadline to the guest, thus newer versions must not do this when emulating older machines.

Hmm, if an old qemu machine runs on a new host platform (say, with -cpu host), it would expose many *new* cpuid features to the guest. IMO, a qemu machine is a *virtual* platform for the guest, and the tsc deadline timer is a cpuid feature; it does not need to be bound to a particular qemu machine version.

Whether the tsc deadline timer cpuid bit is exposed to the guest should be decided only by:
1. user authorization (enabled by default)
2. kvm_irqchip_in_kernel
3. KVM_CAP_TSC_DEADLINE_TIMER

If all three hold, it can be exposed to the guest, and that would not break anything no matter what the qemu machine version is.

Thanks,
Jinsong

>> With more and more cpuid features (N) controlled in this way, machine models would be 2^N.
>
> We likely need a better way to express this via code, I agree. Likely something declarative as for compat_props.
RE: [PATCH] Expose tsc deadline timer cpuid to guest
>> diff --git a/qemu-kvm.h b/qemu-kvm.h
>> index 2bd5602..8c6c2ea 100644
>> --- a/qemu-kvm.h
>> +++ b/qemu-kvm.h
>> @@ -260,6 +260,7 @@ extern int kvm_irqchip;
>>  extern int kvm_pit;
>>  extern int kvm_pit_reinject;
>>  extern unsigned int kvm_shadow_memory;
>> +extern int tsc_deadline_timer;
>>
>>  int kvm_handle_tpr_access(CPUState *env);
>>  void kvm_tpr_enable_vapic(CPUState *env);
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index f6df6b9..eff6644 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -2619,6 +2619,9 @@ DEF("no-kvm-pit-reinjection", 0, QEMU_OPTION_no_kvm_pit_reinjection,
>>      "-no-kvm-pit-reinjection\n"
>>      " disable KVM kernel mode PIT interrupt reinjection\n",
>>      QEMU_ARCH_I386)
>> +DEF("no-tsc-deadline-timer", 0, QEMU_OPTION_no_tsc_deadline_timer,
>> +    "-no-tsc-deadline-timer disable tsc deadline timer\n",
>> +    QEMU_ARCH_I386)
>
> Hmm, I would really prefer to stop adding switches like this. They won't make it upstream anyway.

OK, I will try to write a patch w/ better user control cpuid method, i.e. by plus_features and minus_features.

> Can't this control be attached to legacy qemu machine models, ie. here anything <= pc-1.0? See how we handle kvmclock.

You mean, by adding input para like pc_init1(..., kvmclock_enabled, tscdeadline_enabled)? I think that's not a good way. With more and more cpuid features (N) controlled in this way, machine models would be 2^N.

Thanks,
Jinsong
[PATCH 1/2] Define KVM_CAP_TSC_DEADLINE_TIMER
From 5afecc308bc25c7fd8d124e7557f08fb067d6caa Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Thu, 29 Dec 2011 01:45:45 +0800
Subject: [PATCH 1/2] Define KVM_CAP_TSC_DEADLINE_TIMER

Signed-off-by: Liu, Jinsong <jinsong@intel.com>
Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 linux-headers/linux/kvm.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index a8761d3..1d3a4f4 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -558,6 +558,7 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_PPC_PAPR 68
 #define KVM_CAP_SW_TLB 69
 #define KVM_CAP_ONE_REG 70
+#define KVM_CAP_TSC_DEADLINE_TIMER 72

 #ifdef KVM_CAP_IRQ_ROUTING

--
1.6.5.6
[PATCH 2/2] Expose tsc deadline timer cpuid to guest
From 3a78adf8006ec6189bfe2f55f7ae213e75bf3815 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong@intel.com>
Date: Thu, 29 Dec 2011 05:28:12 +0800
Subject: [PATCH 2/2] Expose tsc deadline timer cpuid to guest

Depend on several factors:
1. Considering live migration, user enable/disable tsc deadline timer;
2. If guest use kvm apic and kvm emulate tsc deadline timer, expose it;
3. If in the future qemu support tsc deadline timer emulation,
   and guest use qemu apic, add cpuid exposing case then.

Signed-off-by: Liu, Jinsong <jinsong@intel.com>
---
 target-i386/cpu.h   |    2 ++
 target-i386/cpuid.c |    7 ++++++-
 target-i386/kvm.c   |   13 +++++++++++++
 3 files changed, 21 insertions(+), 1 deletions(-)

diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 177d8aa..f2d0ad5 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -399,6 +399,7 @@
 #define CPUID_EXT_X2APIC   (1 << 21)
 #define CPUID_EXT_MOVBE    (1 << 22)
 #define CPUID_EXT_POPCNT   (1 << 23)
+#define CPUID_EXT_TSC_DEADLINE_TIMER (1 << 24)
 #define CPUID_EXT_XSAVE    (1 << 26)
 #define CPUID_EXT_OSXSAVE  (1 << 27)
 #define CPUID_EXT_HYPERVISOR  (1 << 31)
@@ -693,6 +694,7 @@ typedef struct CPUX86State {

     uint64_t tsc;
     uint64_t tsc_deadline;
+    bool tsc_deadline_timer_enabled;

     uint64_t mcg_status;
     uint64_t msr_ia32_misc_enable;
diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 0b3af90..fe749e0 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -48,7 +48,7 @@ static const char *ext_feature_name[] = {
     "fma", "cx16", "xtpr", "pdcm",
     NULL, NULL, "dca", "sse4.1|sse4_1",
     "sse4.2|sse4_2", "x2apic", "movbe", "popcnt",
-    NULL, "aes", "xsave", "osxsave",
+    "tsc_deadline", "aes", "xsave", "osxsave",
     "avx", NULL, NULL, "hypervisor",
 };
 static const char *ext2_feature_name[] = {
@@ -225,6 +225,7 @@ typedef struct x86_def_t {
     int model;
     int stepping;
     int tsc_khz;
+    bool tsc_deadline_timer_enabled;
     uint32_t features, ext_features, ext2_features, ext3_features;
     uint32_t kvm_features, svm_features;
     uint32_t xlevel;
@@ -742,6 +743,9 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model)
     x86_cpu_def->ext3_features &= ~minus_ext3_features;
     x86_cpu_def->kvm_features &= ~minus_kvm_features;
     x86_cpu_def->svm_features &= ~minus_svm_features;
+    /* Defaultly user don't against tsc_deadline_timer */
+    x86_cpu_def->tsc_deadline_timer_enabled =
+        !(minus_ext_features & CPUID_EXT_TSC_DEADLINE_TIMER);
     if (check_cpuid) {
         if (check_features_against_host(x86_cpu_def) && enforce_cpuid)
             goto error;
@@ -885,6 +889,7 @@ int cpu_x86_register (CPUX86State *env, const char *cpu_model)
     env->cpuid_ext4_features = def->ext4_features;
     env->cpuid_xlevel2 = def->xlevel2;
     env->tsc_khz = def->tsc_khz;
+    env->tsc_deadline_timer_enabled = def->tsc_deadline_timer_enabled;
     if (!kvm_enabled()) {
         env->cpuid_features &= TCG_FEATURES;
         env->cpuid_ext_features &= TCG_EXT_FEATURES;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index d50de90..79baf0b 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -370,6 +370,19 @@ int kvm_arch_init_vcpu(CPUState *env)
     i = env->cpuid_ext_features & CPUID_EXT_HYPERVISOR;
     env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX);
     env->cpuid_ext_features |= i;
+    /*
+     * 1. Considering live migration, user enable/disable tsc deadline timer;
+     * 2. If guest use kvm apic and kvm emulate tsc deadline timer, expose it;
+     * 3. If in the future qemu support tsc deadline timer emulation,
+     *    and guest use qemu apic, add cpuid exposing case then.
+     */
+    env->cpuid_ext_features &= ~CPUID_EXT_TSC_DEADLINE_TIMER;
+    if (env->tsc_deadline_timer_enabled) {
+        if (kvm_irqchip_in_kernel() &&
+            kvm_check_extension(s, KVM_CAP_TSC_DEADLINE_TIMER)) {
+            env->cpuid_ext_features |= CPUID_EXT_TSC_DEADLINE_TIMER;
+        }
+    }

     env->cpuid_ext2_features &= kvm_arch_get_supported_cpuid(s, 0x80000001,
                                                              0, R_EDX);
--
1.6.5.6
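The three conditions in the kvm.c hunk above reduce to a small predicate. The following is only a minimal sketch: `expose_tsc_deadline()` is a hypothetical helper, not a QEMU function, and its three boolean arguments stand in for the user's cpu feature flag, the result of `kvm_irqchip_in_kernel()`, and the result of `kvm_check_extension(s, KVM_CAP_TSC_DEADLINE_TIMER)`.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 24, as defined in the cpu.h hunk of the patch. */
#define CPUID_EXT_TSC_DEADLINE_TIMER (1u << 24)

/*
 * Hypothetical helper (not part of QEMU): clear the TSC deadline bit
 * unconditionally, then set it again only if all three conditions hold.
 */
uint32_t expose_tsc_deadline(uint32_t ext_features,
                             bool user_enabled,
                             bool irqchip_in_kernel,
                             bool kvm_has_cap)
{
    ext_features &= ~CPUID_EXT_TSC_DEADLINE_TIMER;
    if (user_enabled && irqchip_in_kernel && kvm_has_cap) {
        ext_features |= CPUID_EXT_TSC_DEADLINE_TIMER;
    }
    return ext_features;
}
```

Clearing the bit first means the result never depends on whatever the kernel happened to report in the supported-cpuid mask; only the three explicit inputs decide.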
RE: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
Avi Kivity wrote:
> On 12/22/2011 05:41 PM, Liu, Jinsong wrote:
>> Avi Kivity wrote:
>>> On 12/21/2011 12:25 PM, Jan Kiszka wrote:
>>>> We must not report the TSC deadline timer feature on our own when user space provides the APIC as we have no clue about its features.
>>>
>>> We must not report the TSC deadline timer feature on our own, period. We should just update the timer mode mask there. Don't know how this slipped through review.
>>>
>>> I think your original idea was correct. Add a new KVM_CAP for the tsc deadline timer. Userspace can add the bit to cpuid if either it implements the feature in a userspace apic, or if it finds the new capability and uses the kernel apic.
>>
>> Is it necessary to use KVM_CAP? If I didn't misunderstand, the KVM_CAP solution would be:
>> 1. qemu get kvm tsc deadline timer capability by KVM_CAP_...;
>> 2. qemu add cpuid bit if ((guest use qemu apic && qemu emulate tsc deadline timer) || (guest use kvm apic && kvm emulate tsc deadline timer (KVM_CAP)))
>> 3. qemu ioctl KVM_SET_CPUID2
>> 4. kvm expose the feature to guest by saving it at vcpu->arch.cpuid_entries,
>
> Correct.
>
>> seems it's logically redundant.
>
> What's logically redundant?
>
>> Jan's patch v2 is a straight forward and simple fix. in the patch
>>     if (apic) { ... }
>> means apic (and then its sub-logic tsc deadline timer) emulated by kvm, that's enough:
>> if guest use kvm apic, it's OK to add cpuid bit and expose to guest;
>> if guest don't use kvm apic, it will not touch cpuid bit;
>
> It breaks live migration: if you start a guest on a TSC-deadline capable host kernel, and migrate it to a TSC-deadline incapable host kernel, you end up with a broken guest.
>
> More broadly, kvm never exposes features transparently to the guest, it always passes them to userspace first, so userspace controls the ABI exposed to the guest. This prevents the following scenario:

Do you mean that, by letting qemu control cpuid exposure, a broken live migration can be avoided as follows:
1. the user probes the least capable host in the whole pool where the vm may be live migrated;
2. only if the least capable host supports the feature can the user enable the feature when booting a vm;
3. if the least capable host does not support the feature (the tsc deadline timer, for example), the user disables the feature when booting a vm.
In this way, live migration wouldn't be broken. Right?

Or do you mean qemu-kvm solves the live migration issue by some other method?

> - a guest is started on some hardware, which doesn't support some cpuid feature (say AVX for example)
> - the guest or one of its applications are broken wrt AVX, but because the feature is not exposed, it works correctly
> - the host hardware is upgraded to one which supports AVX
> - the guest is now broken

You mean, live migration from an 'old' platform (which doesn't support the feature) to a 'new' platform would break?

Thanks,
Jinsong
RE: [PATCH] KVM: Don't automatically expose the TSC deadline timer in cpuid
Avi Kivity wrote:
> From: Jan Kiszka <jan.kis...@siemens.com>
>
> Unlike all of the other cpuid bits, the TSC deadline timer bit is set
> unconditionally, regardless of what userspace wants.
>
> This is broken in several ways:
>  - if userspace doesn't use KVM_CREATE_IRQCHIP, and doesn't emulate the TSC
>    deadline timer feature, a guest that uses the feature will break
>  - live migration to older host kernels that don't support the TSC deadline
>    timer will cause the feature to be pulled from under the guest's feet;
>    breaking it
>  - guests that are broken wrt the feature will fail.
>
> Fix by not enabling the feature automatically; instead report it to userspace.
> Because the feature depends on KVM_CREATE_IRQCHIP, which we cannot guarantee
> will be called, we expose it via a KVM_CAP_TSC_DEADLINE_TIMER and not
> KVM_GET_SUPPORTED_CPUID.
>
> Fixes the Illumos guest kernel, which uses the TSC deadline timer feature.
>
> [avi: add the KVM_CAP + documentation]
>
> Reported-by: Alexey Zaytsev <alexey.zayt...@gmail.com>
> Signed-off-by: Avi Kivity <a...@redhat.com>
> ---
>
> As we're running out of time and everyone's checking their socks instead of
> inboxes I've added the missing parts myself. Jan, if you accidentally see
> this, please review and add your signoff.
>
>  Documentation/virtual/kvm/api.txt |    9 +++++++++
>  arch/x86/kvm/cpuid.c              |   16 ++++++----------
>  arch/x86/kvm/x86.c                |    3 +++
>  include/linux/kvm.h               |    1 +
>  4 files changed, 19 insertions(+), 10 deletions(-)
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 5b03eee..da1f8fd 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -1100,6 +1100,15 @@ emulate them efficiently. The fields in each entry are defined as follows:
>    eax, ebx, ecx, edx: the values returned by the cpuid instruction for
>           this function/index combination
>
> +The TSC deadline timer feature (CPUID leaf 1, ecx[24]) is always returned
> +as false, since the feature depends on KVM_CREATE_IRQCHIP for local APIC
> +support. Instead it is reported via
> +
> +  ioctl(KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER)
> +
> +if that returns true you use KVM_CREATE_IRQCHIP, or if emulate the
> +feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
> +
>  4.47 KVM_PPC_GET_PVINFO
>
>  Capability: KVM_CAP_PPC_GET_PVINFO
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 230f713..89b02bf 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -27,7 +27,6 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_cpuid_entry2 *best;
>  	struct kvm_lapic *apic = vcpu->arch.apic;
> -	u32 timer_mode_mask;
>
>  	best = kvm_find_cpuid_entry(vcpu, 1, 0);
>  	if (!best)
> @@ -40,15 +39,12 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
>  		best->ecx |= bit(X86_FEATURE_OSXSAVE);
>  	}
>
> -	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
> -		best->function == 0x1) {
> -		best->ecx |= bit(X86_FEATURE_TSC_DEADLINE_TIMER);
> -		timer_mode_mask = 3 << 17;
> -	} else
> -		timer_mode_mask = 1 << 17;
> -
> -	if (apic)
> -		apic->lapic_timer.timer_mode_mask = timer_mode_mask;
> +	if (apic) {
> +		if (best->ecx & bit(X86_FEATURE_TSC_DEADLINE_TIMER))
> +			apic->lapic_timer.timer_mode_mask = 3 << 17;
> +		else
> +			apic->lapic_timer.timer_mode_mask = 1 << 17;
> +	}
>
>  	kvm_pmu_cpuid_update(vcpu);
>  }
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index df23dff..1171def 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2089,6 +2089,9 @@ int kvm_dev_ioctl_check_extension(long ext)
>  	case KVM_CAP_TSC_CONTROL:
>  		r = kvm_has_tsc_control;
>  		break;
> +	case KVM_CAP_TSC_DEADLINE_TIMER:
> +		r = boot_cpu_has(X86_FEATURE_TSC_DEADLINE_TIMER);
> +		break;

kvm tsc deadline timer is pure software emulated, not depend on host physically.

Thanks,
Jinsong

>  	default:
>  		r = 0;
>  		break;
> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> index c3892fc..68e67e5 100644
> --- a/include/linux/kvm.h
> +++ b/include/linux/kvm.h
> @@ -557,6 +557,7 @@ struct kvm_ppc_pvinfo {
>  #define KVM_CAP_MAX_VCPUS 66       /* returns max vcpus per vm */
>  #define KVM_CAP_PPC_PAPR 68
>  #define KVM_CAP_S390_GMAP 71
> +#define KVM_CAP_TSC_DEADLINE_TIMER 72
>
>  #ifdef KVM_CAP_IRQ_ROUTING
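For readers following the api.txt hunk above, the userspace side of the probe can be sketched as follows. This is an illustration only, assuming a Linux host with the kernel UAPI headers installed; `kvm_probe_tsc_deadline()` is a hypothetical name, and the fallback `#define` simply repeats the value 72 from the kvm.h hunk in case the installed header predates it.

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

#ifndef KVM_CAP_TSC_DEADLINE_TIMER
#define KVM_CAP_TSC_DEADLINE_TIMER 72   /* value from the kvm.h hunk above */
#endif

/*
 * Hypothetical helper: query the capability on an already opened
 * /dev/kvm file descriptor.
 * Returns 1 if the capability is reported, 0 if not,
 * and -1 if the ioctl itself failed (for example, a bad descriptor).
 */
int kvm_probe_tsc_deadline(int kvm_fd)
{
    int r = ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER);
    if (r < 0) {
        return -1;
    }
    return r > 0;
}
```

A caller would typically obtain `kvm_fd` from `open("/dev/kvm", O_RDWR)` and only then decide whether to set CPUID.01H:ECX[24] in the table passed to KVM_SET_CPUID2.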
RE: [PATCH] KVM: Don't automatically expose the TSC deadline timer in cpuid
Sasha Levin wrote:
> On Sun, 2011-12-25 at 21:00 +0200, Sasha Levin wrote:
>> On Sun, 2011-12-25 at 15:03 +0200, Avi Kivity wrote:
>>> +	if (apic) {
>>> +		if (best->ecx & bit(X86_FEATURE_TSC_DEADLINE_TIMER))
>>> +			apic->lapic_timer.timer_mode_mask = 3 << 17;
>>> +		else
>>> +			apic->lapic_timer.timer_mode_mask = 1 << 17;
>>> +	}
>>
>> Can we change these to be:
>>
>>     if(...)
>>         apic->lapic_timer.timer_mode_mask = APIC_LVT_TIMER_PERIODIC | APIC_LVT_TIMER_TSCDEADLINE;
>>     else
>>         apic->lapic_timer.timer_mode_mask = APIC_LVT_TIMER_PERIODIC;
>
> Actually,
>
>     apic->lapic_timer.timer_mode_mask = APIC_LVT_TIMER_PERIODIC;
>     if(...)
>         apic->lapic_timer.timer_mode_mask |= APIC_LVT_TIMER_TSCDEADLINE;

Is that good semantically? APIC_LVT_TIMER_PERIODIC and APIC_LVT_TIMER_TSCDEADLINE are timer modes, whereas timer_mode_mask is a mask over the timer mode field.

Thanks,
Jinsong
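The distinction raised here, mode values versus the mask that selects them, can be shown with a small sketch. It assumes the usual layout of the LVT timer register, where bits 18:17 hold the mode (00 one-shot, 01 periodic, 10 TSC-deadline); the `LVT_TIMER_*` names and both helpers are local to this example and are not kernel code.

```c
#include <stdint.h>

/* Timer mode field of the APIC LVT timer register, bits 18:17. */
#define LVT_TIMER_ONESHOT     (0u << 17)
#define LVT_TIMER_PERIODIC    (1u << 17)
#define LVT_TIMER_TSCDEADLINE (2u << 17)

/* Hypothetical helper: the mask of mode bits the guest may use. */
uint32_t timer_mode_mask(int has_tsc_deadline)
{
    return has_tsc_deadline ? (3u << 17) : (1u << 17);
}

/* Extract the mode that takes effect from a guest-written LVT value. */
uint32_t effective_timer_mode(uint32_t lvt, int has_tsc_deadline)
{
    return lvt & timer_mode_mask(has_tsc_deadline);
}
```

Numerically, OR-ing the periodic and TSC-deadline mode values does give `3 << 17`, so both spellings produce the same mask; the question in the mail is only whether building a mask out of mode constants reads correctly. Without the feature, a guest write of the TSC-deadline mode is masked down to one-shot.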
RE: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
Avi Kivity wrote:
> On 12/26/2011 10:11 AM, Liu, Jinsong wrote:
>>> It breaks live migration: if you start a guest on a TSC-deadline capable host kernel, and migrate it to a TSC-deadline incapable host kernel, you end up with a broken guest.
>>>
>>> More broadly, kvm never exposes features transparently to the guest, it always passes them to userspace first, so userspace controls the ABI exposed to the guest. This prevents the following scenario:
>>
>> Do you mean, by the method qemu control cpuid exposing, it can avoid live migration broken issue by
>> 1. user probe the lowest ability host of whole pool where vm may live migrate;
>> 2. only if the lowest ability host support the feature can user enable the feature when boot a vm;
>> 3. if the lowest ability host didn't support the feature (say tsc deadline timer as example), user disable the feature when boot a vm;
>> In this way, live migration wouldn't be broken. Right?
>
> Right.

Thanks, Avi, for your detailed explanation; it fixes a long-standing misunderstanding I had about live migration.

Best Regards,
Jinsong

>> or, do you mean qemu-kvm solve live migration broken issue by some other method?
>
> The method you outlined, or any other method, such as partitioning the cluster according to hardware capabilities.
>
>>> - a guest is started on some hardware, which doesn't support some cpuid feature (say AVX for example)
>>> - the guest or one of its applications are broken wrt AVX, but because the feature is not exposed, it works correctly
>>> - the host hardware is upgraded to one which supports AVX
>>> - the guest is now broken
>>
>> You mean, live migrate from 'old' (which doesn't support the feature) platform to 'new' platform would broken?
>
> Live migration, or even just a guest restart on updated hardware.
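The "lowest ability host" rule confirmed above amounts to intersecting the feature words of every host in the migration pool. A minimal sketch, assuming each host's CPUID.01H:ECX word has already been collected by some management tool; `pool_common_features()` is a hypothetical helper and not part of qemu or libvirt.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: AND together the CPUID.01H:ECX feature words
 * reported by every host in a migration pool. A bit survives only if
 * all hosts support it. An empty pool yields 0 (expose nothing).
 */
uint32_t pool_common_features(const uint32_t *host_ecx, size_t n_hosts)
{
    if (n_hosts == 0) {
        return 0;
    }
    uint32_t common = host_ecx[0];
    for (size_t i = 1; i < n_hosts; i++) {
        common &= host_ecx[i];
    }
    return common;
}
```

For example, with three hosts where only two report bit 24 (TSC deadline) but all three report bit 28 (AVX), the result keeps bit 28 and drops bit 24, so the guest would be booted with the TSC deadline timer disabled.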
[PATCH] Expose tsc deadline timer cpuid to guest
From 19caf1db1f93e6f6b736e1dfd5e91a0c7669adec Mon Sep 17 00:00:00 2001 From: Liu Jinsong jinsong@intel.com Date: Tue, 27 Dec 2011 04:08:27 +0800 Subject: [PATCH] Expose tsc deadline timer cpuid to guest Depend on several factors: 1. Considering live migration, user enable/disable tsc deadline timer; 2. If guest use kvm apic and kvm emulate tsc deadline timer, expose it; 3. If in the future qemu support tsc deadline timer emulation, and guest use qemu apic, add cpuid exposing case then. Signed-off-by: Liu, Jinsong jinsong@intel.com --- linux-headers/linux/kvm.h |1 + qemu-kvm.h|1 + qemu-options.hx |3 +++ target-i386/cpu.h |1 + target-i386/kvm.c | 12 vl.c |4 6 files changed, 22 insertions(+), 0 deletions(-) diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h index a8761d3..1d3a4f4 100644 --- a/linux-headers/linux/kvm.h +++ b/linux-headers/linux/kvm.h @@ -558,6 +558,7 @@ struct kvm_ppc_pvinfo { #define KVM_CAP_PPC_PAPR 68 #define KVM_CAP_SW_TLB 69 #define KVM_CAP_ONE_REG 70 +#define KVM_CAP_TSC_DEADLINE_TIMER 72 #ifdef KVM_CAP_IRQ_ROUTING diff --git a/qemu-kvm.h b/qemu-kvm.h index 2bd5602..8c6c2ea 100644 --- a/qemu-kvm.h +++ b/qemu-kvm.h @@ -260,6 +260,7 @@ extern int kvm_irqchip; extern int kvm_pit; extern int kvm_pit_reinject; extern unsigned int kvm_shadow_memory; +extern int tsc_deadline_timer; int kvm_handle_tpr_access(CPUState *env); void kvm_tpr_enable_vapic(CPUState *env); diff --git a/qemu-options.hx b/qemu-options.hx index f6df6b9..eff6644 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -2619,6 +2619,9 @@ DEF(no-kvm-pit-reinjection, 0, QEMU_OPTION_no_kvm_pit_reinjection, -no-kvm-pit-reinjection\n disable KVM kernel mode PIT interrupt reinjection\n, QEMU_ARCH_I386) +DEF(no-tsc-deadline-timer, 0, QEMU_OPTION_no_tsc_deadline_timer, +-no-tsc-deadline-timer disable tsc deadline timer\n, +QEMU_ARCH_I386) DEF(tdf, 0, QEMU_OPTION_tdf, -tdfenable guest time drift compensation\n, QEMU_ARCH_ALL) DEF(kvm-shadow-memory, HAS_ARG, 
QEMU_OPTION_kvm_shadow_memory, diff --git a/target-i386/cpu.h b/target-i386/cpu.h index 177d8aa..767d2eb 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -399,6 +399,7 @@ #define CPUID_EXT_X2APIC (1 21) #define CPUID_EXT_MOVBE(1 22) #define CPUID_EXT_POPCNT (1 23) +#define CPUID_EXT_TSC_DEADLINE_TIMER (1 24) #define CPUID_EXT_XSAVE(1 26) #define CPUID_EXT_OSXSAVE (1 27) #define CPUID_EXT_HYPERVISOR (1 31) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index d50de90..740184d 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -370,6 +370,18 @@ int kvm_arch_init_vcpu(CPUState *env) i = env-cpuid_ext_features CPUID_EXT_HYPERVISOR; env-cpuid_ext_features = kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX); env-cpuid_ext_features |= i; +/* + * 1. Considering live migration, user enable/disable tsc deadline timer; + * 2. If guest use kvm apic and kvm emulate tsc deadline timer, expose it; + * 3. If in the future qemu support tsc deadline timer emulation, + *and guest use qemu apic, add cpuid exposing case then. + */ +env-cpuid_ext_features = ~CPUID_EXT_TSC_DEADLINE_TIMER; +if (tsc_deadline_timer) { +if (kvm_irqchip_in_kernel() +kvm_check_extension(s, KVM_CAP_TSC_DEADLINE_TIMER)) +env-cpuid_ext_features |= CPUID_EXT_TSC_DEADLINE_TIMER; +} env-cpuid_ext2_features = kvm_arch_get_supported_cpuid(s, 0x8001, 0, R_EDX); diff --git a/vl.c b/vl.c index 89a618b..583de58 100644 --- a/vl.c +++ b/vl.c @@ -2160,6 +2160,7 @@ int kvm_irqchip = 1; int kvm_pit = 1; int kvm_pit_reinject = 1; #endif +int tsc_deadline_timer = 1; int main(int argc, char **argv, char **envp) { @@ -2875,6 +2876,9 @@ int main(int argc, char **argv, char **envp) break; } #endif +case QEMU_OPTION_no_tsc_deadline_timer: +tsc_deadline_timer = 0; +break; case QEMU_OPTION_usb: usb_enabled = 1; break; -- 1.6.5.6 0001-Expose-tsc-deadline-timer-cpuid-to-guest.patch Description: 0001-Expose-tsc-deadline-timer-cpuid-to-guest.patch
[PATCH] X86: expose latest Intel cpu new features to guest
From 8bb5d052825149c211afa92458912bc49a50ee2f Mon Sep 17 00:00:00 2001
From: Liu, Jinsong <jinsong@intel.com>
Date: Mon, 28 Nov 2011 03:55:19 -0800
Subject: [PATCH] X86: expose latest Intel cpu new features to guest

Intel latest cpu add 6 new features, refer http://software.intel.com/file/36945
The new feature cpuid listed as below:

1. FMA    CPUID.EAX=01H:ECX.FMA[bit 12]
2. MOVBE  CPUID.EAX=01H:ECX.MOVBE[bit 22]
3. BMI1   CPUID.EAX=07H,ECX=0H:EBX.BMI1[bit 3]
4. AVX2   CPUID.EAX=07H,ECX=0H:EBX.AVX2[bit 5]
5. BMI2   CPUID.EAX=07H,ECX=0H:EBX.BMI2[bit 8]
6. LZCNT  CPUID.EAX=80000001H:ECX.LZCNT[bit 5]

This patch expose these features to guest.
Among them, FMA/MOVBE/LZCNT has already been defined, MOVBE/LZCNT has
already been exposed.

This patch defines BMI1/AVX2/BMI2, and exposes FMA/BMI1/AVX2/BMI2 to guest.

Signed-off-by: Liu, Jinsong <jinsong@intel.com>
---
 arch/x86/include/asm/cpufeature.h |    3 +++
 arch/x86/kvm/x86.c                |    4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index f3444f7..17c5d4b 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -197,7 +197,10 @@

 /* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */
 #define X86_FEATURE_FSGSBASE	(9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/
+#define X86_FEATURE_BMI1	(9*32+ 3) /* 1st group bit manipulation extensions */
+#define X86_FEATURE_AVX2	(9*32+ 5) /* AVX2 instructions */
 #define X86_FEATURE_SMEP	(9*32+ 7) /* Supervisor Mode Execution Protection */
+#define X86_FEATURE_BMI2	(9*32+ 8) /* 2nd group bit manipulation extensions */
 #define X86_FEATURE_ERMS	(9*32+ 9) /* Enhanced REP MOVSB/STOSB */

 #if defined(__KERNEL__) && !defined(__ASSEMBLY__)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1985ea1..22255bc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2443,7 +2443,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		F(XMM3) | F(PCLMULQDQ) | 0 /* DTES64, MONITOR */ |
 		0 /* DS-CPL, VMX, SMX, EST */ |
 		0 /* TM2 */ | F(SSSE3) | 0 /* CNXT-ID */ | 0 /* Reserved */ |
-		0 /* Reserved */ | F(CX16) | 0 /* xTPR Update, PDCM */ |
+		F(FMA) | F(CX16) | 0 /* xTPR Update, PDCM */ |
 		0 /* Reserved, DCA */ | F(XMM4_1) |
 		F(XMM4_2) | F(X2APIC) | F(MOVBE) | F(POPCNT) |
 		0 /* Reserved*/ | F(AES) | F(XSAVE) | 0 /* OSXSAVE */ | F(AVX) |
@@ -2463,7 +2463,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	/* cpuid 7.0.ebx */
 	const u32 kvm_supported_word9_x86_features =
-		F(SMEP) | F(FSGSBASE) | F(ERMS);
+		F(FSGSBASE) | F(BMI1) | F(AVX2) | F(SMEP) | F(BMI2) | F(ERMS);

 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();
--
1.5.6
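The `(9*32+ N)` constants in the cpufeature.h hunk pack a capability word index and a bit position into one number. The sketch below shows how such a value splits back into its two parts; `feature_word()` and `feature_mask()` are hypothetical helpers written for this note, only similar in spirit to the `F()` macro used in the x86.c hunk.

```c
#include <stdint.h>

/* Same encoding as the cpufeature.h hunk above: word * 32 + bit. */
#define X86_FEATURE_BMI1 (9 * 32 + 3)
#define X86_FEATURE_AVX2 (9 * 32 + 5)
#define X86_FEATURE_BMI2 (9 * 32 + 8)

/* Word index in the capability array (word 9 is CPUID.07H:EBX). */
unsigned feature_word(unsigned feature)
{
    return feature / 32;
}

/* Single-bit mask for the feature within its 32-bit word. */
uint32_t feature_mask(unsigned feature)
{
    return 1u << (feature % 32);
}
```

OR-ing the three masks gives 0x128, i.e. bits 3, 5 and 8 of CPUID.(EAX=07H,ECX=0):EBX, which are the bits this patch adds to the supported word 9 features.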
RE: [PATCH] apic: test tsc deadline timer
Avi Kivity wrote:
> On 10/09/2011 05:32 PM, Liu, Jinsong wrote:
>> Updated test case for kvm tsc deadline timer https://github.com/avikivity/kvm-unit-tests, as attached.
>
> Applied, thanks.

Which tree? I didn't find it at git://github.com/avikivity/kvm-unit-tests.git

Thanks,
Jinsong
RE: [QEMU PATCH] kvm: support TSC deadline MSR with subsection
Marcelo Tosatti wrote:
> On Wed, Oct 12, 2011 at 12:26:12PM +0800, Liu, Jinsong wrote:
>> Marcelo,
>>
>> I just tested guest migration from v13 to v12; it failed with
>>
>> savevm: unsupported version 13 for 'cpu' v12
>> load of migration failed
>>
>> v13 is the new qemu-kvm with the TSC deadline timer co-work patch; v12 is the old qemu-kvm.
>
> You should try the patch in the first message in this thread, which is a
> replacement for the original tsc deadline timer patch.

Sorry, I didn't notice your modification.
I just tested the modified version; it worked OK when migrating from the new qemu (with the TSC deadline timer patch) to the old qemu.

Thanks,
Jinsong

====================

From: Liu, Jinsong jinsong@intel.com

KVM adds emulation of the LAPIC TSC deadline timer for guests.
This patch is the corresponding work on the QEMU side.

Use subsections to save/restore the field (mtosatti).

Signed-off-by: Liu, Jinsong jinsong@intel.com

diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index ae36489..29412dc 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -283,6 +283,7 @@
 #define MSR_IA32_APICBASE_BSP           (1<<8)
 #define MSR_IA32_APICBASE_ENABLE        (1<<11)
 #define MSR_IA32_APICBASE_BASE          (0xfffff<<12)
+#define MSR_IA32_TSCDEADLINE            0x6e0
 
 #define MSR_MTRRcap                     0xfe
 #define MSR_MTRRcap_VCNT                8
@@ -687,6 +688,7 @@ typedef struct CPUX86State {
     uint64_t async_pf_en_msr;
 
     uint64_t tsc;
+    uint64_t tsc_deadline;
 
     uint64_t mcg_status;
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index b6eef04..90a6ffb 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -59,6 +59,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
 
 static bool has_msr_star;
 static bool has_msr_hsave_pa;
+static bool has_msr_tsc_deadline;
 static bool has_msr_async_pf_en;
 static int lm_capable_kernel;
 
@@ -568,6 +569,10 @@ static int kvm_get_supported_msrs(KVMState *s)
                     has_msr_hsave_pa = true;
                     continue;
                 }
+                if (kvm_msr_list->indices[i] == MSR_IA32_TSCDEADLINE) {
+                    has_msr_tsc_deadline = true;
+                    continue;
+                }
             }
         }
 
@@ -881,6 +886,9 @@ static int kvm_put_msrs(CPUState *env, int level)
     if (has_msr_hsave_pa) {
         kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave);
     }
+    if (has_msr_tsc_deadline) {
+        kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSCDEADLINE, env->tsc_deadline);
+    }
 #ifdef TARGET_X86_64
     if (lm_capable_kernel) {
         kvm_msr_entry_set(&msrs[n++], MSR_CSTAR, env->cstar);
@@ -1127,6 +1135,9 @@ static int kvm_get_msrs(CPUState *env)
     if (has_msr_hsave_pa) {
         msrs[n++].index = MSR_VM_HSAVE_PA;
     }
+    if (has_msr_tsc_deadline) {
+        msrs[n++].index = MSR_IA32_TSCDEADLINE;
+    }
 
     if (!env->tsc_valid) {
         msrs[n++].index = MSR_IA32_TSC;
@@ -1195,6 +1206,9 @@ static int kvm_get_msrs(CPUState *env)
         case MSR_IA32_TSC:
             env->tsc = msrs[i].data;
             break;
+        case MSR_IA32_TSCDEADLINE:
+            env->tsc_deadline = msrs[i].data;
+            break;
         case MSR_VM_HSAVE_PA:
             env->vm_hsave = msrs[i].data;
             break;
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 9aca8e0..176d372 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -310,6 +310,24 @@ static const VMStateDescription vmstate_fpop_ip_dp = {
     }
 };
 
+static bool tscdeadline_needed(void *opaque)
+{
+    CPUState *env = opaque;
+
+    return env->tsc_deadline != 0;
+}
+
+static const VMStateDescription vmstate_msr_tscdeadline = {
+    .name = "cpu/msr_tscdeadline",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields = (VMStateField []) {
+        VMSTATE_UINT64(tsc_deadline, CPUState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_cpu = {
     .name = "cpu",
     .version_id = CPU_SAVE_VERSION,
@@ -420,6 +438,9 @@ static const VMStateDescription vmstate_cpu = {
         } , {
             .vmsd = &vmstate_fpop_ip_dp,
             .needed = fpop_ip_dp_needed,
+        }, {
+            .vmsd = &vmstate_msr_tscdeadline,
+            .needed = tscdeadline_needed,
         } , {
             /* empty */
         }
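The kvm.c part of the patch follows a probe-then-use pattern: the MSR index list reported by the kernel is scanned once, a flag records whether MSR_IA32_TSCDEADLINE is present, and the MSR is added to get/set requests only when that flag is set. A minimal standalone sketch of that pattern is below. The types `msr_list` and `msr_entry` and the helpers `msr_supported()` and `put_tsc_deadline()` are hypothetical stand-ins for illustration, not QEMU or KVM APIs; only the MSR number 0x6e0 comes from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_IA32_TSC         0x10
#define MSR_IA32_TSCDEADLINE 0x6e0   /* value taken from the patch */

/* Hypothetical stand-in for the MSR index list reported by the kernel. */
struct msr_list {
    size_t nmsrs;
    const uint32_t *indices;
};

/* Hypothetical stand-in for one entry of an MSR set request. */
struct msr_entry {
    uint32_t index;
    uint64_t data;
};

/* Probe step: returns true if the host advertises the given MSR. */
static bool msr_supported(const struct msr_list *list, uint32_t index)
{
    for (size_t i = 0; i < list->nmsrs; i++) {
        if (list->indices[i] == index) {
            return true;
        }
    }
    return false;
}

/*
 * Use step: append the TSC deadline MSR to a request only when the probe
 * succeeded. Returns the new number of entries in the request.
 */
static size_t put_tsc_deadline(struct msr_entry *msrs, size_t n, size_t cap,
                               bool has_tsc_deadline, uint64_t tsc_deadline)
{
    if (!has_tsc_deadline) {
        return n;               /* host does not know the MSR: skip it */
    }
    assert(n < cap);            /* caller must leave room for one entry */
    msrs[n].index = MSR_IA32_TSCDEADLINE;
    msrs[n].data = tsc_deadline;
    return n + 1;
}
```

On a host whose list lacks 0x6e0 the request is left unchanged, so, under these assumptions, the same code can run on kernels with and without TSC deadline emulation.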
RE: [QEMU PATCH] kvm: support TSC deadline MSR with subsection
Marcelo,

I just tested guest migration from v13 to v12; it failed with

savevm: unsupported version 13 for 'cpu' v12
load of migration failed

v13 is the new qemu-kvm with the TSC deadline timer co-work patch; v12 is the old qemu-kvm.

Marcelo Tosatti wrote:
> Jinsong, please test this qemu-kvm patch by migrating a guest which is
> currently using the TSC deadline timer. Using subsections avoids breaking
> migration to older qemu versions when the guest does not make use of
> the TSC deadline feature.

Is the subsection used to avoid breaking migration to older qemu?

Thanks,
Jinsong
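The difference between bumping the 'cpu' section version and using a subsection can be sketched with a toy model. Everything here is a simplification for illustration: `cpu_state`, `saved_cpu`, `save_cpu()` and `old_load_cpu()` are hypothetical and do not reflect QEMU's actual savevm format; only the "send the field if it is non-zero" predicate mirrors the tscdeadline_needed() idea discussed in this thread. The sketch assumes an old loader rejects both a newer main-section version and any subsection it does not know.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified CPU state; only tsc_deadline matters here. */
struct cpu_state {
    uint64_t tsc;
    uint64_t tsc_deadline;
};

/* Hypothetical description of what a saved 'cpu' section contains. */
struct saved_cpu {
    int version;            /* version of the main section */
    bool has_tscdeadline;   /* is the optional subsection present? */
    uint64_t tsc;
    uint64_t tsc_deadline;
};

/* Same idea as the patch: only a non-zero deadline is worth sending. */
static bool tscdeadline_needed(const struct cpu_state *env)
{
    return env->tsc_deadline != 0;
}

/*
 * Save with a caller-chosen main-section version. The new field never goes
 * into the main section; it travels in an optional subsection instead.
 */
static struct saved_cpu save_cpu(const struct cpu_state *env, int version)
{
    struct saved_cpu out = { .version = version, .tsc = env->tsc };

    if (tscdeadline_needed(env)) {
        out.has_tscdeadline = true;
        out.tsc_deadline = env->tsc_deadline;
    }
    return out;
}

/* An "old" loader: accepts versions up to max_version, knows no subsections. */
static bool old_load_cpu(const struct saved_cpu *in, int max_version)
{
    if (in->version > max_version) {
        return false;   /* the "unsupported version 13 for 'cpu' v12" case */
    }
    if (in->has_tscdeadline) {
        return false;   /* unknown subsection: state cannot be restored */
    }
    return true;
}
```

In this model a guest that never armed the deadline timer saves at version 12 with no subsection, so the old loader accepts it. A guest that did arm the timer still cannot be loaded by the old side, which matches the "when the guest does not make use of TSC deadline feature" qualifier above.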