Re: [PATCH] KVM: x86: nVMX: support for MSR loading/storing
GCC doesn't warn that ((u32)e->index >> 24) == 0x800 is always false? I think SDM says '(e->index >> 8) == 0x8'.

Missed that. Thank you.

--
Eugene

--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
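A minimal sketch of why the comparison questioned above can never be true, next to the form attributed to the SDM in the same message. The function names are hypothetical and are not taken from the patch under review.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Questioned form: shifting a 32-bit value right by 24 leaves only the
 * top 8 bits, so the result is in 0..0xff and can never equal 0x800.
 */
static bool is_x2apic_msr_questioned(uint32_t index)
{
	return (index >> 24) == 0x800;
}

/*
 * Form quoted from the SDM in the thread: bits 31:8 of the index equal
 * 0x8, which selects the range 0x800..0x8ff.
 */
static bool is_x2apic_msr(uint32_t index)
{
	return (index >> 8) == 0x8;
}
```

With these helpers, an index such as 0x800 is rejected by the first check and accepted by the second.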
Re: [PATCH] KVM: x86: nVMX: support for MSR loading/storing
Hi, Eugene, is it okay to split my part up? I think the patch is atomic. No ideas how this patch could be split without breaking its integrity. You are a co-author of the patch since your ideas make significant part of it. -- Eugene
Re: [PATCH] KVM: x86: nVMX: support for MSR loading/storing
Will send fixed patch this evening. -- Eugene
Re: Windows 7 VM BSOD
On Wed, 2014-12-10 at 15:42 +0800, Thomas Lau wrote: I briefly tested Penryn, Westmere. Bug still could reproduce. It should be four parameters printed on the screen, right below the error code string. Could you please post them? how could I set level, model and enforce on libvirt ?! I could also test it if you could tell me how to add those options on libvirtd. Sorry, have no idea how to deal with libvirt.

On Wed, Dec 10, 2014 at 2:19 PM, Vadim Rozenfeld vroze...@redhat.com wrote:

On Wed, 2014-12-10 at 08:51 +0800, t...@tetrioncapital.com wrote: Hi, Anything you want me to try on my side? There is an open bug in bugzilla which looks pretty similar to your problem https://bugzilla.redhat.com/show_bug.cgi?id=1139928 Please take a look at comment #18 posted by Eduardo https://bugzilla.redhat.com/show_bug.cgi?id=1139928#c18 Best regards, Vadim.

Original Message From: Thomas Lau Sent: Tuesday, 9 December, 2014 4:24 PM To: Vadim Rozenfeld Cc: Zhang Haoyu; kvm; imammedo Subject: Re: Windows 7 VM BSOD Hi Vadim, Now turning on is OK somehow, shutdown still stuck.

On Tue, Dec 9, 2014 at 4:03 PM, Vadim Rozenfeld vroze...@redhat.com wrote:

On Tue, 2014-12-09 at 15:54 +0800, Thomas Lau wrote: I changed CPU type to Westmere, it boot up with 0x05C BSOD It should be four parameters printed on the screen, right below the error code string. Could you please post them? Vadim.
On Tue, Dec 9, 2014 at 3:10 PM, Vadim Rozenfeld vroze...@redhat.com wrote:

On Tue, 2014-12-09 at 11:54 +0800, Thomas Lau wrote: Hi Vadim, I want to quote back to your original post back in early 2014: https://www.mail-archive.com/kvm@vger.kernel.org/msg99782.html According to http://msdn.microsoft.com/en-us/library/windows/hardware/ff559069(v=vs.85).aspx the 0x5C means HAL_INITIALIZATION_FAILED Problem matched exactly, which I am using CPU IvyBridge-EP and I got same BSOD as well. Some CPU flags (feature bits) should be missing. Can you try changing cpu type? Best regards, Vadim. Are we missing some hyperv feature?

On Wed, Dec 3, 2014 at 7:29 PM, Vadim Rozenfeld vroze...@redhat.com wrote: If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time when you hit 101 BSOD. Bugcheck 78 is quite rare one. What is your setup, and how easy it's reproducible? Best regards, Vadim.

On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote: it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ?

On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote: https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment. Thanks, Zhang Haoyu

On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this?

On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? Could you try hv_relaxed, like -cpu kvm64,hv_relaxed.
Thanks, Zhang Haoyu
[PATCH 0/2] KVM: x86: Emulator fixes for VM86
Two minor fixes for emulation of instructions on VM86.

Thanks for reviewing them.

Nadav Amit (2):
  KVM: x86: Do not push eflags.vm on pushf
  KVM: x86: Emulate should check #UD before #GP

 arch/x86/kvm/emulate.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

--
1.9.1
[PATCH 2/2] KVM: x86: Emulate should check #UD before #GP
Intel SDM table 6-2 ("Priority Among Simultaneous Exceptions and Interrupts") shows that faults from decoding the next instruction got higher priority than general protection. Moving the protected-mode check before the CPL check to avoid wrong exception on vm86 mode.

Signed-off-by: Nadav Amit na...@cs.technion.ac.il
---
 arch/x86/kvm/emulate.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 5cd5401..0d42aca 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -4803,6 +4803,12 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
 			goto done;
 		}

+		/* Instruction can only be executed in protected mode */
+		if ((ctxt->d & Prot) && ctxt->mode < X86EMUL_MODE_PROT16) {
+			rc = emulate_ud(ctxt);
+			goto done;
+		}
+
 		/* Privileged instruction can be executed only in CPL=0 */
 		if ((ctxt->d & Priv) && ops->cpl(ctxt)) {
 			if (ctxt->d & PrivUD)
@@ -4812,12 +4818,6 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
 			goto done;
 		}

-		/* Instruction can only be executed in protected mode */
-		if ((ctxt->d & Prot) && ctxt->mode < X86EMUL_MODE_PROT16) {
-			rc = emulate_ud(ctxt);
-			goto done;
-		}
-
 		/* Do instruction specific permission checks */
 		if (ctxt->d & CheckPerm) {
 			rc = ctxt->check_perm(ctxt);
--
1.9.1
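The reordering in this patch can be sketched outside the emulator as follows. This is only an illustration under simplified assumptions: the struct, enum and function names are hypothetical stand-ins for the emulator's flags and context, not kernel code.

```c
#include <stdbool.h>

enum sk_fault { SK_FAULT_NONE, SK_FAULT_UD, SK_FAULT_GP };

/* Hypothetical, reduced view of the decoded instruction and CPU state. */
struct sk_insn_ctx {
	bool needs_prot_mode; /* stands in for the Prot flag */
	bool privileged;      /* stands in for the Priv flag */
	bool in_prot_mode;    /* false in real mode and in VM86 */
	int cpl;              /* current privilege level */
};

/*
 * The invalid-opcode check runs first, so an instruction that is both
 * protected-mode-only and privileged raises #UD rather than #GP when it
 * is executed outside protected mode.
 */
static enum sk_fault sk_check_insn(const struct sk_insn_ctx *c)
{
	if (c->needs_prot_mode && !c->in_prot_mode)
		return SK_FAULT_UD;
	if (c->privileged && c->cpl != 0)
		return SK_FAULT_GP;
	return SK_FAULT_NONE;
}
```

With the two checks in the opposite order, the VM86 case (CPL 3, not in protected mode) would report #GP, which is the wrong exception the commit message describes.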
[PATCH 1/2] KVM: x86: Do not push eflags.vm on pushf
The pushf instruction does not push eflags.VM, so emulation should not do so as well. Although eflags.RF should not be pushed as well, it is already cleared by the time pushf is executed.

Signed-off-by: Nadav Amit na...@cs.technion.ac.il
---
 arch/x86/kvm/emulate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 33ecfcf..5cd5401 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1863,7 +1863,7 @@ static int em_pusha(struct x86_emulate_ctxt *ctxt)

 static int em_pushf(struct x86_emulate_ctxt *ctxt)
 {
-	ctxt->src.val = (unsigned long)ctxt->eflags;
+	ctxt->src.val = (unsigned long)ctxt->eflags & ~EFLG_VM;
 	return em_push(ctxt);
 }

--
1.9.1
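As a standalone illustration of the masking this patch adds, the sketch below clears the VM bit (bit 17 of EFLAGS) from the value that would be pushed. The macro and function names are hypothetical; only the bit position is architectural.

```c
#include <stdint.h>

/* EFLAGS.VM is bit 17; the macro name is local to this sketch. */
#define SK_EFLAGS_VM (1u << 17)

/* Value that an emulated pushf would place on the stack. */
static uint32_t sk_pushf_image(uint32_t eflags)
{
	return eflags & ~SK_EFLAGS_VM;
}
```

RF is deliberately left alone here, mirroring the commit message's remark that it is already clear by the time pushf executes.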
Re: [PATCH] KVM: x86: nVMX: support for MSR loading/storing
Hi all, On Wed, Dec 10, 2014 at 08:07:45AM -0100, Eugene Korenevsky wrote: Hi, Eugene, is it okay to split my part up? I think the patch is atomic. No ideas how this patch could be split without breaking its integrity. You are a co-author of the patch since your ideas make significant part of it. Since Wincy send out his patch before you, I prefer he send out a newer version which fix issues in his own patch, then you can send out another enhanced one based on Wincy's work. Regards, Wanpeng Li -- Eugene
Re: [PATCH] KVM: x86: nVMX: support for MSR loading/storing
On Wed, Dec 10, 2014 at 08:13:58AM -0100, Eugene Korenevsky wrote: Will send fixed patch this evening. Please see my reply to another thread. -- Eugene
Re: [PATCH] KVM: x86: nVMX: support for MSR loading/storing
2014-12-10 17:01 GMT+08:00 Wanpeng Li wanpeng...@linux.intel.com: Hi all, On Wed, Dec 10, 2014 at 08:07:45AM -0100, Eugene Korenevsky wrote: Hi, Eugene, is it okay to split my part up? I think the patch is atomic. No ideas how this patch could be split without breaking its integrity. You are a co-author of the patch since your ideas make significant part of it. Since Wincy send out his patch before you, I prefer he send out a newer version which fix issues in his own patch, then you can send out another enhanced one based on Wincy's work. Ok, I will send out the version two ASAP, thanks. Wincy
Re: [PATCH] KVM: x86: nVMX: support for MSR loading/storing
Eugene Korenevsky ekorenev...@gmail.com writes: Hi, Eugene, is it okay to split my part up? I think the patch is atomic. No ideas how this patch could be split without breaking its integrity. You are a co-author of the patch since your ideas make significant part of it. I was suggesting adding the interfaces you introduced in the first patch and then using these interfaces in the second patch to make reviewing easier. It's ok to mention that the second depends on the first. If Wincy has code contributions to the patch, he should sign it off too, else maybe add a Suggested-by to give him credit for his ideas. Also, please include a v3 in the Subject when you submit your next version.
Re: [PATCH 1/5] arm/arm64: KVM: vgic: move reset initialization into vgic_init_maps()
Hi Christoffer, Reviewed-by: Eric Auger eric.au...@linaro.org see few comments below. On 12/09/2014 04:44 PM, Christoffer Dall wrote: From: Peter Maydell peter.mayd...@linaro.org VGIC initialization currently happens in three phases: (1) kvm_vgic_create() (triggered by userspace GIC creation) (2) vgic_init_maps() (triggered by userspace GIC register read/write requests, or from kvm_vgic_init() if not already run) (3) kvm_vgic_init() (triggered by first VM run) We were doing initialization of some state to correspond with the state of a freshly-reset GIC in kvm_vgic_init(); this is too late, since it will overwrite changes made by userspace using the register access APIs before the VM is run. Move this initialization earlier, into the vgic_init_maps() phase. This fixes a bug where QEMU could successfully restore a saved VM state snapshot into a VM that had already been run, but could not restore it from cold using the -loadvm command line option (the symptoms being that the restored VM would run but interrupts were ignored). Finally rename vgic_init_maps to vgic_init and renamed kvm_vgic_init to kvm_vgic_map_resources. [ This patch is originally written by Peter Maydell, but I have modified it somewhat heavily, renaming various bits and moving code around. If something is broken, I am to be blamed. - Christoffer ] Signed-off-by: Peter Maydell peter.mayd...@linaro.org Signed-off-by: Christoffer Dall christoffer.d...@linaro.org --- This patch was originally named vgic: move reset initialization into vgic_init_maps() but I renamed it slightly to match the other vgic patches in the kernel. I also did the additional changes since the original patch: - Renamed kvm_vgic_init to kvm_vgic_map_resources - Renamed vgic_init_maps to vgic_init - Moved vgic_enable call into existing vcpu loop in vgic_init - Moved ITARGETSRn initializtion above vcpu loop in vgic_init (the idea typo is to init global state first, then vcpu state). 
kvm_vgic_vcpu_init also has disappeared and PPI settings of dist-irq_enabled and dist-irq_cfg now are in former vgic_init_maps. Maybe it would be simpler to review if there were 2 patches: one for init redistribution from kvm_vgic_init to vgic_init_maps and one for the renaming. kvm_vgic_map_resources: difficult to understand it also inits the internal states. Wouldn't kvm_vgic_set_ready be aligned with ready terminology? Best Regards Eric - Added comment in kvm_vgic_map_resources arch/arm/kvm/arm.c | 6 ++-- include/kvm/arm_vgic.h | 4 +-- virt/kvm/arm/vgic.c| 77 +- 3 files changed, 37 insertions(+), 50 deletions(-) diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index 9e193c8..a56cbb5 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -427,11 +427,11 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu) vcpu-arch.has_run_once = true; /* - * Initialize the VGIC before running a vcpu the first time on - * this VM. + * Map the VGIC hardware resources before running a vcpu the first + * time on this VM. 
*/ if (unlikely(!vgic_initialized(vcpu-kvm))) { - ret = kvm_vgic_init(vcpu-kvm); + ret = kvm_vgic_map_resources(vcpu-kvm); if (ret) return ret; } diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h index 206dcc3..fe9783b 100644 --- a/include/kvm/arm_vgic.h +++ b/include/kvm/arm_vgic.h @@ -274,7 +274,7 @@ struct kvm_exit_mmio; #ifdef CONFIG_KVM_ARM_VGIC int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write); int kvm_vgic_hyp_init(void); -int kvm_vgic_init(struct kvm *kvm); +int kvm_vgic_map_resources(struct kvm *kvm); int kvm_vgic_create(struct kvm *kvm); void kvm_vgic_destroy(struct kvm *kvm); void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu); @@ -321,7 +321,7 @@ static inline int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, return -ENXIO; } -static inline int kvm_vgic_init(struct kvm *kvm) +static inline int kvm_vgic_map_resources(struct kvm *kvm) { return 0; } diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c index aacdb59..91e6bfc 100644 --- a/virt/kvm/arm/vgic.c +++ b/virt/kvm/arm/vgic.c @@ -91,6 +91,7 @@ #define ACCESS_WRITE_VALUE (3 1) #define ACCESS_WRITE_MASK(x) ((x) (3 1)) +static int vgic_init(struct kvm *kvm); static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu); static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu); static void vgic_update_state(struct kvm *kvm); @@ -1726,39 +1727,14 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs) int sz = (nr_irqs - VGIC_NR_PRIVATE_IRQS) / 8; vgic_cpu-pending_shared = kzalloc(sz, GFP_KERNEL); - vgic_cpu-vgic_irq_lr_map = kzalloc(nr_irqs, GFP_KERNEL); +
Re: [PATCH 3/5] arm/arm64: KVM: Add (new) vgic_initialized macro
On 12/09/2014 04:44 PM, Christoffer Dall wrote:

Some code paths will need to check to see if the internal state of the vgic has been initialized (such as when creating new VCPUs), so introduce such a macro that checks the nr_cpus field which is set when the vgic has been initialized. Also set nr_cpus = 0 in kvm_vgic_destroy, because the error path in vgic_init() will call this function, and code should never erroneously assume the vgic to be properly initialized after an error.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
 include/kvm/arm_vgic.h | 6 ++++++
 virt/kvm/arm/vgic.c    | 1 +
 2 files changed, 7 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 3e262b9..ac4888d 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -287,6 +287,7 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		      struct kvm_exit_mmio *mmio);

 #define irqchip_in_kernel(k)	(!!((k)->arch.vgic.in_kernel))
+#define vgic_initialized(k)	(!!((k)->arch.vgic.nr_cpus))
 #define vgic_ready(k)		((k)->arch.vgic.ready)

 int vgic_v2_probe(struct device_node *vgic_node,
@@ -369,6 +370,11 @@ static inline int irqchip_in_kernel(struct kvm *kvm)
 	return 0;
 }

+static inline bool vgic_initialized(struct kvm *kvm)
+{
+	return true;
+}
+
 static inline bool vgic_ready(struct kvm *kvm)
 {
 	return true;
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 6293349..c98cc6b 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1774,6 +1774,7 @@ void kvm_vgic_destroy(struct kvm *kvm)
 	dist->irq_spi_cpu = NULL;
 	dist->irq_spi_target = NULL;
 	dist->irq_pending_on_cpu = NULL;
+	dist->nr_cpus = 0;

Reviewed-by: Eric Auger eric.au...@linaro.org

we could use that new vgic_initialized at the entry of vgic_init instead of testing dist->nr_cpus, hence introducing one user.

Eric

 }

 /*
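A minimal sketch of the idea in this patch, using a nonzero CPU count as the "initialized" marker and resetting it on destroy so that a failed init is never mistaken for a working one. All names below are hypothetical and the struct is reduced to the single field that matters here.

```c
#include <stdbool.h>

/* Hypothetical, reduced distributor state. */
struct sk_dist {
	int nr_cpus; /* 0 means "not initialized" */
};

/* Same shape as the macro in the patch: nonzero count means initialized. */
#define sk_dist_initialized(d) (!!((d)->nr_cpus))

/* Tear-down also clears the marker, as the patch does in destroy. */
static void sk_dist_destroy(struct sk_dist *d)
{
	d->nr_cpus = 0;
}

/* Init is idempotent and leaves the marker clear on failure. */
static int sk_dist_init(struct sk_dist *d, int online_vcpus)
{
	if (sk_dist_initialized(d))
		return 0;
	if (online_vcpus <= 0) {
		sk_dist_destroy(d);
		return -1;
	}
	d->nr_cpus = online_vcpus;
	return 0;
}
```

The early return in sk_dist_init uses the macro itself, which is the kind of first user suggested in the review comment.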
Re: [PATCH] x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit
On 06/12/2014 04:03, Andy Lutomirski wrote:

paravirt_enabled has the following effects:

- Disables the F00F bug workaround warning. There is no F00F bug workaround any more because Linux's standard IDT handling already works around the F00F bug, but the warning still exists. This is only cosmetic, and, in any event, there is no such thing as KVM on a CPU with the F00F bug.
- Disables 32-bit APM BIOS detection. On a KVM paravirt system, there should be no APM BIOS anyway.
- Disables tboot. I think that the tboot code should check the CPUID hypervisor bit directly if it matters.
- paravirt_enabled disables espfix32. espfix32 should *not* be disabled under KVM paravirt.

The last point is the purpose of this patch. It fixes a leak of the high 16 bits of the kernel stack address on 32-bit KVM paravirt guests.

While I'm at it, this removes pv_info setup from kvmclock. That code seems to serve no purpose. kvmclock_init runs before kvm_guest_init, and this is a stable@ patch so for the sake of extra safety I've left the pv_info.name assignment in.

Applied (locally for now), will be in 3.19.

Paolo

Cc: sta...@vger.kernel.org
Signed-off-by: Andy Lutomirski l...@amacapital.net
---
 arch/x86/kernel/kvm.c      | 9 ++++++++-
 arch/x86/kernel/kvmclock.c | 2 --
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index f6945bef2cd1..94f643484300 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -283,7 +283,14 @@ NOKPROBE_SYMBOL(do_async_page_fault);
 static void __init paravirt_ops_setup(void)
 {
 	pv_info.name = "KVM";
-	pv_info.paravirt_enabled = 1;
+
+	/*
+	 * KVM isn't paravirt in the sense of paravirt_enabled. A KVM
+	 * guest kernel works like a bare metal kernel with additional
+	 * features, and paravirt_enabled is about features that are
+	 * missing.
+	 */
+	pv_info.paravirt_enabled = 0;

 	if (kvm_para_has_feature(KVM_FEATURE_NOP_IO_DELAY))
 		pv_cpu_ops.io_delay = kvm_io_delay;
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index d9156ceecdff..d4d9a8ad7893 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -263,8 +263,6 @@ void __init kvmclock_init(void)
 #endif
 	kvm_get_preset_lpj();
 	clocksource_register_hz(&kvm_clock, NSEC_PER_SEC);
-	pv_info.paravirt_enabled = 1;
-	pv_info.name = "KVM";

 	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT))
 		pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);
Re: [PATCH] KVM: x86: Remove prefix flag when GP macro is used
On 07/12/2014 10:49, Nadav Amit wrote:

The macro GP already sets the flag Prefix. Remove the redundant flag for 0f_38_f0 and 0f_38_f1 opcodes.

Signed-off-by: Nadav Amit na...@cs.technion.ac.il
---
 arch/x86/kvm/emulate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 3817334..b4f4201 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -4172,8 +4172,8 @@ static const struct opcode opcode_map_0f_38[256] = {
 	/* 0x80 - 0xef */
 	X16(N), X16(N), X16(N), X16(N), X16(N), X16(N), X16(N),
 	/* 0xf0 - 0xf1 */
-	GP(EmulateOnUD | ModRM | Prefix, &three_byte_0f_38_f0),
-	GP(EmulateOnUD | ModRM | Prefix, &three_byte_0f_38_f1),
+	GP(EmulateOnUD | ModRM, &three_byte_0f_38_f0),
+	GP(EmulateOnUD | ModRM, &three_byte_0f_38_f1),
 	/* 0xf2 - 0xff */
 	N, N, X4(N), X8(N)
 };

Applied, thanks.

Paolo
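To illustrate why the removed flag was redundant, here is a small sketch of a table-entry macro that always ORs in a prefix flag, as the commit message says GP does. The flag values, struct and macro names are hypothetical and do not reproduce the definitions in emulate.c.

```c
#include <stdint.h>

/* Hypothetical flag bits for this sketch only. */
#define SK_MODRM         (1u << 0)
#define SK_EMULATE_ON_UD (1u << 1)
#define SK_PREFIX        (1u << 2)

struct sk_opcode {
	uint32_t flags;
};

/* Entry macro that unconditionally adds the prefix flag. */
#define SK_GP(f) { .flags = ((f) | SK_PREFIX) }

/* Passing the flag explicitly or omitting it yields the same entry. */
static const struct sk_opcode sk_with_flag =
	SK_GP(SK_EMULATE_ON_UD | SK_MODRM | SK_PREFIX);
static const struct sk_opcode sk_without_flag =
	SK_GP(SK_EMULATE_ON_UD | SK_MODRM);
```

Since OR-ing a bit twice changes nothing, both entries carry identical flags.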
Re: [PATCH 0/2] KVM: x86: Emulator fixes for VM86
On 10/12/2014 10:19, Nadav Amit wrote:

Two minor fixes for emulation of instructions on VM86.

Thanks for reviewing them.

Nadav Amit (2):
  KVM: x86: Do not push eflags.vm on pushf
  KVM: x86: Emulate should check #UD before #GP

 arch/x86/kvm/emulate.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

Applied, thanks.

Paolo
Re: [PATCH] KVM: nVMX: Disable unrestricted mode if ept=0
On 06/12/2014 16:02, Bandan Das wrote:

If L0 has disabled EPT, don't advertise unrestricted mode at all since it depends on EPT to run real mode code.

Signed-off-by: Bandan Das b...@redhat.com
---
 arch/x86/kvm/vmx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 3e556c6..ed70394 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2377,12 +2377,12 @@ static __init void nested_vmx_setup_ctls_msrs(void)
 	nested_vmx_secondary_ctls_low = 0;
 	nested_vmx_secondary_ctls_high &=
 		SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
-		SECONDARY_EXEC_UNRESTRICTED_GUEST |
 		SECONDARY_EXEC_WBINVD_EXITING;

 	if (enable_ept) {
 		/* nested EPT: emulate EPT also to L1 */
-		nested_vmx_secondary_ctls_high |= SECONDARY_EXEC_ENABLE_EPT;
+		nested_vmx_secondary_ctls_high |= SECONDARY_EXEC_ENABLE_EPT |
+			SECONDARY_EXEC_UNRESTRICTED_GUEST;
 		nested_vmx_ept_caps = VMX_EPT_PAGE_WALK_4_BIT |
 			 VMX_EPTP_WB_BIT | VMX_EPT_2MB_PAGE_BIT |
 			 VMX_EPT_INVEPT_BIT;

Thanks, applied with

Fixes: 92fbc7b195b824e201d9f06f2b93105f72384d65
Cc: sta...@vger.kernel.org

Paolo
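The effect of the change can be sketched as a pure function: the unrestricted-guest control is advertised only together with EPT. The macro and function names are hypothetical, and the bit positions are illustrative stand-ins for the SECONDARY_EXEC_* constants.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative control bits for this sketch. */
#define SK_CTL_VIRT_APIC_ACCESSES (1u << 0)
#define SK_CTL_ENABLE_EPT         (1u << 1)
#define SK_CTL_WBINVD_EXITING     (1u << 6)
#define SK_CTL_UNRESTRICTED_GUEST (1u << 7)

/*
 * Compute the secondary controls exposed to the nested hypervisor.
 * hw_allowed is what the hardware reports; ept_enabled mirrors the
 * host's ept module parameter.
 */
static uint32_t sk_nested_secondary_ctls(uint32_t hw_allowed, bool ept_enabled)
{
	uint32_t ctls = hw_allowed &
		(SK_CTL_VIRT_APIC_ACCESSES | SK_CTL_WBINVD_EXITING);

	if (ept_enabled)
		ctls |= SK_CTL_ENABLE_EPT | SK_CTL_UNRESTRICTED_GUEST;

	return ctls;
}
```

With ept_enabled false, neither EPT nor unrestricted guest appears in the result, regardless of what the hardware allows.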
Re: [PATCH 4/5] arm/arm64: KVM: Don't allow creating VCPUs after vgic_initialized
On 12/09/2014 04:44 PM, Christoffer Dall wrote:

When the vgic initializes its internal state it does so based on the number of VCPUs available at the time. If we allow KVM to create more VCPUs after the VGIC has been initialized, we are likely to error out in unfortunate ways later, perform buffer overflows etc.

Cc: Eric Auger eric.au...@linaro.org
Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
This replaces Eric Auger's previous patch (https://lists.cs.columbia.edu/pipermail/kvmarm/2014-December/012646.html), because it fits better with testing to include it in this series and I realized that we need to add a check against irqchip_in_kernel() as well.

 arch/arm/kvm/arm.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index a9d005f..d4da244 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -213,6 +213,11 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id)
 	int err;
 	struct kvm_vcpu *vcpu;

+	if (irqchip_in_kernel(kvm) && vgic_initialized(kvm)) {

Reviewed-by: Eric Auger eric.au...@linaro.org

a question about that irqchip_in_kernel(kvm): kvm->arch.vgic.in_kernel is set in kvm_vgic_create but nobody resets it, especially in destroy, am i wrong? if the vgic is initialized shouldn't it be also created? Shouldn't we test irqchip_in_kernel in vgic_init instead? Also in case we need irqchip_in_kernel(kvm) here we might need it also in kvm_vgic_inject_irq because dist->lock is grabbed in vgic_update_irq_pending.

Eric

+		err = -EBUSY;
+		goto out;
+	}
+
 	vcpu = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL);
 	if (!vcpu) {
 		err = -ENOMEM;
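The guard added by this patch can be sketched in isolation as below. This is a simplified model: the struct and function names are hypothetical, and only the -EBUSY return value is taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced VM state. */
struct sk_vm {
	bool irqchip_in_kernel; /* an in-kernel interrupt controller exists */
	int vgic_nr_cpus;       /* nonzero once the vgic sized its state */
	int nr_vcpus;           /* VCPUs created so far */
};

/*
 * Refuse to add a VCPU once the interrupt controller has sized its
 * per-CPU state, since that state would be too small for the newcomer.
 */
static int sk_vm_create_vcpu(struct sk_vm *vm)
{
	if (vm->irqchip_in_kernel && vm->vgic_nr_cpus != 0)
		return -EBUSY;

	vm->nr_vcpus++;
	return 0;
}
```

A caller that receives -EBUSY has to create all VCPUs before triggering controller initialization.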
Re: [PATCH 5/5] arm/arm64: KVM: Initialize the vgic on-demand when injecting IRQs
On 12/09/2014 04:44 PM, Christoffer Dall wrote:

Userspace assumes that it can wire up IRQ injections after having created all VCPUs and after having created the VGIC, but potentially before starting the first VCPU. This can currently lead to lost IRQs because the state of that IRQ injection is not stored anywhere and we don't return an error to userspace. We haven't seen this problem manifest itself yet,

Actually we did with VFIO signaling setup before VGIC init!

presumably because guests reset the devices on boot, but this could cause issues with migration and other non-standard startup configurations.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
 virt/kvm/arm/vgic.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index c98cc6b..feef015 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1693,8 +1693,13 @@ out:
 int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
 			bool level)
 {
-	if (likely(vgic_ready(kvm)) &&
-	    vgic_update_irq_pending(kvm, cpuid, irq_num, level))
+	if (unlikely(!vgic_initialized(kvm))) {
+		mutex_lock(&kvm->lock);
+		vgic_init(kvm);
+		mutex_unlock(&kvm->lock);
+	}

I was previously encouraged to test the virtual interrupt controller readiness when setting irqfd up (proposal made in https://lkml.org/lkml/2014/12/3/601). I guess this becomes useless now, correct?

Reviewed-by on the whole series.

Eric

+
+	if (vgic_update_irq_pending(kvm, cpuid, irq_num, level))
 		vgic_kick_vcpus(kvm);

 	return 0;
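The on-demand initialization in this patch follows a lazy-init pattern, sketched below in a single-threaded form. The names are hypothetical, the pending state is reduced to a 32-bit mask, and the locking that the patch performs around the init call is intentionally left out of this sketch.

```c
#include <stdint.h>

/* Hypothetical, reduced interrupt controller state. */
struct sk_vgic {
	int created_vcpus;     /* VCPUs that exist when init runs */
	int nr_cpus;           /* 0 until initialized */
	uint32_t pending_mask; /* one bit per pending IRQ, 0..31 */
};

/* Size the controller from the VCPUs present at this moment. */
static void sk_vgic_init(struct sk_vgic *v)
{
	if (v->nr_cpus == 0)
		v->nr_cpus = v->created_vcpus;
}

/*
 * Inject an IRQ, initializing the controller first if needed so the
 * pending state has somewhere to be recorded instead of being dropped.
 * The kernel patch takes a mutex around the init call; omitted here.
 */
static int sk_vgic_inject_irq(struct sk_vgic *v, unsigned int irq)
{
	if (irq >= 32)
		return -1;

	if (v->nr_cpus == 0)
		sk_vgic_init(v);

	v->pending_mask |= (uint32_t)1 << irq;
	return 0;
}
```

An injection that arrives before the first VCPU run is therefore recorded rather than lost, which is the behaviour the commit message is after.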
Re: [PATCH] x86: Remove Fix Mes in emulate.c from needing fault addresses
On 08/12/2014 04:18, nick wrote: Paolo, Not to be annoying but I am wondering, if my patch has been merged as I have yet to see it in the mainline kernel. It will be sent to Linus during the merge window. Paolo
[QEMU patch 1/2] kvm: sync kernel headers
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com --- linux-headers/asm-x86/kvm.h |5 + linux-headers/linux/kvm.h | 14 +++--- 2 files changed, 8 insertions(+), 11 deletions(-) Index: qemu/linux-headers/asm-x86/kvm.h === --- qemu.orig/linux-headers/asm-x86/kvm.h 2014-12-08 17:54:33.647488264 -0200 +++ qemu/linux-headers/asm-x86/kvm.h2014-12-09 13:27:20.749752962 -0200 @@ -277,6 +277,11 @@ __u8 reserved[31]; }; +struct kvm_tscdeadline_advance { + __u32 timer_advance; + __u32 reserved[3]; +}; + /* When set in flags, include corresponding fields on KVM_SET_VCPU_EVENTS */ #define KVM_VCPUEVENT_VALID_NMI_PENDING0x0001 #define KVM_VCPUEVENT_VALID_SIPI_VECTOR0x0002 Index: qemu/linux-headers/linux/kvm.h === --- qemu.orig/linux-headers/linux/kvm.h 2014-12-08 17:54:33.647488264 -0200 +++ qemu/linux-headers/linux/kvm.h 2014-12-09 13:27:20.750752961 -0200 @@ -761,6 +753,7 @@ #define KVM_CAP_PPC_FIXUP_HCALL 103 #define KVM_CAP_PPC_ENABLE_HCALL 104 #define KVM_CAP_CHECK_EXTENSION_VM 105 +#define KVM_CAP_TSCDEADLINE_ADVANCE 106 #ifdef KVM_CAP_IRQ_ROUTING @@ -1061,6 +1054,8 @@ #define KVM_GET_DEVICE_ATTR _IOW(KVMIO, 0xe2, struct kvm_device_attr) #define KVM_HAS_DEVICE_ATTR _IOW(KVMIO, 0xe3, struct kvm_device_attr) +#define KVM_SET_TSCDEADLINE_ADVANCE _IOW(KVMIO, 0xe4, struct kvm_tscdeadline_advance) + /* * ioctls for vcpu fds */ -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[QEMU patch 2/2] kvm: allow configuration of tsc deadline timer advancement
Add machine option and QMP commands to configure TSC deadline
timer advancement.

Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

---
 monitor.c         |   15 ++
 qapi-schema.json  |   29 +++
 qmp-commands.hx   |   48
 target-i386/kvm.c |   80 ++
 vl.c              |    4 ++
 5 files changed, 176 insertions(+)

Index: qemu.tscdeadline/qapi-schema.json
===================================================================
--- qemu.tscdeadline.orig/qapi-schema.json
+++ qemu.tscdeadline/qapi-schema.json
@@ -3515,3 +3515,32 @@
 # Since: 2.1
 ##
 { 'command': 'rtc-reset-reinjection' }
+
+##
+# @set-lapic-tscdeadline-advance
+#
+# This command sets the TSC deadline timer advancement.
+# This value will be subtracted from the expiration time
+# of the high resolution timer which emulates
+# TSC deadline timer.
+#
+# Useful to achieve low timer latencies.
+#
+# Only supported by KVM acceleration.
+#
+# Since: 2.3
+##
+{ 'command': 'set-lapic-tscdeadline-advance',
+  'data': { 'advance':'int' }
+}
+
+##
+# @get-lapic-tscdeadline-advance
+#
+# This command gets the TSC deadline timer advancement.
+#
+# Only supported by KVM acceleration.
+#
+# Since: 2.3
+##
+{ 'command': 'get-lapic-tscdeadline-advance', 'returns': 'int' }
Index: qemu.tscdeadline/qmp-commands.hx
===================================================================
--- qemu.tscdeadline.orig/qmp-commands.hx
+++ qemu.tscdeadline/qmp-commands.hx
@@ -3854,3 +3854,51 @@ Move mouse pointer to absolute coordinat
 <- { "return": {} }
 
 EQMP
+
+    {
+        .name       = "set-lapic-tscdeadline-advance",
+        .args_type  = "advance:i",
+        .mhandler.cmd_new = qmp_marshal_input_set_lapic_tscdeadline_advance,
+    },
+
+SQMP
+set-lapic-tscdeadline-advance
+-----------------------------
+
+Set LAPIC tscdeadline timer advancement, in nanoseconds.
+
+Arguments:
+
+- "advance": LAPIC tscdeadline timer advancement (json-int)
+
+Example:
+
+-> { "execute": "set-lapic-tscdeadline-advance 1000" }
+<- { "return": {} }
+
+EQMP
+
+    {
+        .name       = "get-lapic-tscdeadline-advance",
+        .args_type  = "",
+        .mhandler.cmd_new = qmp_marshal_input_get_lapic_tscdeadline_advance,
+    },
+
+SQMP
+get-lapic-tscdeadline-advance
+-----------------------------
+
+Get LAPIC tscdeadline timer advancement, in nanoseconds.
+
+Arguments: None.
+
+returns a json-object with the following information:
+- "value" : json-int
+
+Example:
+
+-> { "execute": "get-lapic-tscdeadline-advance" }
+<- { "return": {1000} }
+
+EQMP
+
Index: qemu.tscdeadline/vl.c
===================================================================
--- qemu.tscdeadline.orig/vl.c
+++ qemu.tscdeadline/vl.c
@@ -387,6 +387,10 @@ static QemuOptsList qemu_machine_opts =
             .name = "iommu",
             .type = QEMU_OPT_BOOL,
             .help = "Set on/off to enable/disable Intel IOMMU (VT-d)",
+        },{
+            .name = "lapic-tscdeadline-advance",
+            .type = QEMU_OPT_NUMBER,
+            .help = "Set lapic tscdeadline timer advance",
         },
         { /* End of list */ }
     },
Index: qemu.tscdeadline/target-i386/kvm.c
===================================================================
--- qemu.tscdeadline.orig/target-i386/kvm.c
+++ qemu.tscdeadline/target-i386/kvm.c
@@ -37,6 +37,7 @@
 #include "hw/pci/pci.h"
 #include "migration/migration.h"
 #include "qapi/qmp/qerror.h"
+#include "qmp-commands.h"
 
 //#define DEBUG_KVM
 
@@ -84,6 +85,10 @@ static bool has_msr_mtrr;
 static bool has_msr_architectural_pmu;
 static uint32_t num_architectural_pmu_counters;
 
+static struct lapic_tscdeadline_advance {
+    unsigned int advance_ns;
+} lapic_tscdeadline_advance;
+
 bool kvm_allows_irq0_override(void)
 {
     return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
@@ -835,12 +840,32 @@ static int kvm_get_supported_msrs(KVMSta
     return ret;
 }
 
+static int kvm_set_lapic_tscdeadline(KVMState *s, uint32_t advance)
+{
+    struct kvm_tscdeadline_advance adv;
+    int ret = 0;
+
+    memset(&adv, 0, sizeof(adv));
+
+    adv.timer_advance = advance;
+
+    ret = kvm_vm_ioctl(s, KVM_SET_TSCDEADLINE_ADVANCE, &adv);
+    if (ret < 0) {
+        return ret;
+    }
+
+    lapic_tscdeadline_advance.advance_ns = advance;
+
+    return ret;
+}
+
 int kvm_arch_init(KVMState *s)
 {
     uint64_t identity_base = 0xfffbc000;
     uint64_t shadow_mem;
     int ret;
     struct utsname utsname;
+    uint32_t lapic_advance_ns;
 
     ret = kvm_get_supported_msrs(s);
     if (ret < 0) {
@@ -894,9 +919,40 @@ int kvm_arch_init(KVMState *s)
             return ret;
         }
     }
+
+    lapic_advance_ns = qemu_opt_get_number(qemu_get_machine_opts(),
+                                           "lapic-tscdeadline-advance",
+                                           0);
+    if (lapic_advance_ns) {
+        ret = kvm_set_lapic_tscdeadline(s, lapic_advance_ns);
+        if (ret) {
+            fprintf(stderr, "Set tscdeadline
[QEMU patch 0/2] QEMU lapic tsc deadline advancement
Add command to set TSC deadline timer advancement. This value will be subtracted from the expiration time of the high resolution timer which emulates TSC deadline timer.
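The subtraction described above can be illustrated with a small userspace sketch. It is not the kernel code: `tsc_delta_to_ns` and `advanced_expiry_ns` are hypothetical helper names, and the clamping to "now" is an assumption added here for safety (the posted patch subtracts without clamping). The conversion relies on the TSC frequency being given in kHz, so cycles * 1000000 / kHz yields nanoseconds.

```c
#include <stdint.h>

/*
 * Convert a TSC delta (cycles) to nanoseconds, given the TSC frequency
 * in kHz. Note: the multiplication can overflow for very large deltas;
 * this sketch does not guard against that.
 */
static uint64_t tsc_delta_to_ns(uint64_t tsc_delta, uint64_t tsc_khz)
{
    return tsc_delta * 1000000ULL / tsc_khz;
}

/*
 * Hypothetical helper: absolute expiry time (ns) of the emulating timer
 * once the advancement has been subtracted.
 */
static uint64_t advanced_expiry_ns(uint64_t now_ns, uint64_t guest_tsc,
                                   uint64_t tsc_deadline, uint64_t tsc_khz,
                                   uint64_t advance_ns)
{
    uint64_t ns;

    /* Deadline already in the past: expire immediately. */
    if (tsc_deadline <= guest_tsc)
        return now_ns;

    ns = tsc_delta_to_ns(tsc_deadline - guest_tsc, tsc_khz);

    /* Assumption of this sketch: never schedule before "now". */
    if (advance_ns > ns)
        return now_ns;

    return now_ns + ns - advance_ns;
}
```

With a 2 GHz TSC (tsc_khz = 2000000), a deadline 20000 cycles ahead is 10000 ns away; an advance of 7000 ns would arm the timer 3000 ns from now.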
Re: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM
On 09/12/2014 03:49, Tian, Kevin wrote:
> - Now we have XenGT/KVMGT separately maintained, and KVMGT lags behind
> XenGT regarding to features and qualities. Likely you'll continue see
> stale code (like Xen inst decoder) for some time. In the future we plan to
> maintain a single kernel repo for both, so KVMGT can share same quality
> as XenGT once KVM in-kernel dm framework is stable.
>
> - Regarding to Qemu hacks, KVMGT really doesn't have any different
> requirements as what have been discussed for GPU pass-through, e.g.
> about ISA bridge. Our implementation is based on an old Qemu repo,
> and honestly speaking not cleanly developed, because we know we can
> leverage from GPU pass-through support once it's in Qemu. At that time
> we'll leverage the same logic with minimal changes to hook KVMGT mgmt.
> APIs (e.g. create/destroy a vGPU instance). So we can ignore this area
> for now. :-)

Could the virtual device model introduce new registers in order to avoid
poking at the ISA bridge? I'm not sure that you "can leverage from GPU
pass-through support once it's in Qemu", since the Xen IGD passthrough
support is being added to a separate machine that is specific to Xen IGD
passthrough; no ISA bridge hacking will probably be allowed on the -M pc
and -M q35 machine types.

Paolo
[patch 1/2] KVM: x86: add method to test PIR bitmap vector
kvm_x86_ops->test_posted_interrupt() returns true/false depending
whether 'vector' is set.

Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

Index: kvm/arch/x86/include/asm/kvm_host.h
===================================================================
--- kvm.orig/arch/x86/include/asm/kvm_host.h
+++ kvm/arch/x86/include/asm/kvm_host.h
@@ -743,6 +743,7 @@ struct kvm_x86_ops {
 	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
 	void (*set_apic_access_page_addr)(struct kvm_vcpu *vcpu, hpa_t hpa);
 	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
+	bool (*test_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
 	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
 	int (*get_tdp_level)(void);
Index: kvm/arch/x86/kvm/vmx.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx.c
+++ kvm/arch/x86/kvm/vmx.c
@@ -435,6 +435,11 @@ static int pi_test_and_set_pir(int vecto
 	return test_and_set_bit(vector, (unsigned long *)pi_desc->pir);
 }
 
+static int pi_test_pir(int vector, struct pi_desc *pi_desc)
+{
+	return test_bit(vector, (unsigned long *)pi_desc->pir);
+}
+
 struct vcpu_vmx {
 	struct kvm_vcpu       vcpu;
 	unsigned long         host_rsp;
@@ -5939,6 +5944,7 @@ static __init int hardware_setup(void)
 	else {
 		kvm_x86_ops->hwapic_irr_update = NULL;
 		kvm_x86_ops->deliver_posted_interrupt = NULL;
+		kvm_x86_ops->test_posted_interrupt = NULL;
 		kvm_x86_ops->sync_pir_to_irr = vmx_sync_pir_to_irr_dummy;
 	}
 
@@ -6960,6 +6966,13 @@ static int handle_invvpid(struct kvm_vcp
 	return 1;
 }
 
+static bool vmx_test_pir(struct kvm_vcpu *vcpu, int vector)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	return pi_test_pir(vector, &vmx->pi_desc);
+}
+
 /*
  * The exit handlers return 1 if the exit was handled fully and guest execution
  * may resume.  Otherwise they set the kvm_run parameter to indicate what needs
@@ -9374,6 +9387,7 @@ static struct kvm_x86_ops vmx_x86_ops =
 	.hwapic_isr_update = vmx_hwapic_isr_update,
 	.sync_pir_to_irr = vmx_sync_pir_to_irr,
 	.deliver_posted_interrupt = vmx_deliver_posted_interrupt,
+	.test_posted_interrupt = vmx_test_pir,
 
 	.set_tss_addr = vmx_set_tss_addr,
 	.get_tdp_level = get_ept_level,
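For readers unfamiliar with the bitmap being tested, here is a userspace sketch of the idea behind pi_test_pir(): one bit per interrupt vector, queried without modifying it. The names `pir_bitmap`, `pir_set` and `pir_test` are made up for this illustration, the 256-vector layout is an assumption, and the sketch is deliberately non-atomic, unlike the kernel's test_bit()/test_and_set_bit().

```c
#include <stdbool.h>
#include <stdint.h>

#define PIR_VECTORS 256

/* Stand-in for the posted-interrupt request bitmap: one bit per vector. */
struct pir_bitmap {
    uint32_t pir[PIR_VECTORS / 32];
};

/* Mark a vector as pending (non-atomic; illustration only). */
static void pir_set(struct pir_bitmap *p, int vector)
{
    p->pir[vector / 32] |= UINT32_C(1) << (vector % 32);
}

/* Report whether a vector is pending, leaving the bitmap untouched. */
static bool pir_test(const struct pir_bitmap *p, int vector)
{
    return (p->pir[vector / 32] >> (vector % 32)) & 1u;
}
```

The read-only query is what lets a caller ask "has the timer vector been posted?" without consuming the pending bit.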
[patch 0/2] KVM: add option to advance tscdeadline hrtimer expiration
See patches for details.
Re: [QEMU patch 2/2] kvm: allow configuration of tsc deadline timer advancement
On 10/12/2014 17:23, Marcelo Tosatti wrote:
> Add machine option and QMP commands to configure TSC deadline
> timer advancement.
>
> Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>
>
> ---
>  monitor.c         |   15 ++
>  qapi-schema.json  |   29 +++
>  qmp-commands.hx   |   48
>  target-i386/kvm.c |   80 ++
>  vl.c              |    4 ++
>  5 files changed, 176 insertions(+)
>
> Index: qemu.tscdeadline/qapi-schema.json
> ===================================================================
> --- qemu.tscdeadline.orig/qapi-schema.json
> +++ qemu.tscdeadline/qapi-schema.json
> @@ -3515,3 +3515,32 @@
>  # Since: 2.1
>  ##
>  { 'command': 'rtc-reset-reinjection' }
> +
> +##
> +# @set-lapic-tscdeadline-advance
> +#
> +# This command sets the TSC deadline timer advancement.
> +# This value will be subtracted from the expiration time
> +# of the high resolution timer which emulates
> +# TSC deadline timer.
> +#
> +# Useful to achieve low timer latencies.
> +#
> +# Only supported by KVM acceleration.
> +#
> +# Since: 2.3
> +##
> +{ 'command': 'set-lapic-tscdeadline-advance',
> +  'data': { 'advance':'int' }
> +}
> +
> +##
> +# @get-lapic-tscdeadline-advance
> +#
> +# This command gets the TSC deadline timer advancement.
> +#
> +# Only supported by KVM acceleration.
> +#
> +# Since: 2.3
> +##
> +{ 'command': 'get-lapic-tscdeadline-advance', 'returns': 'int' }

Please add an object property to the x86 CPU object. It can then be
configured with -global on the command line.

> +    ret = kvm_vm_ioctl(s, KVM_SET_TSCDEADLINE_ADVANCE, &adv);
> +    if (ret < 0) {
> +        return ret;
> +    }

Please use KVM_GET/SET_ONE_REG instead of introducing a new set of ioctls.

Paolo

> +    lapic_tscdeadline_advance.advance_ns = advance;
> +
> +    return ret;
> +}
> +
>  int kvm_arch_init(KVMState *s)
>  {
>      uint64_t identity_base = 0xfffbc000;
>      uint64_t shadow_mem;
>      int ret;
>      struct utsname utsname;
> +    uint32_t lapic_advance_ns;
>
>      ret = kvm_get_supported_msrs(s);
>      if (ret < 0) {
> @@ -894,9 +919,40 @@ int kvm_arch_init(KVMState *s)
>              return ret;
>          }
>      }
> +
> +    lapic_advance_ns = qemu_opt_get_number(qemu_get_machine_opts(),
> +                                           "lapic-tscdeadline-advance",
> +                                           0);
> +    if (lapic_advance_ns) {
> +        ret = kvm_set_lapic_tscdeadline(s, lapic_advance_ns);
> +        if (ret) {
> +            fprintf(stderr, "Set tscdeadline advance failed: %s\n",
> +                    strerror(-ret));
> +            return ret;
> +        }
> +    }
> +
> +
>      return 0;
>  }
>
> +int64_t qmp_get_lapic_tscdeadline_advance(Error **errp)
> +{
> +    return lapic_tscdeadline_advance.advance_ns;
> +}
> +
> +void qmp_set_lapic_tscdeadline_advance(int64_t advance, Error **errp)
> +{
> +    KVMState *s = kvm_state;
> +    int ret;
> +
> +    ret = kvm_set_lapic_tscdeadline(s, advance);
> +    if (ret) {
> +        error_setg_errno(errp, ret, "set lapic tscdeadline failed");
> +        return;
> +    }
> +}
> +
>  static void set_v8086_seg(struct kvm_segment *lhs, const SegmentCache *rhs)
>  {
>      lhs->selector = rhs->selector;
> Index: qemu.tscdeadline/monitor.c
> ===================================================================
> --- qemu.tscdeadline.orig/monitor.c
> +++ qemu.tscdeadline/monitor.c
> @@ -5447,3 +5447,18 @@ void qmp_rtc_reset_reinjection(Error **e
>      error_set(errp, QERR_FEATURE_DISABLED, "rtc-reset-reinjection");
>  }
>  #endif
> +
> +#if !defined (TARGET_I386) || !defined (CONFIG_KVM)
> +int64_t qmp_get_lapic_tscdeadline_advance(Error **errp)
> +{
> +    error_set(errp, QERR_FEATURE_DISABLED, "get-lapic-tscdeadline-advance");
> +
> +    return 0;
> +}
> +
> +void qmp_set_lapic_tscdeadline_advance(int64_t advance, Error **errp)
> +{
> +    error_set(errp, QERR_FEATURE_DISABLED, "set-lapic-tscdeadline-advance");
> +}
> +#endif
> +
> Index: qemu.tscdeadline/include/hw/boards.h
> ===================================================================
> --- qemu.tscdeadline.orig/include/hw/boards.h
> +++ qemu.tscdeadline/include/hw/boards.h
> @@ -133,6 +133,7 @@ struct MachineState {
>      bool usb;
>      char *firmware;
>      bool iommu;
> +    int lapi_tscdeadline_advance;
>
>      ram_addr_t ram_size;
>      ram_addr_t maxram_size;
> Index: qemu.tscdeadline/qemu-options.hx
> ===================================================================
> --- qemu.tscdeadline.orig/qemu-options.hx
> +++ qemu.tscdeadline/qemu-options.hx
> @@ -37,7 +37,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_mach
>      "                kvm_shadow_mem=size of KVM shadow MMU\n"
>      "                dump-guest-core=on|off include guest memory in a core dump (default=on)\n"
>      "                mem-merge=on|off controls memory merge support (default: on)\n"
> -    "                iommu=on|off controls emulated Intel IOMMU (VT-d) support (default=off)\n",
> +    "                iommu=on|off controls emulated Intel IOMMU (VT-d) support
[patch 0/2] KVM: add option to advance tscdeadline hrtimer expiration (v2)
See patches for details.
[patch 2/2] KVM: x86: add option to advance tscdeadline hrtimer expiration
For the hrtimer which emulates the tscdeadline timer in the guest,
add an option to advance expiration, and busy spin on VM-entry waiting
for the actual expiration time to elapse.

This allows achieving low latencies in cyclictest (or any scenario
which requires strict timing regarding timer expiration).

Reduces cyclictest avg latency by 50%.

Note: this option requires tuning to find the appropriate value
for a particular hardware/guest combination. One method is to measure the
average delay between apic_timer_fn and VM-entry.
Another method is to start with 1000ns, and increase the value
in say 500ns increments until avg cyclictest numbers stop decreasing.

Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

Index: kvm/arch/x86/kvm/lapic.c
===================================================================
--- kvm.orig/arch/x86/kvm/lapic.c
+++ kvm/arch/x86/kvm/lapic.c
@@ -33,6 +33,7 @@
 #include <asm/page.h>
 #include <asm/current.h>
 #include <asm/apicdef.h>
+#include <asm/delay.h>
 #include <linux/atomic.h>
 #include <linux/jump_label.h>
 #include "kvm_cache_regs.h"
@@ -1073,6 +1074,7 @@ static void apic_timer_expired(struct kv
 {
 	struct kvm_vcpu *vcpu = apic->vcpu;
 	wait_queue_head_t *q = &vcpu->wq;
+	struct kvm_timer *ktimer = &apic->lapic_timer;
 
 	/*
 	 * Note: KVM_REQ_PENDING_TIMER is implicitly checked in
@@ -1087,11 +1089,59 @@ static void apic_timer_expired(struct kv
 
 	if (waitqueue_active(q))
 		wake_up_interruptible(q);
+
+	if (ktimer->timer_mode_mask == APIC_LVT_TIMER_TSCDEADLINE)
+		ktimer->expired_tscdeadline = ktimer->tscdeadline;
+}
+
+static bool lapic_timer_int_injected(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *apic = vcpu->arch.apic;
+	u32 reg = kvm_apic_get_reg(apic, APIC_LVTT);
+
+	if (kvm_apic_hw_enabled(apic)) {
+		int vec = reg & APIC_VECTOR_MASK;
+
+		if (kvm_x86_ops->test_posted_interrupt)
+			return kvm_x86_ops->test_posted_interrupt(vcpu, vec);
+		else {
+			if (apic_test_vector(vec, apic->regs + APIC_ISR))
+				return true;
+		}
+	}
+	return false;
+}
+
+void wait_lapic_expire(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *apic = vcpu->arch.apic;
+	u64 guest_tsc, tsc_deadline;
+
+	if (!kvm_vcpu_has_lapic(vcpu))
+		return;
+
+	if (!apic_lvtt_tscdeadline(apic))
+		return;
+
+	if (!lapic_timer_int_injected(vcpu))
+		return;
+
+	tsc_deadline = apic->lapic_timer.expired_tscdeadline;
+	guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, native_read_tsc());
+
+	while (guest_tsc < tsc_deadline) {
+		int delay = min(tsc_deadline - guest_tsc, 1000ULL);
+
+		ndelay(delay);
+		guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, native_read_tsc());
+	}
 }
 
 static void start_apic_timer(struct kvm_lapic *apic)
 {
 	ktime_t now;
+	struct kvm_arch *kvm_arch = &apic->vcpu->kvm->arch;
+
 	atomic_set(&apic->lapic_timer.pending, 0);
 
 	if (apic_lvtt_period(apic) || apic_lvtt_oneshot(apic)) {
@@ -1137,6 +1187,7 @@ static void start_apic_timer(struct kvm_
 		/* lapic timer in tsc deadline mode */
 		u64 guest_tsc, tscdeadline = apic->lapic_timer.tscdeadline;
 		u64 ns = 0;
+		ktime_t expire;
 		struct kvm_vcpu *vcpu = apic->vcpu;
 		unsigned long this_tsc_khz = vcpu->arch.virtual_tsc_khz;
 		unsigned long flags;
@@ -1149,10 +1200,14 @@ static void start_apic_timer(struct kvm_
 		now = apic->lapic_timer.timer.base->get_time();
 		guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, native_read_tsc());
 		if (likely(tscdeadline > guest_tsc)) {
+			u32 advance = kvm_arch->lapic_tscdeadline_advance_ns;
+
 			ns = (tscdeadline - guest_tsc) * 1000000ULL;
 			do_div(ns, this_tsc_khz);
+			expire = ktime_add_ns(now, ns);
+			expire = ktime_sub_ns(expire, advance);
 			hrtimer_start(&apic->lapic_timer.timer,
-				ktime_add_ns(now, ns), HRTIMER_MODE_ABS);
+				      expire, HRTIMER_MODE_ABS);
 		} else
 			apic_timer_expired(apic);
 
Index: kvm/arch/x86/kvm/lapic.h
===================================================================
--- kvm.orig/arch/x86/kvm/lapic.h
+++ kvm/arch/x86/kvm/lapic.h
@@ -14,6 +14,7 @@ struct kvm_timer {
 	u32 timer_mode;
 	u32 timer_mode_mask;
 	u64 tscdeadline;
+	u64 expired_tscdeadline;
 	atomic_t pending;			/* accumulated triggered timers */
 };
 
@@ -170,4 +171,6 @@ static inline bool kvm_apic_has_events(s
 
 bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
 
+void
Re: [patch 2/2] KVM: x86: add option to advance tscdeadline hrtimer expiration
On 12/10/2014 12:06 PM, Marcelo Tosatti wrote:
> For the hrtimer which emulates the tscdeadline timer in the guest,
> add an option to advance expiration, and busy spin on VM-entry waiting
> for the actual expiration time to elapse.
>
> This allows achieving low latencies in cyclictest (or any scenario
> which requires strict timing regarding timer expiration).
>
> Reduces cyclictest avg latency by 50%.
>
> Note: this option requires tuning to find the appropriate value
> for a particular hardware/guest combination. One method is to measure the
> average delay between apic_timer_fn and VM-entry.
> Another method is to start with 1000ns, and increase the value
> in say 500ns increments until avg cyclictest numbers stop decreasing.

It would be good to document how this is used, in the changelog.
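The second tuning method quoted above (start at 1000ns, step by 500ns until the average stops decreasing) could be sketched as the loop below. This is only an illustration: `tune_advance` and `measure_fn` are hypothetical names, and `fake_measure` is a synthetic stand-in for "run cyclictest with this advance and return the average latency", not real measurement data.

```c
#include <stdint.h>

/* Hypothetical hook: average latency (ns) observed with a given advance. */
typedef uint64_t (*measure_fn)(uint32_t advance_ns);

/*
 * Step the advance upwards from start_ns in step_ns increments and stop
 * as soon as the measured average no longer decreases (or max_ns is hit).
 * Returns the last advance that still improved the average.
 */
static uint32_t tune_advance(measure_fn measure, uint32_t start_ns,
                             uint32_t step_ns, uint32_t max_ns)
{
    uint32_t best = start_ns;
    uint64_t best_lat = measure(start_ns);
    uint32_t adv;

    for (adv = start_ns + step_ns; adv <= max_ns; adv += step_ns) {
        uint64_t lat = measure(adv);

        if (lat >= best_lat)
            break;              /* numbers stopped decreasing */
        best = adv;
        best_lat = lat;
    }
    return best;
}

/*
 * Synthetic model for demonstration only: latency is lowest when the
 * advance matches an assumed 7000 ns host-side delay.
 */
static uint64_t fake_measure(uint32_t advance_ns)
{
    uint64_t d = advance_ns > 7000 ? advance_ns - 7000u : 7000u - advance_ns;

    return 3000 + d;
}
```

Calling tune_advance(fake_measure, 1000, 500, 20000) walks 1000, 1500, ... and settles where the synthetic curve bottoms out; with a real workload the measurement hook would have to average over enough samples to be stable.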
Re: [patch 1/2] KVM: x86: add method to test PIR bitmap vector
On 12/10/2014 12:06 PM, Marcelo Tosatti wrote:
> kvm_x86_ops->test_posted_interrupt() returns true/false depending
> whether 'vector' is set.

Is that good? Bad? How does this patch address the issue?
Re: [patch 1/2] KVM: x86: add method to test PIR bitmap vector
On Wed, Dec 10, 2014 at 12:10:04PM -0500, Rik van Riel wrote:
> On 12/10/2014 12:06 PM, Marcelo Tosatti wrote:
> > kvm_x86_ops->test_posted_interrupt() returns true/false depending
> > whether 'vector' is set.
>
> Is that good? Bad? How does this patch address the issue?

What issue?
Re: [QEMU patch 2/2] kvm: allow configuration of tsc deadline timer advancement
On Wed, Dec 10, 2014 at 06:09:19PM +0100, Paolo Bonzini wrote:
> On 10/12/2014 18:04, Marcelo Tosatti wrote:
> > > Please add an object property to the x86 CPU object. It can then be
> > > configured with -global on the command line.
> >
> > Don't want to allow individual values for different CPUs.
> > It is a per-VM property.
>
> Why? It can cause busy waiting, it would make sense to make it stricter
> for realtime CPUs and leave 0 for non-realtime CPUs.
>
> Paolo

HW timer behaviour should be consistent across CPUs, IMO.
Re: [patch 0/2] KVM: add option to advance tscdeadline hrtimer expiration (v2)
On Wed, Dec 10, 2014 at 06:10:19PM +0100, Paolo Bonzini wrote:
> On 10/12/2014 18:06, Marcelo Tosatti wrote:
> > See patches for details.
>
> Difference between v1 and v2? Please fix your workflow.
>
> Paolo

Wrong sender email address, that is all.
[Bug 87611] Duplicate interrupt abbreviation [patch proposal included]
https://bugzilla.kernel.org/show_bug.cgi?id=87611

Alan <a...@lxorguk.ukuu.org.uk> changed:

           What    |Removed                               |Added
---------------------------------------------------------------------------------------
                 CC|                                      |a...@lxorguk.ukuu.org.uk
          Component|x86-64                                |kvm
           Assignee|platform_x86_64@kernel-bugs.osdl.org  |virtualization_kvm@kernel-bugs.osdl.org
            Product|Platform Specific/Hardware            |Virtualization

You are receiving this mail because:
You are watching the assignee of the bug.
Re: [QEMU patch 2/2] kvm: allow configuration of tsc deadline timer advancement
On Wed, Dec 10, 2014 at 06:01:21PM +0100, Paolo Bonzini wrote:
> On 10/12/2014 17:23, Marcelo Tosatti wrote:
> > Add machine option and QMP commands to configure TSC deadline
> > timer advancement.
> >
> > Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>
> >
> > ---
> >  monitor.c         |   15 ++
> >  qapi-schema.json  |   29 +++
> >  qmp-commands.hx   |   48
> >  target-i386/kvm.c |   80 ++
> >  vl.c              |    4 ++
> >  5 files changed, 176 insertions(+)
> >
> > Index: qemu.tscdeadline/qapi-schema.json
> > ===================================================================
> > --- qemu.tscdeadline.orig/qapi-schema.json
> > +++ qemu.tscdeadline/qapi-schema.json
> > @@ -3515,3 +3515,32 @@
> >  # Since: 2.1
> >  ##
> >  { 'command': 'rtc-reset-reinjection' }
> > +
> > +##
> > +# @set-lapic-tscdeadline-advance
> > +#
> > +# This command sets the TSC deadline timer advancement.
> > +# This value will be subtracted from the expiration time
> > +# of the high resolution timer which emulates
> > +# TSC deadline timer.
> > +#
> > +# Useful to achieve low timer latencies.
> > +#
> > +# Only supported by KVM acceleration.
> > +#
> > +# Since: 2.3
> > +##
> > +{ 'command': 'set-lapic-tscdeadline-advance',
> > +  'data': { 'advance':'int' }
> > +}
> > +
> > +##
> > +# @get-lapic-tscdeadline-advance
> > +#
> > +# This command gets the TSC deadline timer advancement.
> > +#
> > +# Only supported by KVM acceleration.
> > +#
> > +# Since: 2.3
> > +##
> > +{ 'command': 'get-lapic-tscdeadline-advance', 'returns': 'int' }
>
> Please add an object property to the x86 CPU object. It can then be
> configured with -global on the command line.

Don't want to allow individual values for different CPUs.
It is a per-VM property.

Is it still appropriate to use an object property of the CPU object?

> > +    ret = kvm_vm_ioctl(s, KVM_SET_TSCDEADLINE_ADVANCE, &adv);
> > +    if (ret < 0) {
> > +        return ret;
> > +    }
>
> Please use KVM_GET/SET_ONE_REG instead of introducing a new set of ioctls.
>
> Paolo
Re: [patch 2/2] KVM: x86: add option to advance tscdeadline hrtimer expiration
On 10/12/2014 17:53, marcelo.tosa...@amt.cnet wrote:
> For the hrtimer which emulates the tscdeadline timer in the guest,
> add an option to advance expiration, and busy spin on VM-entry waiting
> for the actual expiration time to elapse.
>
> This allows achieving low latencies in cyclictest (or any scenario
> which requires strict timing regarding timer expiration).
>
> Reduces cyclictest avg latency by 50%.
>
> Note: this option requires tuning to find the appropriate value
> for a particular hardware/guest combination. One method is to measure the
> average delay between apic_timer_fn and VM-entry.
> Another method is to start with 1000ns, and increase the value
> in say 500ns increments until avg cyclictest numbers stop decreasing.
>
> Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

What is the latency value that you find in practice, for both
apic_timer_fn to vmentry? Or for apic_timer_fn to just before vmrun?

Let's start with a kvm-unit-tests patch to measure this value. We can
then decide whether to hardcode a small default value (e.g. 1000-3000)
and make it a module parameter? Or perhaps start with a higher value
(twice what you find in practice?) and adjust it towards a target every
time wait_lapic_expire is called.

But in order to judge the correct approach, I need to see the numbers.

Paolo
Re: [QEMU patch 2/2] kvm: allow configuration of tsc deadline timer advancement
On 10/12/2014 18:04, Marcelo Tosatti wrote:
> > Please add an object property to the x86 CPU object. It can then be
> > configured with -global on the command line.
>
> Don't want to allow individual values for different CPUs.
> It is a per-VM property.

Why? It can cause busy waiting, it would make sense to make it stricter
for realtime CPUs and leave 0 for non-realtime CPUs.

Paolo
Re: [QEMU patch 2/2] kvm: allow configuration of tsc deadline timer advancement
On 10/12/2014 18:27, Marcelo Tosatti wrote:
> On Wed, Dec 10, 2014 at 06:09:19PM +0100, Paolo Bonzini wrote:
> > On 10/12/2014 18:04, Marcelo Tosatti wrote:
> > > > Please add an object property to the x86 CPU object. It can then be
> > > > configured with -global on the command line.
> > >
> > > Don't want to allow individual values for different CPUs.
> > > It is a per-VM property.
> >
> > Why? It can cause busy waiting, it would make sense to make it stricter
> > for realtime CPUs and leave 0 for non-realtime CPUs.
> >
> > Paolo
>
> HW timer behaviour should be consistent across CPUs, IMO.

It's not going to be anyway. Cache line bounces, frequency scaling,
presence of higher-priority RT tasks, etc. can cause different response
for one CPU over the others.

Paolo
Re: [patch 2/2] KVM: x86: add option to advance tscdeadline hrtimer expiration
On Wed, Dec 10, 2014 at 06:08:14PM +0100, Paolo Bonzini wrote:
> On 10/12/2014 17:53, marcelo.tosa...@amt.cnet wrote:
> > For the hrtimer which emulates the tscdeadline timer in the guest,
> > add an option to advance expiration, and busy spin on VM-entry waiting
> > for the actual expiration time to elapse.
> >
> > This allows achieving low latencies in cyclictest (or any scenario
> > which requires strict timing regarding timer expiration).
> >
> > Reduces cyclictest avg latency by 50%.
> >
> > Note: this option requires tuning to find the appropriate value
> > for a particular hardware/guest combination. One method is to measure the
> > average delay between apic_timer_fn and VM-entry.
> > Another method is to start with 1000ns, and increase the value
> > in say 500ns increments until avg cyclictest numbers stop decreasing.
> >
> > Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>
>
> What is the latency value that you find in practice, for both
> apic_timer_fn to vmentry? Or for apic_timer_fn to just before vmrun?

7us between apic_timer_fn and kvm_entry tracepoint.

> Let's start with a kvm-unit-tests patch to measure this value.

I can, but kvm-unit-test register state will not be similar to
actual guest state (think host/guest state loading).

What is the advantage of using a kvm-unit-test test rather than
cyclictest in the guest ?

> We can then decide whether to hardcode a small default value (e.g.
> 1000-3000) and make it a module parameter? Or perhaps start with a
> higher value (twice what you find in practice?) and adjust it towards
> a target every time wait_lapic_expire is called.
>
> But in order to judge the correct approach, I need to see the numbers.

Problem with automatic adjustment is: what is the correct target?

You want faster instances of apic_timer_fn->vm-entry to spin a bit,
and allow slow instances of apic_timer_fn->vm-entry to have
an effective advancement.
Re: [Qemu-devel] [PATCH RFC v5 05/19] virtio: support more feature bits
On Tue, 2 Dec 2014 14:00:13 +0100
Cornelia Huck <cornelia.h...@de.ibm.com> wrote:

> diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
> index 070006c..23d713b 100644
> --- a/include/hw/qdev-properties.h
> +++ b/include/hw/qdev-properties.h
> @@ -51,6 +51,17 @@ extern PropertyInfo qdev_prop_arraylen;
>          .defval    = (bool)_defval,                              \
>          }
>
> +#define DEFINE_PROP_BIT64(_name, _state, _field, _bit, _defval) {  \
> +        .name      = (_name),                                    \
> +        .info      = &(qdev_prop_bit),                           \
> +        .bitnr    = (_bit),                                      \
> +        .offset    = offsetof(_state, _field)                    \
> +            + type_check(uint64_t,typeof_field(_state, _field)), \
> +        .qtype     = QTYPE_QBOOL,                                \
> +        .defval    = (bool)_defval,                              \
> +        }
> +
> +
>  #define DEFINE_PROP_BOOL(_name, _state, _field, _defval) {       \
>          .name      = (_name),                                    \
>          .info      = &(qdev_prop_bool),                          \

This one is of course broken. I'll send an updated patch tomorrow.
Re: [Qemu-devel] [PATCH RFC v5 18/19] virtio: support revision-specific features
On Tue, 2 Dec 2014 14:00:26 +0100
Cornelia Huck <cornelia.h...@de.ibm.com> wrote:

> Devices may support different sets of feature bits depending on which
> revision they're operating at. Let's give the transport a way to
> re-query the device about its features when the revision has been
> changed.
>
> Signed-off-by: Cornelia Huck <cornelia.h...@de.ibm.com>
> ---
>  hw/s390x/virtio-ccw.c          |   12 ++--
>  hw/virtio/virtio-bus.c         |   14 --
>  include/hw/virtio/virtio-bus.h |    3 +++
>  include/hw/virtio/virtio.h     |    3 +++
>  4 files changed, 28 insertions(+), 4 deletions(-)

There seems to be something wrong with this patch - I noticed when I
fixed prop_bit64. Needs debugging.
[Bug 87611] Duplicate interrupt abbreviation [patch proposal included]
https://bugzilla.kernel.org/show_bug.cgi?id=87611

--- Comment #1 from Antti Tönkyrä <daeda...@pingtimeout.net> ---
I also posted the patch to LKML but I think no-one picked it up and I didn't
have time to follow up on this after that. ( https://lkml.org/lkml/2014/11/9/57 )
Re: [Qemu-devel] [QEMU patch 2/2] kvm: allow configuration of tsc deadline timer advancement
On 12/10/2014 09:23 AM, Marcelo Tosatti wrote:
> Add machine option and QMP commands to configure TSC deadline
> timer advancement.
>
> Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>
>
> ---

> +##
> +# @get-lapic-tscdeadline-advance
> +#
> +# This command gets the TSC deadline timer advancement.
> +#
> +# Only supported by KVM acceleration.
> +#
> +# Since: 2.3
> +##
> +{ 'command': 'get-lapic-tscdeadline-advance', 'returns': 'int' }

Please don't return a bare int. It is not extensible, if we ever need
multiple named values associated with lapic in the future. Return a
dictionary instead.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: [patch 0/2] KVM: add option to advance tscdeadline hrtimer expiration (v2)
On 10/12/2014 18:06, Marcelo Tosatti wrote:
> See patches for details.

Difference between v1 and v2? Please fix your workflow.

Paolo
[patch 2/2] KVM: x86: add option to advance tscdeadline hrtimer expiration
For the hrtimer which emulates the tscdeadline timer in the guest, add an option to advance expiration, and busy spin on VM-entry waiting for the actual expiration time to elapse. This allows achieving low latencies in cyclictest (or any scenario which requires strict timing regarding timer expiration). Reduces cyclictest avg latency by 50%. Note: this option requires tuning to find the appropriate value for a particular hardware/guest combination. One method is to measure the average delay between apic_timer_fn and VM-entry. Another method is to start with 1000ns, and increase the value in say 500ns increments until avg cyclictest numbers stop decreasing. Signed-off-by: Marcelo Tosatti mtosa...@redhat.com Index: kvm/arch/x86/kvm/lapic.c === --- kvm.orig/arch/x86/kvm/lapic.c +++ kvm/arch/x86/kvm/lapic.c @@ -33,6 +33,7 @@ #include asm/page.h #include asm/current.h #include asm/apicdef.h +#include asm/delay.h #include linux/atomic.h #include linux/jump_label.h #include kvm_cache_regs.h @@ -1073,6 +1074,7 @@ static void apic_timer_expired(struct kv { struct kvm_vcpu *vcpu = apic-vcpu; wait_queue_head_t *q = vcpu-wq; + struct kvm_timer *ktimer = apic-lapic_timer; /* * Note: KVM_REQ_PENDING_TIMER is implicitly checked in @@ -1087,11 +1089,59 @@ static void apic_timer_expired(struct kv if (waitqueue_active(q)) wake_up_interruptible(q); + + if (ktimer-timer_mode_mask == APIC_LVT_TIMER_TSCDEADLINE) + ktimer-expired_tscdeadline = ktimer-tscdeadline; +} + +static bool lapic_timer_int_injected(struct kvm_vcpu *vcpu) +{ + struct kvm_lapic *apic = vcpu-arch.apic; + u32 reg = kvm_apic_get_reg(apic, APIC_LVTT); + + if (kvm_apic_hw_enabled(apic)) { + int vec = reg APIC_VECTOR_MASK; + + if (kvm_x86_ops-test_posted_interrupt) + return kvm_x86_ops-test_posted_interrupt(vcpu, vec); + else { + if (apic_test_vector(vec, apic-regs + APIC_ISR)) + return true; + } + } + return false; +} + +void wait_lapic_expire(struct kvm_vcpu *vcpu) +{ + struct kvm_lapic *apic = vcpu-arch.apic; + u64 
guest_tsc, tsc_deadline; + + if (!kvm_vcpu_has_lapic(vcpu)) + return; + + if (!apic_lvtt_tscdeadline(apic)) + return; + + if (!lapic_timer_int_injected(vcpu)) + return; + + tsc_deadline = apic-lapic_timer.expired_tscdeadline; + guest_tsc = kvm_x86_ops-read_l1_tsc(vcpu, native_read_tsc()); + + while (guest_tsc tsc_deadline) { + int delay = min(tsc_deadline - guest_tsc, 1000ULL); + + ndelay(delay); + guest_tsc = kvm_x86_ops-read_l1_tsc(vcpu, native_read_tsc()); + } } static void start_apic_timer(struct kvm_lapic *apic) { ktime_t now; + struct kvm_arch *kvm_arch = apic-vcpu-kvm-arch; + atomic_set(apic-lapic_timer.pending, 0); if (apic_lvtt_period(apic) || apic_lvtt_oneshot(apic)) { @@ -1137,6 +1187,7 @@ static void start_apic_timer(struct kvm_ /* lapic timer in tsc deadline mode */ u64 guest_tsc, tscdeadline = apic-lapic_timer.tscdeadline; u64 ns = 0; + ktime_t expire; struct kvm_vcpu *vcpu = apic-vcpu; unsigned long this_tsc_khz = vcpu-arch.virtual_tsc_khz; unsigned long flags; @@ -1149,10 +1200,14 @@ static void start_apic_timer(struct kvm_ now = apic-lapic_timer.timer.base-get_time(); guest_tsc = kvm_x86_ops-read_l1_tsc(vcpu, native_read_tsc()); if (likely(tscdeadline guest_tsc)) { + u32 advance = kvm_arch-lapic_tscdeadline_advance_ns; + ns = (tscdeadline - guest_tsc) * 100ULL; do_div(ns, this_tsc_khz); + expire = ktime_add_ns(now, ns); + expire = ktime_sub_ns(expire, advance); hrtimer_start(apic-lapic_timer.timer, - ktime_add_ns(now, ns), HRTIMER_MODE_ABS); + expire, HRTIMER_MODE_ABS); } else apic_timer_expired(apic); Index: kvm/arch/x86/kvm/lapic.h === --- kvm.orig/arch/x86/kvm/lapic.h +++ kvm/arch/x86/kvm/lapic.h @@ -14,6 +14,7 @@ struct kvm_timer { u32 timer_mode; u32 timer_mode_mask; u64 tscdeadline; + u64 expired_tscdeadline; atomic_t pending; /* accumulated triggered timers */ }; @@ -170,4 +171,6 @@ static inline bool kvm_apic_has_events(s bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector); +void
[patch 1/2] KVM: x86: add method to test PIR bitmap vector
kvm_x86_ops->test_posted_interrupt() returns true/false depending on
whether 'vector' is set.

Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

Index: kvm/arch/x86/include/asm/kvm_host.h
===================================================================
--- kvm.orig/arch/x86/include/asm/kvm_host.h
+++ kvm/arch/x86/include/asm/kvm_host.h
@@ -743,6 +743,7 @@ struct kvm_x86_ops {
 	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
 	void (*set_apic_access_page_addr)(struct kvm_vcpu *vcpu, hpa_t hpa);
 	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
+	bool (*test_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
 	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
 	int (*get_tdp_level)(void);
Index: kvm/arch/x86/kvm/vmx.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx.c
+++ kvm/arch/x86/kvm/vmx.c
@@ -435,6 +435,11 @@ static int pi_test_and_set_pir(int vecto
 	return test_and_set_bit(vector, (unsigned long *)pi_desc->pir);
 }
 
+static int pi_test_pir(int vector, struct pi_desc *pi_desc)
+{
+	return test_bit(vector, (unsigned long *)pi_desc->pir);
+}
+
 struct vcpu_vmx {
 	struct kvm_vcpu vcpu;
 	unsigned long host_rsp;
@@ -5939,6 +5944,7 @@ static __init int hardware_setup(void)
 	else {
 		kvm_x86_ops->hwapic_irr_update = NULL;
 		kvm_x86_ops->deliver_posted_interrupt = NULL;
+		kvm_x86_ops->test_posted_interrupt = NULL;
 		kvm_x86_ops->sync_pir_to_irr = vmx_sync_pir_to_irr_dummy;
 	}
 
@@ -6960,6 +6966,13 @@ static int handle_invvpid(struct kvm_vcp
 	return 1;
 }
 
+static bool vmx_test_pir(struct kvm_vcpu *vcpu, int vector)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	return pi_test_pir(vector, &vmx->pi_desc);
+}
+
 /*
  * The exit handlers return 1 if the exit was handled fully and guest execution
  * may resume. Otherwise they set the kvm_run parameter to indicate what needs
@@ -9374,6 +9387,7 @@ static struct kvm_x86_ops vmx_x86_ops =
 	.hwapic_isr_update = vmx_hwapic_isr_update,
 	.sync_pir_to_irr = vmx_sync_pir_to_irr,
 	.deliver_posted_interrupt = vmx_deliver_posted_interrupt,
+	.test_posted_interrupt = vmx_test_pir,
 
 	.set_tss_addr = vmx_set_tss_addr,
 	.get_tdp_level = get_ept_level,
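For readers who have not looked at the posted-interrupt descriptor, the check that pi_test_pir() performs can be sketched in plain userspace C. This is only an illustration: it assumes the PIR is a 256-bit bitmap held in eight 32-bit words, and the names pir_sketch, pir_set and pir_test are made up here; they are not the kernel's struct pi_desc or its atomic bitops.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the PIR field of the posted-interrupt
 * descriptor: one bit per interrupt vector, 256 vectors in total.
 */
struct pir_sketch {
	uint32_t pir[8];
};

/* Mark 'vector' as posted (not atomic, unlike the kernel's bitops). */
static void pir_set(struct pir_sketch *d, unsigned int vector)
{
	d->pir[vector / 32] |= UINT32_C(1) << (vector % 32);
}

/* Return true if 'vector' is currently posted, without clearing it. */
static bool pir_test(const struct pir_sketch *d, unsigned int vector)
{
	return (d->pir[vector / 32] >> (vector % 32)) & 1u;
}
```

The point of a test-only accessor is that the caller can ask whether a vector is pending in the PIR without consuming it, which is what the new callback exposes.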
Re: [patch 1/2] KVM: x86: add method to test PIR bitmap vector
On 12/10/2014 12:27 PM, Marcelo Tosatti wrote:
> On Wed, Dec 10, 2014 at 12:10:04PM -0500, Rik van Riel wrote:
>> On 12/10/2014 12:06 PM, Marcelo Tosatti wrote:
>>> kvm_x86_ops->test_posted_interrupt() returns true/false depending
>>> on whether 'vector' is set.
>>
>> Is that good? Bad? How does this patch address the issue?
>
> What issue?

Why is this change being made?
Re: [patch 2/2] KVM: x86: add option to advance tscdeadline hrtimer expiration
On 10/12/2014 18:34, Marcelo Tosatti wrote:
>> Let's start with a kvm-unit-tests patch to measure this value.
>
> I can, but kvm-unit-test register state will not be similar to
> actual guest state (think host/guest state loading).

7us is about 20000 clock cycles. A lightweight vmexit is an order of
magnitude less expensive, and half of the vmexit overhead is the VMRUN
instruction itself. All in all, the host/guest state loading should not
matter (or should matter little).

> What is the advantage of using a kvm-unit-test test rather than
> cyclictest in the guest?

That it starts in 3 seconds, and that you can vary the timer frequency
in order to measure jitter in addition to latency.

>> We can then decide whether to hardcode a small default value (e.g.
>> 1000-3000) and make it a module parameter? Or perhaps start with a
>> higher value (twice what you find in practice?) and adjust it towards
>> a target every time wait_lapic_expire is called. But in order to judge
>> the correct approach, I need to see the numbers.
>
> Problem with automatic adjustment is: what is the correct target?

We cannot say without seeing the numbers, particularly the jitter. This
is why I want to see numbers for varying frequencies (from 100us to 10ms
per tick, say).

> You want faster instances of apic_timer_fn->vm-entry to spin a bit,
> and allow slow instances of apic_timer_fn->vm-entry to have an
> effective advancement.

If the jitter is small enough, you can make the timer a little earlier
(increase the advance) by a small amount on every delivered interrupt.
This prepares for a slow instance.

And you can make the timer less early (decrease the advance) by some
percentage of what you had to wait on every wait_lapic_expire, if you
had to wait more than a given threshold. This avoids waiting too much
on consecutive fast instances.

Paolo
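The two adjustment rules described at the end of this message can be written down as a pair of small functions. This is a sketch only, not code from any posted patch: the step, threshold, percentage and upper bound are placeholder values chosen for illustration, and the function names are hypothetical.

```c
#include <stdint.h>

/* All constants below are illustrative assumptions, not measured values. */
#define ADVANCE_STEP_NS		50u	/* added on every delivered interrupt */
#define WAIT_THRESHOLD_NS	500u	/* spin time tolerated without adjusting */
#define WAIT_SHRINK_PERCENT	25u	/* share of the wait removed from the advance */
#define ADVANCE_MAX_NS		10000u	/* upper bound, to limit busy waiting */

/* Rule 1: after each delivered timer interrupt, arm the next one a bit earlier. */
static uint32_t advance_on_delivery(uint32_t advance_ns)
{
	uint32_t next = advance_ns + ADVANCE_STEP_NS;

	return next > ADVANCE_MAX_NS ? ADVANCE_MAX_NS : next;
}

/*
 * Rule 2: after spinning for 'waited_ns' before vm-entry, give back a
 * percentage of that wait, but only if the wait exceeded the threshold.
 */
static uint32_t advance_on_wait(uint32_t advance_ns, uint32_t waited_ns)
{
	uint32_t shrink;

	if (waited_ns <= WAIT_THRESHOLD_NS)
		return advance_ns;

	shrink = (uint32_t)((uint64_t)waited_ns * WAIT_SHRINK_PERCENT / 100);
	return shrink > advance_ns ? 0 : advance_ns - shrink;
}
```

With these placeholder numbers the advance creeps up by 50ns per interrupt and drops by a quarter of any spin longer than 500ns, so a run of fast instances pulls it back down quickly while an occasional slow instance is still covered.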
Re: [QEMU patch 2/2] kvm: allow configuration of tsc deadline timer advancement
On 10/12/2014 18:35, Marcelo Tosatti wrote:
> On Wed, Dec 10, 2014 at 06:29:43PM +0100, Paolo Bonzini wrote:
>> On 10/12/2014 18:27, Marcelo Tosatti wrote:
>>> On Wed, Dec 10, 2014 at 06:09:19PM +0100, Paolo Bonzini wrote:
>>>> On 10/12/2014 18:04, Marcelo Tosatti wrote:
>>>>>> Please add an object property to the x86 CPU object. It can
>>>>>> then be configured with -global on the command line.
>>>>>
>>>>> Don't want to allow individual values for different CPUs.
>>>>> It is a per-VM property.
>>>>
>>>> Why? It can cause busy waiting, it would make sense to make it
>>>> stricter for realtime CPUs and leave 0 for non-realtime CPUs.
>>>
>>> HW timer behaviour should be consistent across CPUs, IMO.
>>
>> It's not going to be anyway. Cache line bounces, frequency scaling,
>> presence of higher-priority RT tasks, etc. can cause different
>> response for one CPU over the others.
>
> OK i'll change it to per-CPU.

Well, my preferred choice would be automatic adjustment with a module
parameter. If we need manual tuning, per-CPU would be my choice, but
automatic is nicer anyway. :)

Paolo
Re: [patch 1/2] KVM: x86: add method to test PIR bitmap vector
On Wed, Dec 10, 2014 at 12:50:29PM -0500, Rik van Riel wrote:
> On 12/10/2014 12:27 PM, Marcelo Tosatti wrote:
>> On Wed, Dec 10, 2014 at 12:10:04PM -0500, Rik van Riel wrote:
>>> On 12/10/2014 12:06 PM, Marcelo Tosatti wrote:
>>>> kvm_x86_ops->test_posted_interrupt() returns true/false depending
>>>> on whether 'vector' is set.
>>>
>>> Is that good? Bad? How does this patch address the issue?
>>
>> What issue?
>
> Why is this change being made?

Next patch in the series.
Re: [QEMU patch 2/2] kvm: allow configuration of tsc deadline timer advancement
On Wed, Dec 10, 2014 at 06:29:43PM +0100, Paolo Bonzini wrote:
> On 10/12/2014 18:27, Marcelo Tosatti wrote:
>> On Wed, Dec 10, 2014 at 06:09:19PM +0100, Paolo Bonzini wrote:
>>> On 10/12/2014 18:04, Marcelo Tosatti wrote:
>>>>> Please add an object property to the x86 CPU object. It can
>>>>> then be configured with -global on the command line.
>>>>
>>>> Don't want to allow individual values for different CPUs.
>>>> It is a per-VM property.
>>>
>>> Why? It can cause busy waiting, it would make sense to make it
>>> stricter for realtime CPUs and leave 0 for non-realtime CPUs.
>>>
>>> Paolo
>>
>> HW timer behaviour should be consistent across CPUs, IMO.
>
> It's not going to be anyway. Cache line bounces, frequency scaling,
> presence of higher-priority RT tasks, etc. can cause different
> response for one CPU over the others.
>
> Paolo

OK, I'll change it to per-CPU.
Re: [QEMU patch 2/2] kvm: allow configuration of tsc deadline timer advancement
2014-12-10 18:55+0100, Paolo Bonzini:
> Well, my preferred choice would be automatic adjustment with a module
> parameter. If we need manual tuning, per-CPU would be my choice, but
> automatic is nicer anyway. :)

I agree with Paolo, and think it would be better not to touch QEMU ...
it makes little sense to migrate this value and it is probably going to
be quite similar on every CPU, so a writeable module parameter is a
better starting point. (We can always turn it into a nightmare later.)

If you measure the difference between the TSC you wanted and the TSC you
got on VM entry, you can use it to automatically guess a delta for the
next timer. (That is IMO exactly what you would do with manual tuning.
The algorithm should probably prefer being a bit late to being early,
too.)
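One way to read the suggestion above is sketched below. It is an assumption-laden illustration, not a proposed implementation: the function names are hypothetical, the tick-to-nanosecond conversion mirrors the ticks * 1000000 / tsc_khz form used in start_apic_timer(), and the "prefer late" bias is modelled by correcting only half of a late entry but all of an early one.

```c
#include <stdint.h>

/*
 * Convert a TSC tick count to nanoseconds for a TSC running at tsc_khz.
 * The multiplication can overflow for very large tick counts; the
 * deltas considered here are assumed to be small.
 */
static uint64_t tsc_to_ns(uint64_t ticks, uint32_t tsc_khz)
{
	return ticks * 1000000ULL / tsc_khz;
}

/*
 * Guess the advance for the next timer from how the last one landed.
 * wanted_tsc is the programmed deadline, got_tsc the TSC seen on VM entry.
 */
static uint32_t next_advance_ns(uint32_t advance_ns, uint64_t wanted_tsc,
				uint64_t got_tsc, uint32_t tsc_khz)
{
	if (got_tsc > wanted_tsc) {
		/* Late: close only half of the gap, so we tend to stay late. */
		uint64_t late_ns = tsc_to_ns(got_tsc - wanted_tsc, tsc_khz);

		return advance_ns + (uint32_t)(late_ns / 2);
	} else {
		/* Early (or exact): back off by the full amount. */
		uint64_t early_ns = tsc_to_ns(wanted_tsc - got_tsc, tsc_khz);

		return early_ns > advance_ns ? 0 : advance_ns - (uint32_t)early_ns;
	}
}
```

For example, with a 2 GHz TSC (tsc_khz = 2000000), an entry 2000 ticks after the deadline is 1000ns late and raises the advance by 500ns, while an entry 400 ticks before the deadline lowers it by the full 200ns.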
Re: [QEMU patch 2/2] kvm: allow configuration of tsc deadline timer advancement
On 10/12/2014 19:39, Radim Krčmář wrote:
> 2014-12-10 18:55+0100, Paolo Bonzini:
>> Well, my preferred choice would be automatic adjustment with a module
>> parameter. If we need manual tuning, per-CPU would be my choice, but
>> automatic is nicer anyway. :)
>
> I agree with Paolo, and think it would be better not to touch QEMU ...
> it makes little sense to migrate this value and it is probably going
> to be quite similar on every CPU, so a writeable module parameter is a
> better starting point. (We can always turn it into a nightmare later.)

Ok, let's start with a simple module parameter, similar to what PLE used
to have. We can use that to play with kvm-unit-tests.

Paolo
[PATCH kvm-unit-tests 00/15] arm64: initial drop
This series adds support for aarch64 to the kvm-unit-tests framework,
bringing it to the same level as the arm support. In the process a few
tweaks to the arm support were made, as one of the main goals was to
share as much code as possible between the two.

Patches
  01   : A fix for the script runner. We need this one for arm
         regardless of the aarch64 support.
  02-03: Fixes to the arm support. The bugs fixed weren't visible
         until running on aarch64.
  04-07: Prep the arm framework for the bare minimal initial drop
  08   : The bare minimal initial drop
  09   : Add vector support to the minimal drop
  10-12: Prep the arm framework for enabling the mmu on aarch64
  13-14: Prep the aarch64 framework for enabling the mmu
  15   : Enables the mmu on aarch64

These patches are also available here
https://github.com/rhdrjones/kvm-unit-tests/tree/arm64/initial-drop

Thanks,
drew

Andrew Jones (15):
  arm: fix run script testdev probing
  virtio: don't use size_t
  arm: setup: fix type mismatch
  Makefile: cscope may need to look in lib/$ARCH
  arm: use absolute headers
  arm: setup: drop unused arguments
  arm: selftest: rename svc mode to kernel mode
  arm64: initial drop
  arm64: vectors support
  arm: get PHYS_MASK from pgtable-hwdef.h
  arm: import more linux page table api
  arm: prepare mmu code for arm64
  arm64: import some Linux page table API
  arm64: prepare for 64k pages
  arm64: enable mmu

 Makefile                      |   4 +-
 arm/cstart.S                  |  18 ++-
 arm/cstart64.S                | 252 ++
 arm/flat.lds                  |  11 +-
 arm/run                       |  12 +-
 arm/selftest.c                | 141 +-
 arm/unittests.cfg             |  12 +-
 config/config-arm-common.mak  |  69
 config/config-arm.mak         |  74 ++---
 config/config-arm64.mak       |  21
 configure                     |  12 +-
 lib/arm/asm-offsets.c         |  11 +-
 lib/arm/asm/asm-offsets.h     |   2 +-
 lib/arm/asm/io.h              |   8 +-
 lib/arm/asm/mmu-api.h         |  14 +++
 lib/arm/asm/mmu.h             |  27 ++---
 lib/arm/asm/page.h            |   7 +-
 lib/arm/asm/pgtable-hwdef.h   |  44 +++-
 lib/arm/asm/pgtable.h         |  91 +++
 lib/arm/asm/processor.h       |   2 +-
 lib/arm/asm/ptrace.h          |   2 +-
 lib/arm/asm/setup.h           |  11 +-
 lib/arm/eabi_compat.c         |   2 +-
 lib/arm/io.c                  |  10 +-
 lib/arm/mmu.c                 |  82 ++
 lib/arm/processor.c           |   6 +-
 lib/arm/setup.c               |  19 ++--
 lib/arm/spinlock.c            |   8 +-
 lib/arm64/.gitignore          |   1 +
 lib/arm64/asm-offsets.c       |  30 +
 lib/arm64/asm/asm-offsets.h   |   1 +
 lib/arm64/asm/barrier.h       |  17 +++
 lib/arm64/asm/esr.h           |  43 +++
 lib/arm64/asm/io.h            |  84 ++
 lib/arm64/asm/mmu-api.h       |   1 +
 lib/arm64/asm/mmu.h           |  24
 lib/arm64/asm/page.h          |  65 +++
 lib/arm64/asm/pgtable-hwdef.h | 136 +++
 lib/arm64/asm/pgtable.h       |  69
 lib/arm64/asm/processor.h     |  66 +++
 lib/arm64/asm/ptrace.h        |  95
 lib/arm64/asm/setup.h         |   1 +
 lib/arm64/asm/spinlock.h      |  15 +++
 lib/arm64/processor.c         | 192
 lib/chr-testdev.c             |   4 +-
 lib/kbuild.h                  |   8 ++
 lib/virtio.c                  |   2 +-
 lib/virtio.h                  |   3 +-
 48 files changed, 1638 insertions(+), 191 deletions(-)
 create mode 100644 arm/cstart64.S
 create mode 100644 config/config-arm-common.mak
 create mode 100644 config/config-arm64.mak
 create mode 100644 lib/arm/asm/mmu-api.h
 create mode 100644 lib/arm/asm/pgtable.h
 create mode 100644 lib/arm64/.gitignore
 create mode 100644 lib/arm64/asm-offsets.c
 create mode 100644 lib/arm64/asm/asm-offsets.h
 create mode 100644 lib/arm64/asm/barrier.h
 create mode 100644 lib/arm64/asm/esr.h
 create mode 100644 lib/arm64/asm/io.h
 create mode 100644 lib/arm64/asm/mmu-api.h
 create mode 100644 lib/arm64/asm/mmu.h
 create mode 100644 lib/arm64/asm/page.h
 create mode 100644 lib/arm64/asm/pgtable-hwdef.h
 create mode 100644 lib/arm64/asm/pgtable.h
 create mode 100644 lib/arm64/asm/processor.h
 create mode 100644 lib/arm64/asm/ptrace.h
 create mode 100644 lib/arm64/asm/setup.h
 create mode 100644 lib/arm64/asm/spinlock.h
 create mode 100644 lib/arm64/processor.c
 create mode 100644 lib/kbuild.h

-- 
1.9.3
[PATCH 15/15] arm64: enable mmu
Implement asm_mmu_enable and flush_tlb_all, and then make a final
change to mmu.c in order to link it into arm64. The final change is to
map the code read-only. This is necessary because armv8 forces all
writable code shared between EL1 and EL0 to be PXN.

Signed-off-by: Andrew Jones <drjo...@redhat.com>
---
 arm/cstart64.S               | 64
 arm/flat.lds                 |  1 +
 config/config-arm-common.mak |  1 +
 config/config-arm.mak        |  1 -
 lib/arm/asm/mmu.h            |  1 +
 lib/arm/mmu.c                | 10 ++-
 lib/arm64/asm/mmu-api.h      |  1 +
 lib/arm64/asm/mmu.h          | 16 +++
 lib/arm64/asm/processor.h    | 14 ++
 lib/arm64/processor.c        | 26 +-
 10 files changed, 127 insertions(+), 8 deletions(-)
 create mode 100644 lib/arm64/asm/mmu-api.h

diff --git a/arm/cstart64.S b/arm/cstart64.S
index d1860a94fb2d3..5151f4c77d745 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -8,6 +8,9 @@
 #define __ASSEMBLY__
 #include <asm/asm-offsets.h>
 #include <asm/ptrace.h>
+#include <asm/processor.h>
+#include <asm/page.h>
+#include <asm/pgtable-hwdef.h>
 
 .section .init
 
@@ -55,6 +58,67 @@ halt:
 	b	1b
 
 /*
+ * asm_mmu_enable
+ *   Inputs:
+ *     x0 is the base address of the translation table
+ *   Outputs: none
+ *
+ * Adapted from
+ *   arch/arm64/kernel/head.S
+ *   arch/arm64/mm/proc.S
+ */
+
+/*
+ * Memory region attributes for LPAE:
+ *
+ *   n = AttrIndx[2:0]
+ *                      n       MAIR
+ *   DEVICE_nGnRnE      000     00000000
+ *   DEVICE_nGnRE       001     00000100
+ *   DEVICE_GRE         010     00001100
+ *   NORMAL_NC          011     01000100
+ *   NORMAL             100     11111111
+ */
+#define MAIR(attr, mt) ((attr) << ((mt) * 8))
+
+.globl asm_mmu_enable
+asm_mmu_enable:
+	ic	iallu			// I+BTB cache invalidate
+	tlbi	vmalle1is		// invalidate I + D TLBs
+	dsb	ish
+
+	/* TCR */
+	ldr	x1, =TCR_TxSZ(VA_BITS) |		\
+		     TCR_TG0_64K | TCR_TG1_64K |	\
+		     TCR_IRGN_WBWA | TCR_ORGN_WBWA |	\
+		     TCR_SHARED
+	mov	x2, #3			// 011 is 42 bits
+	bfi	x1, x2, #32, #3
+	msr	tcr_el1, x1
+
+	/* MAIR */
+	ldr	x1, =MAIR(0x00, MT_DEVICE_nGnRnE) |	\
+		     MAIR(0x04, MT_DEVICE_nGnRE) |	\
+		     MAIR(0x0c, MT_DEVICE_GRE) |	\
+		     MAIR(0x44, MT_NORMAL_NC) |		\
+		     MAIR(0xff, MT_NORMAL)
+	msr	mair_el1, x1
+
+	/* TTBR0 */
+	msr	ttbr0_el1, x0
+	isb
+
+	/* SCTLR */
+	mrs	x1, sctlr_el1
+	orr	x1, x1, SCTLR_EL1_C
+	orr	x1, x1, SCTLR_EL1_I
+	orr	x1, x1, SCTLR_EL1_M
+	msr	sctlr_el1, x1
+	isb
+
+	ret
+
+/*
  * Vectors
  * Adapted from arch/arm64/kernel/entry.S
  */
diff --git a/arm/flat.lds b/arm/flat.lds
index 89a55720d728f..a8849ee0939a8 100644
--- a/arm/flat.lds
+++ b/arm/flat.lds
@@ -3,6 +3,7 @@ SECTIONS
 {
     .text : { *(.init) *(.text) *(.text.*) }
     . = ALIGN(64K);
+    etext = .;
     .data : {
         exception_stacks = .;
         . += 64K;
diff --git a/config/config-arm-common.mak b/config/config-arm-common.mak
index b61a2a6044ab2..b01e9ab836b2d 100644
--- a/config/config-arm-common.mak
+++ b/config/config-arm-common.mak
@@ -33,6 +33,7 @@ cflatobjs += lib/virtio-mmio.o
 cflatobjs += lib/chr-testdev.o
 cflatobjs += lib/arm/io.o
 cflatobjs += lib/arm/setup.o
+cflatobjs += lib/arm/mmu.o
 
 libeabi = lib/arm/libeabi.a
 eabiobjs = lib/arm/eabi_compat.o
diff --git a/config/config-arm.mak b/config/config-arm.mak
index 96686fb639d2d..16e2cb5c103a3 100644
--- a/config/config-arm.mak
+++ b/config/config-arm.mak
@@ -15,7 +15,6 @@ CFLAGS += -mcpu=$(PROCESSOR)
 cstart.o = $(TEST_DIR)/cstart.o
 cflatobjs += lib/arm/spinlock.o
 cflatobjs += lib/arm/processor.o
-cflatobjs += lib/arm/mmu.o
 
 # arm specific tests
 tests =
diff --git a/lib/arm/asm/mmu.h b/lib/arm/asm/mmu.h
index 5ec7a6ce5886b..c1bd01c9ee1b9 100644
--- a/lib/arm/asm/mmu.h
+++ b/lib/arm/asm/mmu.h
@@ -9,6 +9,7 @@
 #include <asm/barrier.h>
 
 #define PTE_USER		L_PTE_USER
+#define PTE_RDONLY		PTE_AP2
 #define PTE_SHARED		L_PTE_SHARED
 #define PTE_AF			PTE_EXT_AF
 #define PTE_WBWA		L_PTE_MT_WRITEALLOC
diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index 55d18a10e1ebd..1c024538663ce 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -8,6 +8,8 @@
 #include <asm/setup.h>
 #include <asm/mmu.h>
 
+extern unsigned long etext;
+
 pgd_t *mmu_idmap;
 
 static bool mmu_on;
@@ -72,13 +74,19 @@ void mmu_enable_idmap(void)
 {
 	unsigned long phys_end = sizeof(long) == 8 || !(PHYS_END >> 32)
 						? PHYS_END : 0xfffff000;
+	unsigned long code_end = (unsigned long)&etext;
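As a cross-check of the MAIR setup in asm_mmu_enable, the register value can be computed in C. This sketch assumes the MAIR() macro shifts each 8-bit attribute left by eight times its index (as in Linux's arch/arm64/mm/proc.S) and takes the MT_* index values from the n column of the comment table; the function name mair_el1_value is made up for the example.

```c
#include <stdint.h>

/* AttrIndx values, taken from the "n" column of the table in the patch. */
enum {
	MT_DEVICE_nGnRnE = 0,
	MT_DEVICE_nGnRE  = 1,
	MT_DEVICE_GRE    = 2,
	MT_NORMAL_NC     = 3,
	MT_NORMAL        = 4,
};

/* Place an 8-bit attribute encoding in the byte selected by index 'mt'. */
#define MAIR(attr, mt) ((uint64_t)(attr) << ((mt) * 8))

/* Value that the assembly above would load into mair_el1. */
static uint64_t mair_el1_value(void)
{
	return MAIR(0x00, MT_DEVICE_nGnRnE) |
	       MAIR(0x04, MT_DEVICE_nGnRE) |
	       MAIR(0x0c, MT_DEVICE_GRE) |
	       MAIR(0x44, MT_NORMAL_NC) |
	       MAIR(0xff, MT_NORMAL);
}
```

Each page table entry then selects one of these bytes through its AttrIndx field, so, for instance, index 3 resolves to 0x44 (normal, non-cacheable).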
[PATCH 04/15] Makefile: cscope may need to look in lib/$ARCH
When $ARCH != $TEST_DIR we should look there too. This patch cheats
though and makes cscope always look there, but then gets rid of the
duplicates generated when $ARCH == $TEST_DIR.

Signed-off-by: Andrew Jones <drjo...@redhat.com>
---
 Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index dd7e6e94bfe7b..4f28f072ae3d7 100644
--- a/Makefile
+++ b/Makefile
@@ -82,6 +82,6 @@ distclean: clean libfdt_clean
 cscope: common_dirs = lib lib/libfdt lib/asm lib/asm-generic
 cscope:
 	$(RM) ./cscope.*
-	find -L $(TEST_DIR) lib/$(TEST_DIR) $(common_dirs) -maxdepth 1 \
-		-name '*.[chsS]' -print | sed 's,^\./,,' > ./cscope.files
+	find -L $(TEST_DIR) lib/$(TEST_DIR) lib/$(ARCH) $(common_dirs) -maxdepth 1 \
+		-name '*.[chsS]' -print | sed 's,^\./,,' | sort -u > ./cscope.files
 	cscope -bk
-- 
1.9.3
[PATCH 07/15] arm: selftest: rename svc mode to kernel mode
Separate the concepts of an 'svc', the syscall instruction present
on both arm and arm64, and 'svc mode', which is arm's kernel mode,
and doesn't exist on arm64. Kernel mode on arm64 is modeled with
exception level 1 (el1).

Signed-off-by: Andrew Jones <drjo...@redhat.com>
---
 arm/selftest.c    |  4 ++--
 arm/unittests.cfg | 12 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arm/selftest.c b/arm/selftest.c
index 0de794ea7d696..885a54fee0e4a 100644
--- a/arm/selftest.c
+++ b/arm/selftest.c
@@ -195,11 +195,11 @@ int main(int argc, char **argv)
 
 		check_setup(argc-1, &argv[1]);
 
-	} else if (strcmp(argv[0], "vectors-svc") == 0) {
+	} else if (strcmp(argv[0], "vectors-kernel") == 0) {
 
 		check_vectors(NULL);
 
-	} else if (strcmp(argv[0], "vectors-usr") == 0) {
+	} else if (strcmp(argv[0], "vectors-user") == 0) {
 
 		void *sp = memalign(PAGE_SIZE, PAGE_SIZE);
 		memset(sp, 0, PAGE_SIZE);
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index 57f5f90f3e808..efcca6bf24af6 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -17,14 +17,14 @@ smp = 1
 extra_params = -m 256 -append 'setup smp=1 mem=256'
 groups = selftest
 
-# Test vector setup and exception handling (svc mode).
-[selftest::vectors-svc]
+# Test vector setup and exception handling (kernel mode).
+[selftest::vectors-kernel]
 file = selftest.flat
-extra_params = -append 'vectors-svc'
+extra_params = -append 'vectors-kernel'
 groups = selftest
 
-# Test vector setup and exception handling (usr mode).
-[selftest::vectors-usr]
+# Test vector setup and exception handling (user mode).
+[selftest::vectors-user]
 file = selftest.flat
-extra_params = -append 'vectors-usr'
+extra_params = -append 'vectors-user'
 groups = selftest
-- 
1.9.3
[PATCH 09/15] arm64: vectors support
Signed-off-by: Andrew Jones drjo...@redhat.com --- arm/cstart64.S| 142 ++- arm/selftest.c| 129 --- arm/unittests.cfg | 2 - config/config-arm64.mak | 1 + lib/arm64/asm-offsets.c | 16 + lib/arm64/asm/esr.h | 43 lib/arm64/asm/processor.h | 52 ++ lib/arm64/asm/ptrace.h| 95 ++ lib/arm64/processor.c | 168 ++ 9 files changed, 637 insertions(+), 11 deletions(-) create mode 100644 lib/arm64/asm/esr.h create mode 100644 lib/arm64/asm/processor.h create mode 100644 lib/arm64/asm/ptrace.h create mode 100644 lib/arm64/processor.c diff --git a/arm/cstart64.S b/arm/cstart64.S index 1d98066d0e187..d1860a94fb2d3 100644 --- a/arm/cstart64.S +++ b/arm/cstart64.S @@ -7,6 +7,7 @@ */ #define __ASSEMBLY__ #include asm/asm-offsets.h +#include asm/ptrace.h .section .init @@ -26,7 +27,7 @@ start: msr cpacr_el1, x0 /* set up exception handling */ -// bl exceptions_init + bl exceptions_init /* complete setup */ ldp x0, x1, [sp], #16 @@ -40,9 +41,148 @@ start: bl exit b halt +exceptions_init: + adr x0, vector_table + msr vbar_el1, x0 + isb + ret + .text .globl halt halt: 1: wfi b 1b + +/* + * Vectors + * Adapted from arch/arm64/kernel/entry.S + */ +.macro vector_stub, name, vec +\name: + stp x0, x1, [sp, #-S_FRAME_SIZE]! 
+ stp x2, x3, [sp, #16] + stp x4, x5, [sp, #32] + stp x6, x7, [sp, #48] + stp x8, x9, [sp, #64] + stp x10, x11, [sp, #80] + stp x12, x13, [sp, #96] + stp x14, x15, [sp, #112] + stp x16, x17, [sp, #128] + stp x18, x19, [sp, #144] + stp x20, x21, [sp, #160] + stp x22, x23, [sp, #176] + stp x24, x25, [sp, #192] + stp x26, x27, [sp, #208] + stp x28, x29, [sp, #224] + + str x30, [sp, #S_LR] + + .if \vec = 8 + mrs x1, sp_el0 + .else + add x1, sp, #S_FRAME_SIZE + .endif + str x1, [sp, #S_SP] + + mrs x1, elr_el1 + mrs x2, spsr_el1 + stp x1, x2, [sp, #S_PC] + + and x2, x2, #PSR_MODE_MASK + cmp x2, #PSR_MODE_EL0t + b.ne1f + adr x2, user_mode + str xzr, [x2] /* we're in kernel mode now */ + +1: mov x0, \vec + mov x1, sp + mrs x2, esr_el1 + bl do_handle_exception + + ldp x1, x2, [sp, #S_PC] + msr spsr_el1, x2 + msr elr_el1, x1 + + and x2, x2, #PSR_MODE_MASK + cmp x2, #PSR_MODE_EL0t + b.ne1f + adr x2, user_mode + mov x1, #1 + str x1, [x2]/* we're going back to user mode */ + +1: + .if \vec = 8 + ldr x1, [sp, #S_SP] + msr sp_el0, x1 + .endif + + ldr x30, [sp, #S_LR] + + ldp x28, x29, [sp, #224] + ldp x26, x27, [sp, #208] + ldp x24, x25, [sp, #192] + ldp x22, x23, [sp, #176] + ldp x20, x21, [sp, #160] + ldp x18, x19, [sp, #144] + ldp x16, x17, [sp, #128] + ldp x14, x15, [sp, #112] + ldp x12, x13, [sp, #96] + ldp x10, x11, [sp, #80] + ldp x8, x9, [sp, #64] + ldp x6, x7, [sp, #48] + ldp x4, x5, [sp, #32] + ldp x2, x3, [sp, #16] + ldp x0, x1, [sp], #S_FRAME_SIZE + + eret +.endm + +vector_stubel1t_sync, 0 +vector_stubel1t_irq, 1 +vector_stubel1t_fiq, 2 +vector_stubel1t_error,3 + +vector_stubel1h_sync, 4 +vector_stubel1h_irq, 5 +vector_stubel1h_fiq, 6 +vector_stubel1h_error,7 + +vector_stubel0_sync_64, 8 +vector_stubel0_irq_64,9 +vector_stubel0_fiq_64, 10 +vector_stubel0_error_64, 11 + +vector_stubel0_sync_32, 12 +vector_stubel0_irq_32, 13 +vector_stubel0_fiq_32, 14 +vector_stubel0_error_32, 15 + +.section .text.ex + +.macro ventry, label +.align 7 + b \label +.endm + +.align 11 
+vector_table: + ventry el1t_sync // Synchronous EL1t + ventry el1t_irq// IRQ EL1t + ventry el1t_fiq// FIQ EL1t + ventry el1t_error // Error EL1t + + ventry el1h_sync // Synchronous EL1h + ventry el1h_irq// IRQ EL1h + ventry el1h_fiq// FIQ EL1h + ventry el1h_error // Error EL1h + + ventry el0_sync_64 // Synchronous 64-bit EL0 +
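The numbering used by the vector_stub invocations and the table layout produced by the two .align directives can be summarised in C. This is an explanatory sketch with made-up helper names; it assumes each ventry slot is 2^7 bytes (.align 7), the table is aligned to 2^11 bytes (.align 11), and that the sp_el0 save/restore path is meant for every vector taken from EL0, i.e. numbers 8 and up.

```c
#include <stddef.h>

#define VECTOR_ENTRY_SIZE	(1u << 7)	/* from ".align 7" in ventry */
#define VECTOR_TABLE_ALIGN	(1u << 11)	/* from ".align 11" before vector_table */
#define VECTOR_COUNT		16u

/* Where the exception came from: vec / 4 in the stub numbering. */
enum vector_source { SRC_EL1T, SRC_EL1H, SRC_EL0_64, SRC_EL0_32 };

/* What kind of exception it is: vec % 4 in the stub numbering. */
enum vector_kind { KIND_SYNC, KIND_IRQ, KIND_FIQ, KIND_ERROR };

/* Byte offset of vector 'vec' from the address written to vbar_el1. */
static size_t vector_offset(unsigned int vec)
{
	return (size_t)vec * VECTOR_ENTRY_SIZE;
}

static enum vector_source vector_source_of(unsigned int vec)
{
	return (enum vector_source)(vec / 4);
}

static enum vector_kind vector_kind_of(unsigned int vec)
{
	return (enum vector_kind)(vec % 4);
}

/* Nonzero when the stub has to save and restore sp_el0 (assumed: vec >= 8). */
static int vector_from_el0(unsigned int vec)
{
	return vec >= 8;
}
```

Sixteen 128-byte slots fill exactly 2048 bytes, which is why the table alignment and the entry size fit together.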
[PATCH 13/15] arm64: import some Linux page table API
Signed-off-by: Andrew Jones drjo...@redhat.com --- lib/arm64/asm/page.h | 66 +++- lib/arm64/asm/pgtable-hwdef.h | 136 ++ lib/arm64/asm/pgtable.h | 69 + 3 files changed, 270 insertions(+), 1 deletion(-) create mode 100644 lib/arm64/asm/pgtable-hwdef.h create mode 100644 lib/arm64/asm/pgtable.h diff --git a/lib/arm64/asm/page.h b/lib/arm64/asm/page.h index 395760cad5f82..29ad1f1f720c4 100644 --- a/lib/arm64/asm/page.h +++ b/lib/arm64/asm/page.h @@ -1 +1,65 @@ -#include ../../arm/asm/page.h +#ifndef _ASMARM64_PAGE_H_ +#define _ASMARM64_PAGE_H_ +/* + * Adapted from + * arch/arm64/include/asm/pgtable-types.h + * include/asm-generic/pgtable-nopud.h + * include/asm-generic/pgtable-nopmd.h + * + * Copyright (C) 2014, Red Hat Inc, Andrew Jones drjo...@redhat.com + * + * This work is licensed under the terms of the GNU LGPL, version 2. + */ + +#include const.h + +#define PGTABLE_LEVELS 2 +#define VA_BITS42 + +#define PAGE_SHIFT 16 +#define PAGE_SIZE (_AC(1,UL) PAGE_SHIFT) +#define PAGE_MASK (~(PAGE_SIZE-1)) + +#ifndef __ASSEMBLY__ + +#define PAGE_ALIGN(addr) ALIGN(addr, PAGE_SIZE) + +#include alloc.h + +typedef u64 pteval_t; +typedef u64 pmdval_t; +typedef u64 pudval_t; +typedef u64 pgdval_t; +typedef struct { pteval_t pte; } pte_t; +typedef struct { pgdval_t pgd; } pgd_t; +typedef struct { pteval_t pgprot; } pgprot_t; + +#define pte_val(x) ((x).pte) +#define pgd_val(x) ((x).pgd) +#define pgprot_val(x) ((x).pgprot) + +#define __pte(x) ((pte_t) { (x) } ) +#define __pgd(x) ((pgd_t) { (x) } ) +#define __pgprot(x)((pgprot_t) { (x) } ) + +typedef struct { pgd_t pgd; } pud_t; +#define pud_val(x) (pgd_val((x).pgd)) +#define __pud(x) ((pud_t) { __pgd(x) } ) + +typedef struct { pud_t pud; } pmd_t; +#define pmd_val(x) (pud_val((x).pud)) +#define __pmd(x) ((pmd_t) { __pud(x) } ) + +#ifndef __virt_to_phys +#define __phys_to_virt(x) ((unsigned long) (x)) +#define __virt_to_phys(x) (x) +#endif + +#define __va(x)((void *)__phys_to_virt((phys_addr_t)(x))) +#define 
__pa(x)__virt_to_phys((unsigned long)(x)) + +#define virt_to_pfn(kaddr) (__pa(kaddr) PAGE_SHIFT) +#define pfn_to_virt(pfn) __va((pfn) PAGE_SHIFT) + +#endif /* !__ASSEMBLY__ */ +#endif /* _ASMARM64_PAGE_H_ */ diff --git a/lib/arm64/asm/pgtable-hwdef.h b/lib/arm64/asm/pgtable-hwdef.h new file mode 100644 index 0..20ac9fa402987 --- /dev/null +++ b/lib/arm64/asm/pgtable-hwdef.h @@ -0,0 +1,136 @@ +#ifndef _ASMARM64_PGTABLE_HWDEF_H_ +#define _ASMARM64_PGTABLE_HWDEF_H_ +/* + * From arch/arm64/include/asm/pgtable-hwdef.h + * arch/arm64/include/asm/memory.h + */ +#define UL(x) _AC(x, UL) + +#define PTRS_PER_PTE (1 (PAGE_SHIFT - 3)) + +/* + * PGDIR_SHIFT determines the size a top-level page table entry can map + * (depending on the configuration, this level can be 0, 1 or 2). + */ +#define PGDIR_SHIFT((PAGE_SHIFT - 3) * PGTABLE_LEVELS + 3) +#define PGDIR_SIZE (_AC(1, UL) PGDIR_SHIFT) +#define PGDIR_MASK (~(PGDIR_SIZE-1)) +#define PTRS_PER_PGD (1 (VA_BITS - PGDIR_SHIFT)) + +/* From include/asm-generic/pgtable-nopud.h */ +#define PUD_SHIFT PGDIR_SHIFT +#define PTRS_PER_PUD 1 +#define PUD_SIZE (UL(1) PUD_SHIFT) +#define PUD_MASK (~(PUD_SIZE-1)) +/* From include/asm-generic/pgtable-nopmd.h */ +#define PMD_SHIFT PUD_SHIFT +#define PTRS_PER_PMD 1 +#define PMD_SIZE (UL(1) PMD_SHIFT) +#define PMD_MASK (~(PMD_SIZE-1)) + +/* + * Section address mask and size definitions. + */ +#define SECTION_SHIFT PMD_SHIFT +#define SECTION_SIZE (_AC(1, UL) SECTION_SHIFT) +#define SECTION_MASK (~(SECTION_SIZE-1)) + +/* + * Hardware page table definitions. + * + * Level 1 descriptor (PUD). + */ +#define PUD_TYPE_TABLE (_AT(pudval_t, 3) 0) +#define PUD_TABLE_BIT (_AT(pgdval_t, 1) 1) +#define PUD_TYPE_MASK (_AT(pgdval_t, 3) 0) +#define PUD_TYPE_SECT (_AT(pgdval_t, 1) 0) + +/* + * Level 2 descriptor (PMD). 
+ */ +#define PMD_TYPE_MASK (_AT(pmdval_t, 3) 0) +#define PMD_TYPE_FAULT (_AT(pmdval_t, 0) 0) +#define PMD_TYPE_TABLE (_AT(pmdval_t, 3) 0) +#define PMD_TYPE_SECT (_AT(pmdval_t, 1) 0) +#define PMD_TABLE_BIT (_AT(pmdval_t, 1) 1) + +/* + * Section + */ +#define PMD_SECT_VALID (_AT(pmdval_t, 1) 0) +#define PMD_SECT_PROT_NONE (_AT(pmdval_t, 1) 58) +#define PMD_SECT_USER (_AT(pmdval_t, 1) 6) /* AP[1] */ +#define PMD_SECT_RDONLY(_AT(pmdval_t, 1) 7) /* AP[2] */ +#define PMD_SECT_S
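The geometry implied by the constants in this patch (64k pages, two levels, 42-bit virtual addresses) can be checked with a few lines of C. The sketch below copies PGTABLE_LEVELS, VA_BITS and PAGE_SHIFT from the page.h hunk and assumes the shift operators in the derived macros are left shifts, as in the Linux headers they were adapted from; the two helper functions are illustrative additions.

```c
#include <stdint.h>

/* Values from the lib/arm64/asm/page.h hunk. */
#define PGTABLE_LEVELS	2
#define VA_BITS		42
#define PAGE_SHIFT	16

/* Derived values; the shifts are assumed to be "<<". */
#define PAGE_SIZE	(UINT64_C(1) << PAGE_SHIFT)
#define PTRS_PER_PTE	(1 << (PAGE_SHIFT - 3))
#define PGDIR_SHIFT	((PAGE_SHIFT - 3) * PGTABLE_LEVELS + 3)
#define PGDIR_SIZE	(UINT64_C(1) << PGDIR_SHIFT)
#define PTRS_PER_PGD	(1 << (VA_BITS - PGDIR_SHIFT))

/* Bytes occupied by one table of 8-byte descriptors at each level. */
static uint64_t pte_table_bytes(void)
{
	return (uint64_t)PTRS_PER_PTE * 8;
}

static uint64_t pgd_table_bytes(void)
{
	return (uint64_t)PTRS_PER_PGD * 8;
}
```

With these numbers PGDIR_SHIFT is 29, so each top-level entry maps 512MB, and both the pgd and a pte table hold 8192 entries, which is exactly one 64k page each.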
[PATCH 11/15] arm: import more linux page table api
To use page level descriptors we need some pgd/pud/pmd/pte methods, and a few more flags defined. Signed-off-by: Andrew Jones drjo...@redhat.com --- lib/arm/asm/mmu.h | 16 +--- lib/arm/asm/pgtable-hwdef.h | 38 ++- lib/arm/asm/pgtable.h | 91 + 3 files changed, 129 insertions(+), 16 deletions(-) create mode 100644 lib/arm/asm/pgtable.h diff --git a/lib/arm/asm/mmu.h b/lib/arm/asm/mmu.h index 8090a1b554820..254c29f84fe6f 100644 --- a/lib/arm/asm/mmu.h +++ b/lib/arm/asm/mmu.h @@ -5,22 +5,8 @@ * * This work is licensed under the terms of the GNU LGPL, version 2. */ -#include asm/setup.h +#include asm/pgtable.h #include asm/barrier.h -#include alloc.h - -#define PTRS_PER_PGD 4 -#define PGDIR_SHIFT30 -#define PGDIR_SIZE (1UL PGDIR_SHIFT) -#define PGDIR_MASK (~((1 PGDIR_SHIFT) - 1)) - -#define pgd_free(pgd) free(pgd) -static inline pgd_t *pgd_alloc(void) -{ - pgd_t *pgd = memalign(L1_CACHE_BYTES, PTRS_PER_PGD * sizeof(pgd_t)); - memset(pgd, 0, PTRS_PER_PGD * sizeof(pgd_t)); - return pgd; -} static inline void local_flush_tlb_all(void) { diff --git a/lib/arm/asm/pgtable-hwdef.h b/lib/arm/asm/pgtable-hwdef.h index b6850f64b0f52..13a273d36e8fe 100644 --- a/lib/arm/asm/pgtable-hwdef.h +++ b/lib/arm/asm/pgtable-hwdef.h @@ -1,9 +1,45 @@ #ifndef _ASMARM_PGTABLE_HWDEF_H_ #define _ASMARM_PGTABLE_HWDEF_H_ /* - * From arch/arm/include/asm/pgtable-3level-hwdef.h + * From arch/arm/include/asm/pgtable-3level.h + * arch/arm/include/asm/pgtable-3level-hwdef.h */ +#define PTRS_PER_PGD 4 +#define PGDIR_SHIFT30 +#define PGDIR_SIZE (_AC(1,UL) PGDIR_SHIFT) +#define PGDIR_MASK (~((1 PGDIR_SHIFT) - 1)) + +#define PTRS_PER_PTE 512 +#define PTRS_PER_PMD 512 + +#define PMD_SHIFT 21 +#define PMD_SIZE (_AC(1,UL) PMD_SHIFT) +#define PMD_MASK (~((1 PMD_SHIFT) - 1)) + +#define L_PMD_SECT_VALID (_AT(pmdval_t, 1) 0) + +#define L_PTE_VALID(_AT(pteval_t, 1) 0) /* Valid */ +#define L_PTE_PRESENT (_AT(pteval_t, 3) 0) /* Present */ +#define L_PTE_USER (_AT(pteval_t, 1) 6) /* AP[1] */ +#define L_PTE_SHARED 
(_AT(pteval_t, 3) 8) /* SH[1:0], inner shareable */ +#define L_PTE_YOUNG(_AT(pteval_t, 1) 10)/* AF */ +#define L_PTE_XN (_AT(pteval_t, 1) 54)/* XN */ + +/* + * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers). + */ +#define L_PTE_MT_UNCACHED (_AT(pteval_t, 0) 2) /* strongly ordered */ +#define L_PTE_MT_BUFFERABLE(_AT(pteval_t, 1) 2) /* normal non-cacheable */ +#define L_PTE_MT_WRITETHROUGH (_AT(pteval_t, 2) 2) /* normal inner write-through */ +#define L_PTE_MT_WRITEBACK (_AT(pteval_t, 3) 2) /* normal inner write-back */ +#define L_PTE_MT_WRITEALLOC(_AT(pteval_t, 7) 2) /* normal inner write-alloc */ +#define L_PTE_MT_DEV_SHARED(_AT(pteval_t, 4) 2) /* device */ +#define L_PTE_MT_DEV_NONSHARED (_AT(pteval_t, 4) 2) /* device */ +#define L_PTE_MT_DEV_WC(_AT(pteval_t, 1) 2) /* normal non-cacheable */ +#define L_PTE_MT_DEV_CACHED(_AT(pteval_t, 3) 2) /* normal inner write-back */ +#define L_PTE_MT_MASK (_AT(pteval_t, 7) 2) + /* * Hardware page table definitions. * diff --git a/lib/arm/asm/pgtable.h b/lib/arm/asm/pgtable.h new file mode 100644 index 0..8a730f44e537b --- /dev/null +++ b/lib/arm/asm/pgtable.h @@ -0,0 +1,91 @@ +#ifndef _ASMARM_PGTABLE_H_ +#define _ASMARM_PGTABLE_H_ +/* + * Adapted from arch/arm/include/asm/pgtable.h + * arch/arm/include/asm/pgtable-3level.h + * arch/arm/include/asm/pgalloc.h + * include/asm-generic/pgtable-nopud.h + * + * Note: some Linux function APIs have been modified. Nothing crazy, + * but if a function took, for example, an mm_struct, then + * that was either removed or replaced. 
+ */ +#include alloc.h +#include asm/setup.h +#include asm/page.h +#include asm/pgtable-hwdef.h + +#define pgd_none(pgd) (!pgd_val(pgd)) +#define pud_none(pud) (!pud_val(pud)) +#define pmd_none(pmd) (!pmd_val(pmd)) +#define pte_none(pte) (!pte_val(pte)) + +#define pgd_index(addr) \ + (((addr) PGDIR_SHIFT) (PTRS_PER_PGD - 1)) +#define pgd_offset(pgtable, addr) ((pgtable) + pgd_index(addr)) + +#define pgd_free(pgd) free(pgd) +static inline pgd_t *pgd_alloc(void) +{ + pgd_t *pgd = memalign(L1_CACHE_BYTES, PTRS_PER_PGD * sizeof(pgd_t)); + memset(pgd, 0, PTRS_PER_PGD * sizeof(pgd_t)); + return pgd; +} + +#define pud_offset(pgd, addr) ((pud_t *)pgd) +#define pud_free(pud) +#define pud_alloc(pgd, addr) pud_offset(pgd, addr) + +static inline pmd_t *pud_page_vaddr(pud_t pud) +{ + return __va(pud_val(pud) PHYS_MASK (s32)PAGE_MASK); +} + +#define
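To make the pgd_index()/pgd_offset() pair concrete, here is a small standalone version using the arm (LPAE) constants from this patch: four top-level entries, each covering 1G. It assumes the operators lost from the macro are a right shift and a bitwise AND, as in the Linux original, and it uses a simplified pgd_t with a single 64-bit field.

```c
#include <stdint.h>

/* Constants from lib/arm/asm/pgtable-hwdef.h in this patch. */
#define PTRS_PER_PGD	4
#define PGDIR_SHIFT	30

/* Simplified descriptor type for the example. */
typedef struct { uint64_t pgd; } pgd_t;

/* Index of the top-level entry covering 'addr' (assumed ">>" and "&"). */
#define pgd_index(addr) \
	(((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))

/* Pointer to that entry within the table 'pgtable'. */
#define pgd_offset(pgtable, addr)	((pgtable) + pgd_index(addr))
```

For example, address 0x40000000 falls in entry 1 and anything from 0xc0000000 upward in entry 3; an entry whose value is still zero is what pgd_none() reports as empty.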
[PATCH 12/15] arm: prepare mmu code for arm64
* don't assume 1G PGDIR_SIZE or L1_CACHE_BYTES pgd alignment * use page level descriptors for non-I/O memory * apply new pgd/pud/pmd/pte methods * split mmu.h to share function declarations * use more generic flag names in mmu.c Signed-off-by: Andrew Jones drjo...@redhat.com --- lib/arm/asm/mmu-api.h | 14 +++ lib/arm/asm/mmu.h | 10 +--- lib/arm/asm/setup.h | 3 +++ lib/arm/mmu.c | 69 +-- 4 files changed, 74 insertions(+), 22 deletions(-) create mode 100644 lib/arm/asm/mmu-api.h diff --git a/lib/arm/asm/mmu-api.h b/lib/arm/asm/mmu-api.h new file mode 100644 index 0..f2511e3dc7dee --- /dev/null +++ b/lib/arm/asm/mmu-api.h @@ -0,0 +1,14 @@ +#ifndef __ASMARM_MMU_API_H_ +#define __ASMARM_MMU_API_H_ +extern pgd_t *mmu_idmap; +extern bool mmu_enabled(void); +extern void mmu_enable(pgd_t *pgtable); +extern void mmu_enable_idmap(void); +extern void mmu_init_io_sect(pgd_t *pgtable, unsigned long virt_offset); +extern void mmu_set_range_sect(pgd_t *pgtable, unsigned long virt_offset, + unsigned long phys_start, unsigned long phys_end, + pgprot_t prot); +extern void mmu_set_range_ptes(pgd_t *pgtable, unsigned long virt_offset, + unsigned long phys_start, unsigned long phys_end, + pgprot_t prot); +#endif diff --git a/lib/arm/asm/mmu.h b/lib/arm/asm/mmu.h index 254c29f84fe6f..5ec7a6ce5886b 100644 --- a/lib/arm/asm/mmu.h +++ b/lib/arm/asm/mmu.h @@ -8,6 +8,11 @@ #include asm/pgtable.h #include asm/barrier.h +#define PTE_USER L_PTE_USER +#define PTE_SHARED L_PTE_SHARED +#define PTE_AF PTE_EXT_AF +#define PTE_WBWA L_PTE_MT_WRITEALLOC + static inline void local_flush_tlb_all(void) { asm volatile(mcr p15, 0, %0, c8, c7, 0 :: r (0)); @@ -21,9 +26,6 @@ static inline void flush_tlb_all(void) local_flush_tlb_all(); } -extern bool mmu_enabled(void); -extern void mmu_enable(pgd_t *pgtable); -extern void mmu_enable_idmap(void); -extern void mmu_init_io_sect(pgd_t *pgtable); +#include asm/mmu-api.h #endif /* __ASMARM_MMU_H_ */ diff --git a/lib/arm/asm/setup.h b/lib/arm/asm/setup.h index 
450501cc6e8e3..02b668672fca4 100644 --- a/lib/arm/asm/setup.h +++ b/lib/arm/asm/setup.h @@ -17,6 +17,9 @@ extern phys_addr_t __phys_offset, __phys_end; #define PHYS_OFFSET(__phys_offset) #define PHYS_END (__phys_end) +/* mach-virt reserves the first 1G section for I/O */ +#define PHYS_IO_OFFSET (0UL) +#define PHYS_IO_END(1UL 30) #define L1_CACHE_SHIFT 6 #define L1_CACHE_BYTES (1 L1_CACHE_SHIFT) diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c index 7a975c6708de4..55d18a10e1ebd 100644 --- a/lib/arm/mmu.c +++ b/lib/arm/mmu.c @@ -8,9 +8,9 @@ #include asm/setup.h #include asm/mmu.h -static bool mmu_on; -static pgd_t idmap[PTRS_PER_PGD] __attribute__((aligned(L1_CACHE_BYTES))); +pgd_t *mmu_idmap; +static bool mmu_on; bool mmu_enabled(void) { return mmu_on; @@ -24,29 +24,62 @@ void mmu_enable(pgd_t *pgtable) mmu_on = true; } -void mmu_init_io_sect(pgd_t *pgtable) +void mmu_set_range_ptes(pgd_t *pgtable, unsigned long virt_offset, + unsigned long phys_start, unsigned long phys_end, + pgprot_t prot) { - /* -* mach-virt reserves the first 1G section for I/O -*/ - pgd_val(pgtable[0]) = PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_USER; - pgd_val(pgtable[0]) |= PMD_SECT_UNCACHED; + unsigned long vaddr = virt_offset PAGE_MASK; + unsigned long paddr = phys_start PAGE_MASK; + unsigned long virt_end = phys_end - paddr + vaddr; + + for (; vaddr virt_end; vaddr += PAGE_SIZE, paddr += PAGE_SIZE) { + pgd_t *pgd = pgd_offset(pgtable, vaddr); + pud_t *pud = pud_alloc(pgd, vaddr); + pmd_t *pmd = pmd_alloc(pud, vaddr); + pte_t *pte = pte_alloc(pmd, vaddr); + + pte_val(*pte) = paddr; + pte_val(*pte) |= PTE_TYPE_PAGE | PTE_AF | PTE_SHARED; + pte_val(*pte) |= pgprot_val(prot); + } +} + +void mmu_set_range_sect(pgd_t *pgtable, unsigned long virt_offset, + unsigned long phys_start, unsigned long phys_end, + pgprot_t prot) +{ + unsigned long vaddr = virt_offset PGDIR_MASK; + unsigned long paddr = phys_start PGDIR_MASK; + unsigned long virt_end = phys_end - paddr + vaddr; + + for (; vaddr virt_end; vaddr 
+= PGDIR_SIZE, paddr += PGDIR_SIZE) { + pgd_t *pgd = pgd_offset(pgtable, vaddr); + pgd_val(*pgd) = paddr; + pgd_val(*pgd) |= PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S; + pgd_val(*pgd) |= pgprot_val(prot); + } +} + + +void mmu_init_io_sect(pgd_t *pgtable, unsigned long
[PATCH 01/15] arm: fix run script testdev probing
Using -kernel doesn't force qemu to exit immediately, and thus we hang when trying to run arm/run. Using -initrd works though.

Signed-off-by: Andrew Jones <drjo...@redhat.com>
---
 arm/run | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arm/run b/arm/run
index a714350225597..4c5e52525d687 100755
--- a/arm/run
+++ b/arm/run
@@ -26,7 +26,7 @@ if ! $qemu $M -device '?' 2>&1 | grep virtconsole > /dev/null; then
 	exit 2
 fi
 
-if $qemu $M -chardev testdev,id=id -kernel . 2>&1 \
+if $qemu $M -chardev testdev,id=id -initrd . 2>&1 \
 		| grep backend > /dev/null; then
 	echo "$qpath doesn't support chr-testdev. Exiting."
 	exit 2
-- 
1.9.3
[PATCH 03/15] arm: setup: fix type mismatch
Correct a type mismatch in the cpus initialization.

Signed-off-by: Andrew Jones <drjo...@redhat.com>
---
 lib/arm/setup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index 5fa37ca35f383..50ca4cb9ff99e 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -22,7 +22,7 @@ extern unsigned long stacktop;
 extern void io_init(void);
 extern void setup_args(const char *args);
 
-u32 cpus[NR_CPUS] = { [0 ... NR_CPUS-1] = (~0UL) };
+u32 cpus[NR_CPUS] = { [0 ... NR_CPUS-1] = (~0U) };
 int nr_cpus;
 
 phys_addr_t __phys_offset, __phys_end;
-- 
1.9.3
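The mismatch only becomes visible once unsigned long is wider than u32, which is the case on a 64-bit (LP64) arm64 build. A minimal sketch of the difference, using uint32_t as a stand-in for the tests' u32; the array name and size are illustrative and not taken from the patch:

```c
#include <stdint.h>

#define DEMO_NR_CPUS 4	/* illustrative; the patch uses NR_CPUS */

/* Same initializer style as the patch: a GNU C range designator with ~0U. */
static uint32_t demo_cpus[DEMO_NR_CPUS] = { [0 ... DEMO_NR_CPUS - 1] = (~0U) };

/* Returns 1 if ~0UL cannot be stored in a 32-bit slot without truncation. */
static int all_ones_ul_exceeds_u32(void)
{
	return (~0UL) > (unsigned long)UINT32_MAX;
}
```

On a 32-bit arm build the helper returns 0, since both constants are 32 bits wide. On an LP64 build it returns 1, which is the case where the old initializer relied on implicit truncation.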
[PATCH 14/15] arm64: prepare for 64k pages
This changes the layout for arm too, but that's fine. The only thing to keep in mind is that while arm64 will have a single 64k page for its stack, arm will have 16 4k pages. If the number of stack pages matters, then unit tests that want to work for both arm and arm64, may need to avoid using more than one page, even though the memory is there.

Signed-off-by: Andrew Jones <drjo...@redhat.com>
---
 arm/flat.lds | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arm/flat.lds b/arm/flat.lds
index ee9fc0ab79abc..89a55720d728f 100644
--- a/arm/flat.lds
+++ b/arm/flat.lds
@@ -2,10 +2,10 @@ SECTIONS
 {
     .text : { *(.init) *(.text) *(.text.*) }
-    . = ALIGN(4K);
+    . = ALIGN(64K);
     .data : {
         exception_stacks = .;
-        . += 4K;
+        . += 64K;
         exception_stacks_end = .;
         *(.data)
     }
@@ -13,10 +13,10 @@ SECTIONS
     .rodata : { *(.rodata) }
     . = ALIGN(16);
     .bss : { *(.bss) }
-    . = ALIGN(4K);
+    . = ALIGN(64K);
     edata = .;
-    . += 8K;
-    . = ALIGN(4K);
+    . += 64K;
+    . = ALIGN(64K);
     stacktop = .;
 }
-- 
1.9.3
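The page counts in the commit message follow directly from dividing the 64k stack by the page size. A small sketch of that arithmetic, together with the power-of-two rounding that ALIGN() performs in the linker script; the helper and macro names are made up for illustration:

```c
#include <stddef.h>

#define DEMO_SZ_4K	(4UL * 1024)
#define DEMO_SZ_64K	(64UL * 1024)

/* Number of whole pages that fit in a stack of stack_bytes. */
static size_t stack_pages(size_t stack_bytes, size_t page_size)
{
	return stack_bytes / page_size;
}

/* Round addr up to the next multiple of align; align must be a power of two. */
static unsigned long align_up(unsigned long addr, unsigned long align)
{
	return (addr + align - 1) & ~(align - 1);
}
```

With these helpers, a 64k stack is 16 pages with 4k pages and a single page with 64k pages, matching the arm and arm64 cases described above.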
[PATCH 10/15] arm: get PHYS_MASK from pgtable-hwdef.h
This allows it to be different for arm64, even with setup.h shared.

Signed-off-by: Andrew Jones <drjo...@redhat.com>
---
 lib/arm/asm/mmu.h           | 2 +-
 lib/arm/asm/page.h          | 5 ++---
 lib/arm/asm/pgtable-hwdef.h | 6 ++++++
 lib/arm/asm/setup.h         | 6 ++----
 lib/arm/mmu.c               | 1 -
 5 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/lib/arm/asm/mmu.h b/lib/arm/asm/mmu.h
index 1117aeaf06a57..8090a1b554820 100644
--- a/lib/arm/asm/mmu.h
+++ b/lib/arm/asm/mmu.h
@@ -5,7 +5,7 @@
  *
  * This work is licensed under the terms of the GNU LGPL, version 2.
  */
-#include <asm/page.h>
+#include <asm/setup.h>
 #include <asm/barrier.h>
 #include <alloc.h>
 
diff --git a/lib/arm/asm/page.h b/lib/arm/asm/page.h
index 304c80b9ddfd7..039e2ddfb8e0f 100644
--- a/lib/arm/asm/page.h
+++ b/lib/arm/asm/page.h
@@ -16,7 +16,7 @@
 
 #define PAGE_ALIGN(addr)	ALIGN(addr, PAGE_SIZE)
 
-#include <asm/setup.h>
+#include <alloc.h>
 
 typedef u64 pteval_t;
 typedef u64 pmdval_t;
@@ -51,6 +51,5 @@ typedef struct { pgd_t pgd; } pud_t;
 #define virt_to_pfn(kaddr)	(__pa(kaddr) >> PAGE_SHIFT)
 #define pfn_to_virt(pfn)	__va((pfn) << PAGE_SHIFT)
 
-#endif /* __ASSEMBLY__ */
-
+#endif /* !__ASSEMBLY__ */
 #endif /* _ASMARM_PAGE_H_ */
diff --git a/lib/arm/asm/pgtable-hwdef.h b/lib/arm/asm/pgtable-hwdef.h
index a2564aaca05a3..b6850f64b0f52 100644
--- a/lib/arm/asm/pgtable-hwdef.h
+++ b/lib/arm/asm/pgtable-hwdef.h
@@ -62,4 +62,10 @@
 #define PTE_EXT_NG		(_AT(pteval_t, 1) << 11)	/* nG */
 #define PTE_EXT_XN		(_AT(pteval_t, 1) << 54)	/* XN */
 
+/*
+ * 40-bit physical address supported.
+ */
+#define PHYS_MASK_SHIFT		(40)
+#define PHYS_MASK		((_AC(1, ULL) << PHYS_MASK_SHIFT) - 1)
+
 #endif /* _ASMARM_PGTABLE_HWDEF_H_ */
diff --git a/lib/arm/asm/setup.h b/lib/arm/asm/setup.h
index 3ef3b2c99a9de..450501cc6e8e3 100644
--- a/lib/arm/asm/setup.h
+++ b/lib/arm/asm/setup.h
@@ -6,7 +6,8 @@
  * This work is licensed under the terms of the GNU LGPL, version 2.
  */
 #include <libcflat.h>
-#include <alloc.h>
+#include <asm/page.h>
+#include <asm/pgtable-hwdef.h>
 
 #define NR_CPUS			8
 extern u32 cpus[NR_CPUS];
@@ -16,9 +17,6 @@ extern phys_addr_t __phys_offset, __phys_end;
 
 #define PHYS_OFFSET		(__phys_offset)
 #define PHYS_END		(__phys_end)
-#define PHYS_SHIFT		40
-#define PHYS_SIZE		(1ULL << PHYS_SHIFT)
-#define PHYS_MASK		(PHYS_SIZE - 1ULL)
 
 #define L1_CACHE_SHIFT		6
 #define L1_CACHE_BYTES		(1 << L1_CACHE_SHIFT)
diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index a42ae84bcec1f..7a975c6708de4 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -7,7 +7,6 @@
  */
 #include <asm/setup.h>
 #include <asm/mmu.h>
-#include <asm/pgtable-hwdef.h>
 
 static bool mmu_on;
 static pgd_t idmap[PTRS_PER_PGD] __attribute__((aligned(L1_CACHE_BYTES)));
-- 
1.9.3
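For reference, a 40-bit physical address mask works out to 0xffffffffff. The sketch below reproduces that arithmetic outside the test framework; the DEMO_ names and the helper are illustrative and plain 1ULL is used in place of the _AC() macro:

```c
#include <stdint.h>

#define DEMO_PHYS_MASK_SHIFT	40
#define DEMO_PHYS_MASK		((1ULL << DEMO_PHYS_MASK_SHIFT) - 1)

/* Keep only the bits of addr that fall inside the 40-bit physical address space. */
static uint64_t phys_addr_bits(uint64_t addr)
{
	return addr & DEMO_PHYS_MASK;
}
```

Any bits at position 40 and above are dropped, so an arch with a different physical address width only needs a different shift.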
[PATCH 05/15] arm: use absolute headers
Files in lib/arm including asm/someheader.h will get lib/arm/asm/someheader.h, not lib/asm/someheader.h. So we need to use instead of in order to prepare for headers of the same name, but for a different arch. We change all '#include's of all arm files, as consistency looks better. Signed-off-by: Andrew Jones drjo...@redhat.com --- arm/cstart.S | 6 +++--- arm/selftest.c| 14 +++--- lib/arm/asm-offsets.c | 4 ++-- lib/arm/asm/asm-offsets.h | 2 +- lib/arm/asm/io.h | 8 lib/arm/asm/mmu.h | 6 +++--- lib/arm/asm/page.h| 2 +- lib/arm/asm/processor.h | 2 +- lib/arm/asm/ptrace.h | 2 +- lib/arm/asm/setup.h | 4 ++-- lib/arm/eabi_compat.c | 2 +- lib/arm/io.c | 10 +- lib/arm/mmu.c | 6 +++--- lib/arm/processor.c | 6 +++--- lib/arm/setup.c | 14 +++--- lib/arm/spinlock.c| 8 16 files changed, 48 insertions(+), 48 deletions(-) diff --git a/arm/cstart.S b/arm/cstart.S index a1ccfb24bb4e0..1e3c3a32375fd 100644 --- a/arm/cstart.S +++ b/arm/cstart.S @@ -6,9 +6,9 @@ * This work is licensed under the terms of the GNU LGPL, version 2. */ #define __ASSEMBLY__ -#include asm/asm-offsets.h -#include asm/ptrace.h -#include asm/cp15.h +#include asm/asm-offsets.h +#include asm/ptrace.h +#include asm/cp15.h .arm diff --git a/arm/selftest.c b/arm/selftest.c index 0f70e1dcb3b0e..0de794ea7d696 100644 --- a/arm/selftest.c +++ b/arm/selftest.c @@ -5,13 +5,13 @@ * * This work is licensed under the terms of the GNU LGPL, version 2. 
*/ -#include libcflat.h -#include alloc.h -#include asm/setup.h -#include asm/ptrace.h -#include asm/asm-offsets.h -#include asm/processor.h -#include asm/page.h +#include libcflat.h +#include alloc.h +#include asm/setup.h +#include asm/ptrace.h +#include asm/asm-offsets.h +#include asm/processor.h +#include asm/page.h #define TESTGRP selftest diff --git a/lib/arm/asm-offsets.c b/lib/arm/asm-offsets.c index a9c349d2d427c..76380dfa15ab8 100644 --- a/lib/arm/asm-offsets.c +++ b/lib/arm/asm-offsets.c @@ -5,8 +5,8 @@ * * This work is licensed under the terms of the GNU LGPL, version 2. */ -#include libcflat.h -#include asm/ptrace.h +#include libcflat.h +#include asm/ptrace.h #define DEFINE(sym, val) \ asm volatile(\n- #sym %0 #val : : i (val)) diff --git a/lib/arm/asm/asm-offsets.h b/lib/arm/asm/asm-offsets.h index c2ff2ba6ec417..d370ee36a182b 100644 --- a/lib/arm/asm/asm-offsets.h +++ b/lib/arm/asm/asm-offsets.h @@ -1 +1 @@ -#include generated/asm-offsets.h +#include generated/asm-offsets.h diff --git a/lib/arm/asm/io.h b/lib/arm/asm/io.h index bbcbcd0542490..ba3b0b2412adb 100644 --- a/lib/arm/asm/io.h +++ b/lib/arm/asm/io.h @@ -1,8 +1,8 @@ #ifndef _ASMARM_IO_H_ #define _ASMARM_IO_H_ -#include libcflat.h -#include asm/barrier.h -#include asm/page.h +#include libcflat.h +#include asm/barrier.h +#include asm/page.h #define __iomem #define __force @@ -89,6 +89,6 @@ static inline void *phys_to_virt(phys_addr_t x) return (void *)__phys_to_virt(x); } -#include asm-generic/io.h +#include asm-generic/io.h #endif /* _ASMARM_IO_H_ */ diff --git a/lib/arm/asm/mmu.h b/lib/arm/asm/mmu.h index 451c7493c2aba..1117aeaf06a57 100644 --- a/lib/arm/asm/mmu.h +++ b/lib/arm/asm/mmu.h @@ -5,9 +5,9 @@ * * This work is licensed under the terms of the GNU LGPL, version 2. 
*/ -#include asm/page.h -#include asm/barrier.h -#include alloc.h +#include asm/page.h +#include asm/barrier.h +#include alloc.h #define PTRS_PER_PGD 4 #define PGDIR_SHIFT30 diff --git a/lib/arm/asm/page.h b/lib/arm/asm/page.h index 6ff849a0c0e3b..304c80b9ddfd7 100644 --- a/lib/arm/asm/page.h +++ b/lib/arm/asm/page.h @@ -6,7 +6,7 @@ * This work is licensed under the terms of the GNU LGPL, version 2. */ -#include const.h +#include const.h #define PAGE_SHIFT 12 #define PAGE_SIZE (_AC(1,UL) PAGE_SHIFT) diff --git a/lib/arm/asm/processor.h b/lib/arm/asm/processor.h index 883cab89622f7..a56f8d1fc9797 100644 --- a/lib/arm/asm/processor.h +++ b/lib/arm/asm/processor.h @@ -5,7 +5,7 @@ * * This work is licensed under the terms of the GNU LGPL, version 2. */ -#include ptrace.h +#include asm/ptrace.h enum vector { EXCPTN_RST, diff --git a/lib/arm/asm/ptrace.h b/lib/arm/asm/ptrace.h index 3a4c7532258f6..9ee71c760d22f 100644 --- a/lib/arm/asm/ptrace.h +++ b/lib/arm/asm/ptrace.h @@ -49,7 +49,7 @@ #define PSR_ENDIAN_MASK0x0200 /* Endianness state mask */ #ifndef __ASSEMBLY__ -#include libcflat.h +#include libcflat.h struct pt_regs { unsigned long uregs[18]; diff --git a/lib/arm/asm/setup.h b/lib/arm/asm/setup.h index 21445ef2085fc..3ef3b2c99a9de 100644 --- a/lib/arm/asm/setup.h +++ b/lib/arm/asm/setup.h @@ -5,8 +5,8 @@ * * This work is licensed under the terms of the GNU LGPL, version 2. */ -#include libcflat.h -#include alloc.h +#include
[PATCH 06/15] arm: setup: drop unused arguments
Drop the unused arguments from setup(), passing only the fdt. This allows setup() to be more easily shared with arm64.

Signed-off-by: Andrew Jones <drjo...@redhat.com>
---
 arm/cstart.S    | 12 ++++++++++--
 lib/arm/setup.c |  3 +--
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/arm/cstart.S b/arm/cstart.S
index 1e3c3a32375fd..da496e9eae7e0 100644
--- a/arm/cstart.S
+++ b/arm/cstart.S
@@ -19,15 +19,23 @@ start:
 	/*
 	 * bootloader params are in r0-r2
 	 * See the kernel doc Documentation/arm/Booting
+	 *   r0 = 0
+	 *   r1 = machine type number
+	 *   r2 = physical address of the dtb
+	 *
+	 * As we have no need for r0's nor r1's value, then
+	 * put the dtb in r0. This allows setup to be consistent
+	 * with arm64.
 	 */
 	ldr	sp, =stacktop
-	push	{r0-r3}
+	mov	r0, r2
+	push	{r0-r1}
 
 	/* set up vector table and mode stacks */
 	bl	exceptions_init
 
 	/* complete setup */
-	pop	{r0-r3}
+	pop	{r0-r1}
 	bl	setup
 
 	/* run the test */
diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index 9d2094da8a29c..8f58802e958ac 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -62,8 +62,7 @@ static void mem_init(phys_addr_t freemem_start)
 	mmu_enable_idmap();
 }
 
-void setup(unsigned long arg __unused, unsigned long id __unused,
-	   const void *fdt)
+void setup(const void *fdt)
 {
 	const char *bootargs;
 	u32 fdt_size;
-- 
1.9.3
[PATCH 02/15] virtio: don't use size_t
A size_t can have a different size when compiled as 64-bit vs. 32-bit. When unsigned int is what we want, then make sure unsigned int is what we use.

Signed-off-by: Andrew Jones <drjo...@redhat.com>
---
 lib/chr-testdev.c | 4 ++--
 lib/virtio.c      | 2 +-
 lib/virtio.h      | 3 ++-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/lib/chr-testdev.c b/lib/chr-testdev.c
index 0c9a173a04886..c19424fd44b20 100644
--- a/lib/chr-testdev.c
+++ b/lib/chr-testdev.c
@@ -13,7 +13,7 @@ static struct virtio_device *vcon;
 static struct virtqueue *in_vq, *out_vq;
 static struct spinlock lock;
 
-static void __testdev_send(char *buf, size_t len)
+static void __testdev_send(char *buf, unsigned int len)
 {
 	int ret;
 
@@ -29,8 +29,8 @@ static void __testdev_send(char *buf, size_t len)
 
 void chr_testdev_exit(int code)
 {
+	unsigned int len;
 	char buf[8];
-	int len;
 
 	snprintf(buf, sizeof(buf), "%dq", code);
 	len = strlen(buf);
diff --git a/lib/virtio.c b/lib/virtio.c
index cb496ff2eabd5..9532d1aeb1707 100644
--- a/lib/virtio.c
+++ b/lib/virtio.c
@@ -47,7 +47,7 @@ void vring_init_virtqueue(struct vring_virtqueue *vq, unsigned index,
 		vq->data[i] = NULL;
 }
 
-int virtqueue_add_outbuf(struct virtqueue *_vq, char *buf, size_t len)
+int virtqueue_add_outbuf(struct virtqueue *_vq, char *buf, unsigned int len)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
 	unsigned avail;
diff --git a/lib/virtio.h b/lib/virtio.h
index b51899ab998b6..4801e204a469d 100644
--- a/lib/virtio.h
+++ b/lib/virtio.h
@@ -139,7 +139,8 @@ extern void vring_init_virtqueue(struct vring_virtqueue *vq, unsigned index,
 				 bool (*notify)(struct virtqueue *),
 				 void (*callback)(struct virtqueue *),
 				 const char *name);
-extern int virtqueue_add_outbuf(struct virtqueue *vq, char *buf, size_t len);
+extern int virtqueue_add_outbuf(struct virtqueue *vq, char *buf,
+				unsigned int len);
 extern bool virtqueue_kick(struct virtqueue *vq);
 extern void detach_buf(struct vring_virtqueue *vq, unsigned head);
 extern void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len);
-- 
1.9.3
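When a length does start life as a size_t, for example from strlen(), the narrowing to unsigned int can be made explicit. A minimal sketch; the helper name is hypothetical and the saturation is a choice made for this example, not something the patch does:

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: convert strlen()'s size_t result to unsigned int,
 * saturating at UINT_MAX instead of silently wrapping on 64-bit builds.
 */
static unsigned int len_as_uint(const char *s)
{
	size_t n = strlen(s);

	return n > UINT_MAX ? UINT_MAX : (unsigned int)n;
}
```

For the short buffers used by chr-testdev the saturation branch can never trigger; it only documents where the width changes.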
kvm-unit-tests: add tscdeadline-latency test
Add a test that measures the latency between the programmed TSC deadline and the delivery of the timer interrupt to the guest.

Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

Index: kvm-unit-tests/config/config-x86-common.mak
===================================================================
--- kvm-unit-tests.orig/config/config-x86-common.mak	2014-06-27 13:43:43.694257143 -0300
+++ kvm-unit-tests/config/config-x86-common.mak	2014-12-10 16:10:41.715339378 -0200
@@ -69,6 +69,8 @@
 
 $(TEST_DIR)/apic.elf: $(cstart.o) $(TEST_DIR)/apic.o
 
+$(TEST_DIR)/tscdeadline-latency.elf: $(cstart.o) $(TEST_DIR)/tscdeadline-latency.o
+
 $(TEST_DIR)/init.elf: $(cstart.o) $(TEST_DIR)/init.o
 
 $(TEST_DIR)/realmode.elf: $(TEST_DIR)/realmode.o
Index: kvm-unit-tests/config/config-x86_64.mak
===================================================================
--- kvm-unit-tests.orig/config/config-x86_64.mak	2014-12-10 16:03:20.609681443 -0200
+++ kvm-unit-tests/config/config-x86_64.mak	2014-12-10 16:10:25.172352577 -0200
@@ -9,5 +9,6 @@
 	  $(TEST_DIR)/pcid.flat $(TEST_DIR)/debug.flat
 tests += $(TEST_DIR)/svm.flat
 tests += $(TEST_DIR)/vmx.flat
+tests += $(TEST_DIR)/tscdeadline-latency.flat
 
 include config/config-x86-common.mak
Index: kvm-unit-tests/x86/tscdeadline-latency.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ kvm-unit-tests/x86/tscdeadline-latency.c	2014-12-10 18:21:38.151253344 -0200
@@ -0,0 +1,110 @@
+/*
+ * qemu command line | grep latency | cut -f 2 -d ":" > latency
+ *
+ * In octave:
+ * load latency
+ * min(list)
+ * max(list)
+ * mean(list)
+ * hist(latency, 50)
+ */
+
+#include "libcflat.h"
+#include "apic.h"
+#include "vm.h"
+#include "smp.h"
+#include "desc.h"
+#include "isr.h"
+#include "msr.h"
+
+static void test_lapic_existence(void)
+{
+    u32 lvr;
+
+    lvr = apic_read(APIC_LVR);
+    printf("apic version: %x\n", lvr);
+    report("apic existence", (u16)lvr == 0x14);
+}
+
+#define TSC_DEADLINE_TIMER_MODE (2 << 17)
+#define TSC_DEADLINE_TIMER_VECTOR 0xef
+#define MSR_IA32_TSC            0x00000010
+#define MSR_IA32_TSCDEADLINE    0x000006e0
+
+static int tdt_count;
+u64 exptime;
+int delta;
+#define TABLE_SIZE 10000
+u64 table[TABLE_SIZE];
+volatile int table_idx;
+
+static void tsc_deadline_timer_isr(isr_regs_t *regs)
+{
+    u64 now = rdtsc();
+    ++tdt_count;
+
+    if (table_idx < TABLE_SIZE && tdt_count > 1)
+        table[table_idx++] = now - exptime;
+
+    exptime = now+delta;
+    wrmsr(MSR_IA32_TSCDEADLINE, now+delta);
+    apic_write(APIC_EOI, 0);
+}
+
+static void start_tsc_deadline_timer(void)
+{
+    handle_irq(TSC_DEADLINE_TIMER_VECTOR, tsc_deadline_timer_isr);
+    irq_enable();
+
+    wrmsr(MSR_IA32_TSCDEADLINE, rdmsr(MSR_IA32_TSC)+delta);
+    asm volatile ("nop");
+}
+
+static int enable_tsc_deadline_timer(void)
+{
+    uint32_t lvtt;
+
+    if (cpuid(1).c & (1 << 24)) {
+        lvtt = TSC_DEADLINE_TIMER_MODE | TSC_DEADLINE_TIMER_VECTOR;
+        apic_write(APIC_LVTT, lvtt);
+        start_tsc_deadline_timer();
+        return 1;
+    } else {
+        return 0;
+    }
+}
+
+static void test_tsc_deadline_timer(void)
+{
+    if(enable_tsc_deadline_timer()) {
+        printf("tsc deadline timer enabled\n");
+    } else {
+        printf("tsc deadline timer not detected\n");
+    }
+}
+
+int main()
+{
+    int i;
+
+    setup_vm();
+    smp_init();
+    setup_idt();
+
+    test_lapic_existence();
+
+    mask_pic_interrupts();
+
+    delta = 200000;
+    test_tsc_deadline_timer();
+    irq_enable();
+
+    do {
+        asm volatile("hlt");
+    } while (table_idx < TABLE_SIZE);
+
+    for (i = 0; i < TABLE_SIZE; i++)
+        printf("latency: %d\n", table[i]);
+
+    return report_summary();
+}
Index: kvm-unit-tests/x86/unittests.cfg
===================================================================
--- kvm-unit-tests.orig/x86/unittests.cfg	2014-12-10 16:03:20.616681437 -0200
+++ kvm-unit-tests/x86/unittests.cfg	2014-12-10 16:15:23.145114609 -0200
@@ -161,3 +161,8 @@
 [debug]
 file = debug.flat
 arch = x86_64
+
+[tscdeadline_latency]
+file = tscdeadline_latency.flat
+extra_params = -cpu qemu64,+tsc-deadline
+arch = x86_64
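The header comment of the test suggests post-processing the printed latencies in Octave (min, max, mean). The same summary can be computed in C; the sketch below is a host-side illustration only, and the struct and function names are invented for it rather than taken from kvm-unit-tests:

```c
#include <stddef.h>
#include <stdint.h>

struct latency_stats {
	uint64_t min;
	uint64_t max;
	double mean;
};

/* Summarise n latency samples (in TSC ticks); n must be at least 1. */
static struct latency_stats summarize_latency(const uint64_t *samples, size_t n)
{
	struct latency_stats s = { samples[0], samples[0], 0.0 };
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (samples[i] < s.min)
			s.min = samples[i];
		if (samples[i] > s.max)
			s.max = samples[i];
		sum += (double)samples[i];
	}
	s.mean = sum / (double)n;
	return s;
}
```

The samples would be the values the test prints after "latency:", each being the TSC reading in the interrupt handler minus the deadline that was programmed.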
[PATCH 08/15] arm64: initial drop
This is the initial drop of the arm64 test framework and a first test that just checks that setup completed (a selftest). kvm isn't needed to run this test unless testing with smp 1. Try it out with yum install gcc-aarch64-linux-gnu ./configure --cross-prefix=aarch64-linux-gnu- --arch=arm64 make QEMU=[qemu with aarch64, mach-virt, and chr-testdev] ./run_tests.sh Signed-off-by: Andrew Jones drjo...@redhat.com --- arm/cstart64.S | 48 + arm/run | 10 -- arm/selftest.c | 6 arm/unittests.cfg| 2 ++ config/config-arm-common.mak | 68 +++ config/config-arm.mak| 75 +-- config/config-arm64.mak | 20 +++ configure| 12 ++- lib/arm/asm-offsets.c| 7 +--- lib/arm64/.gitignore | 1 + lib/arm64/asm-offsets.c | 14 lib/arm64/asm/asm-offsets.h | 1 + lib/arm64/asm/barrier.h | 17 + lib/arm64/asm/io.h | 84 lib/arm64/asm/mmu.h | 18 ++ lib/arm64/asm/page.h | 1 + lib/arm64/asm/setup.h| 1 + lib/arm64/asm/spinlock.h | 15 lib/kbuild.h | 8 + 19 files changed, 333 insertions(+), 75 deletions(-) create mode 100644 arm/cstart64.S create mode 100644 config/config-arm-common.mak create mode 100644 config/config-arm64.mak create mode 100644 lib/arm64/.gitignore create mode 100644 lib/arm64/asm-offsets.c create mode 100644 lib/arm64/asm/asm-offsets.h create mode 100644 lib/arm64/asm/barrier.h create mode 100644 lib/arm64/asm/io.h create mode 100644 lib/arm64/asm/mmu.h create mode 100644 lib/arm64/asm/page.h create mode 100644 lib/arm64/asm/setup.h create mode 100644 lib/arm64/asm/spinlock.h create mode 100644 lib/kbuild.h diff --git a/arm/cstart64.S b/arm/cstart64.S new file mode 100644 index 0..1d98066d0e187 --- /dev/null +++ b/arm/cstart64.S @@ -0,0 +1,48 @@ +/* + * Boot entry point and assembler functions for aarch64 tests. + * + * Copyright (C) 2014, Red Hat Inc, Andrew Jones drjo...@redhat.com + * + * This work is licensed under the terms of the GNU LGPL, version 2. 
+ */ +#define __ASSEMBLY__ +#include asm/asm-offsets.h + +.section .init + +.globl start +start: + /* +* bootloader params are in x0-x3 +* The physical address of the dtb is in x0, x1-x3 are reserved +* See the kernel doc Documentation/arm64/booting.txt +*/ + adr x4, stacktop + mov sp, x4 + stp x0, x1, [sp, #-16]! + + /* Enable FP/ASIMD */ + mov x0, #(3 20) + msr cpacr_el1, x0 + + /* set up exception handling */ +// bl exceptions_init + + /* complete setup */ + ldp x0, x1, [sp], #16 + bl setup + + /* run the test */ + adr x0, __argc + ldr x0, [x0] + adr x1, __argv + bl main + bl exit + b halt + +.text + +.globl halt +halt: +1: wfi + b 1b diff --git a/arm/run b/arm/run index 4c5e52525d687..662a8564674a3 100755 --- a/arm/run +++ b/arm/run @@ -5,8 +5,9 @@ if [ ! -f config.mak ]; then exit 2 fi source config.mak +processor=$PROCESSOR -qemu=${QEMU:-qemu-system-arm} +qemu=${QEMU:-qemu-system-$ARCH_NAME} qpath=$(which $qemu 2/dev/null) if [ -z $qpath ]; then @@ -36,7 +37,12 @@ M='-machine virt,accel=kvm:tcg' chr_testdev='-device virtio-serial-device' chr_testdev+=' -device virtconsole,chardev=ctd -chardev testdev,id=ctd' -command=$qemu $M -cpu $PROCESSOR $chr_testdev +# arm64 must use '-cpu host' with kvm +if [ $(arch) = aarch64 ] [ $ARCH = arm64 ] [ -c /dev/kvm ]; then + processor=host +fi + +command=$qemu $M -cpu $processor $chr_testdev command+= -display none -serial stdio -kernel echo $command $@ diff --git a/arm/selftest.c b/arm/selftest.c index 885a54fee0e4a..30f44261d47db 100644 --- a/arm/selftest.c +++ b/arm/selftest.c @@ -8,10 +8,12 @@ #include libcflat.h #include alloc.h #include asm/setup.h +#ifdef __arm__ #include asm/ptrace.h #include asm/asm-offsets.h #include asm/processor.h #include asm/page.h +#endif #define TESTGRP selftest @@ -78,6 +80,7 @@ static void check_setup(int argc, char **argv) assert_args(nr_tests, 2); } +#ifdef __arm__ static struct pt_regs expected_regs; /* * Capture the current register state and execute an instruction @@ -184,6 +187,7 @@ 
static void check_vectors(void *arg __unused) report(%s, check_und() check_svc(), testname); exit(report_summary()); } +#endif int main(int argc, char **argv) { @@ -195,6 +199,7 @@ int main(int argc, char **argv) check_setup(argc-1, argv[1]); +#ifdef __arm__ } else if (strcmp(argv[0], vectors-kernel) == 0) { check_vectors(NULL); @@ -204,6 +209,7 @@
Re: [PATCH] x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit
On Wed, Dec 10, 2014 at 3:49 AM, Paolo Bonzini <pbonz...@redhat.com> wrote:
>
> On 06/12/2014 04:03, Andy Lutomirski wrote:
>> paravirt_enabled has the following effects:
>>
>>  - Disables the F00F bug workaround warning. There is no F00F bug
>>    workaround any more because Linux's standard IDT handling already
>>    works around the F00F bug, but the warning still exists. This
>>    is only cosmetic, and, in any event, there is no such thing as
>>    KVM on a CPU with the F00F bug.
>>
>>  - Disables 32-bit APM BIOS detection. On a KVM paravirt system,
>>    there should be no APM BIOS anyway.
>>
>>  - Disables tboot. I think that the tboot code should check the
>>    CPUID hypervisor bit directly if it matters.
>>
>>  - paravirt_enabled disables espfix32. espfix32 should *not* be
>>    disabled under KVM paravirt.
>>
>> The last point is the purpose of this patch. It fixes a leak of the
>> high 16 bits of the kernel stack address on 32-bit KVM paravirt
>> guests.
>>
>> While I'm at it, this removes pv_info setup from kvmclock. That
>> code seems to serve no purpose.
>
> kvmclock_init runs before kvm_guest_init, and this is a stable@ patch so
> for the sake of extra safety I've left the pv_info.name assignment in.
>
> Applied (locally for now), will be in 3.19.

In the interest of reduced future confusion, would it make sense to drop the duplicate initialization for 3.20?

--Andy
[patch 0/2] KVM: add option to advance tscdeadline hrtimer expiration (v3)
See patches for details.

v2:
- fix email address.

v3:
- use module parameter for configuration of value (Paolo/Radim)
[patch 1/2] KVM: x86: add method to test PIR bitmap vector
kvm_x86_ops->test_posted_interrupt() returns true/false depending on whether 'vector' is set in the PIR bitmap.

The next patch makes use of this interface.

Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

Index: kvm/arch/x86/include/asm/kvm_host.h
===================================================================
--- kvm.orig/arch/x86/include/asm/kvm_host.h
+++ kvm/arch/x86/include/asm/kvm_host.h
@@ -743,6 +743,7 @@ struct kvm_x86_ops {
 	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
 	void (*set_apic_access_page_addr)(struct kvm_vcpu *vcpu, hpa_t hpa);
 	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
+	bool (*test_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
 	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
 	int (*get_tdp_level)(void);
Index: kvm/arch/x86/kvm/vmx.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx.c
+++ kvm/arch/x86/kvm/vmx.c
@@ -435,6 +435,11 @@ static int pi_test_and_set_pir(int vecto
 	return test_and_set_bit(vector, (unsigned long *)pi_desc->pir);
 }
 
+static int pi_test_pir(int vector, struct pi_desc *pi_desc)
+{
+	return test_bit(vector, (unsigned long *)pi_desc->pir);
+}
+
 struct vcpu_vmx {
 	struct kvm_vcpu       vcpu;
 	unsigned long         host_rsp;
@@ -5939,6 +5944,7 @@ static __init int hardware_setup(void)
 	else {
 		kvm_x86_ops->hwapic_irr_update = NULL;
 		kvm_x86_ops->deliver_posted_interrupt = NULL;
+		kvm_x86_ops->test_posted_interrupt = NULL;
 		kvm_x86_ops->sync_pir_to_irr = vmx_sync_pir_to_irr_dummy;
 	}
 
@@ -6960,6 +6966,13 @@ static int handle_invvpid(struct kvm_vcp
 	return 1;
 }
 
+static bool vmx_test_pir(struct kvm_vcpu *vcpu, int vector)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	return pi_test_pir(vector, &vmx->pi_desc);
+}
+
 /*
  * The exit handlers return 1 if the exit was handled fully and guest execution
  * may resume.  Otherwise they set the kvm_run parameter to indicate what needs
@@ -9374,6 +9387,7 @@ static struct kvm_x86_ops vmx_x86_ops =
 	.hwapic_isr_update = vmx_hwapic_isr_update,
 	.sync_pir_to_irr = vmx_sync_pir_to_irr,
 	.deliver_posted_interrupt = vmx_deliver_posted_interrupt,
+	.test_posted_interrupt = vmx_test_pir,
 
 	.set_tss_addr = vmx_set_tss_addr,
 	.get_tdp_level = get_ept_level,
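What the new hook checks can be pictured as a plain bit test on a 256-bit bitmap with one bit per interrupt vector. The following userspace sketch assumes that layout; the demo_ names are invented, and unlike the kernel's test_bit()/test_and_set_bit() helpers these functions make no atomicity guarantees:

```c
#include <limits.h>
#include <stdbool.h>

#define DEMO_NR_VECTORS		256
#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_PIR_LONGS		(DEMO_NR_VECTORS / DEMO_BITS_PER_LONG)

/* Stand-in for the PIR field of the posted-interrupt descriptor. */
struct demo_pir {
	unsigned long bits[DEMO_PIR_LONGS];
};

/* Mark vector as pending (non-atomic, for illustration only). */
static void demo_set_vector(struct demo_pir *pir, unsigned int vector)
{
	pir->bits[vector / DEMO_BITS_PER_LONG] |= 1UL << (vector % DEMO_BITS_PER_LONG);
}

/* Return true if vector is currently marked pending. */
static bool demo_test_vector(const struct demo_pir *pir, unsigned int vector)
{
	return (pir->bits[vector / DEMO_BITS_PER_LONG] >> (vector % DEMO_BITS_PER_LONG)) & 1UL;
}
```

A caller would set a vector such as 0xef when posting the interrupt and later use the test function to see whether it is still pending, which is the question the follow-up patch asks before spinning.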
[patch 2/2] KVM: x86: add option to advance tscdeadline hrtimer expiration
For the hrtimer which emulates the tscdeadline timer in the guest, add an option to advance expiration, and busy spin on VM-entry waiting for the actual expiration time to elapse.

This allows achieving low latencies in cyclictest (or any scenario which requires strict timing regarding timer expiration).

Reduces cyclictest avg latency by 50%.

Note: this option requires tuning to find the appropriate value for a particular hardware/guest combination. One method is to measure the average delay between apic_timer_fn and VM-entry. Another method is to start with 1000ns, and increase the value in say 500ns increments until avg cyclictest numbers stop decreasing.

Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

Index: kvm/arch/x86/kvm/lapic.c
===================================================================
--- kvm.orig/arch/x86/kvm/lapic.c
+++ kvm/arch/x86/kvm/lapic.c
@@ -33,6 +33,7 @@
 #include <asm/page.h>
 #include <asm/current.h>
 #include <asm/apicdef.h>
+#include <asm/delay.h>
 #include <linux/atomic.h>
 #include <linux/jump_label.h>
 #include "kvm_cache_regs.h"
@@ -1073,6 +1074,7 @@ static void apic_timer_expired(struct kv
 {
 	struct kvm_vcpu *vcpu = apic->vcpu;
 	wait_queue_head_t *q = &vcpu->wq;
+	struct kvm_timer *ktimer = &apic->lapic_timer;
 
 	/*
 	 * Note: KVM_REQ_PENDING_TIMER is implicitly checked in
@@ -1087,11 +1089,58 @@ static void apic_timer_expired(struct kv
 
 	if (waitqueue_active(q))
 		wake_up_interruptible(q);
+
+	if (ktimer->timer_mode_mask == APIC_LVT_TIMER_TSCDEADLINE)
+		ktimer->expired_tscdeadline = ktimer->tscdeadline;
+}
+
+static bool lapic_timer_int_injected(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *apic = vcpu->arch.apic;
+	u32 reg = kvm_apic_get_reg(apic, APIC_LVTT);
+
+	if (kvm_apic_hw_enabled(apic)) {
+		int vec = reg & APIC_VECTOR_MASK;
+
+		if (kvm_x86_ops->test_posted_interrupt)
+			return kvm_x86_ops->test_posted_interrupt(vcpu, vec);
+		else {
+			if (apic_test_vector(vec, apic->regs + APIC_ISR))
+				return true;
+		}
+	}
+	return false;
+}
+
+void wait_lapic_expire(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *apic = vcpu->arch.apic;
+	u64 guest_tsc, tsc_deadline;
+
+	if (!kvm_vcpu_has_lapic(vcpu))
+		return;
+
+	if (!apic_lvtt_tscdeadline(apic))
+		return;
+
+	if (!lapic_timer_int_injected(vcpu))
+		return;
+
+	tsc_deadline = apic->lapic_timer.expired_tscdeadline;
+	guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, native_read_tsc());
+
+	while (guest_tsc < tsc_deadline) {
+		int delay = min(tsc_deadline - guest_tsc, 1000ULL);
+
+		ndelay(delay);
+		guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, native_read_tsc());
+	}
 }
 
 static void start_apic_timer(struct kvm_lapic *apic)
 {
 	ktime_t now;
+
 	atomic_set(&apic->lapic_timer.pending, 0);
 
 	if (apic_lvtt_period(apic) || apic_lvtt_oneshot(apic)) {
@@ -1137,6 +1186,7 @@ static void start_apic_timer(struct kvm_
 		/* lapic timer in tsc deadline mode */
 		u64 guest_tsc, tscdeadline = apic->lapic_timer.tscdeadline;
 		u64 ns = 0;
+		ktime_t expire;
 		struct kvm_vcpu *vcpu = apic->vcpu;
 		unsigned long this_tsc_khz = vcpu->arch.virtual_tsc_khz;
 		unsigned long flags;
@@ -1151,8 +1201,10 @@ static void start_apic_timer(struct kvm_
 		if (likely(tscdeadline > guest_tsc)) {
 			ns = (tscdeadline - guest_tsc) * 1000000ULL;
 			do_div(ns, this_tsc_khz);
+			expire = ktime_add_ns(now, ns);
+			expire = ktime_sub_ns(expire, lapic_timer_advance_ns);
 			hrtimer_start(&apic->lapic_timer.timer,
-				ktime_add_ns(now, ns), HRTIMER_MODE_ABS);
+				      expire, HRTIMER_MODE_ABS);
 		} else
 			apic_timer_expired(apic);
 
Index: kvm/arch/x86/kvm/lapic.h
===================================================================
--- kvm.orig/arch/x86/kvm/lapic.h
+++ kvm/arch/x86/kvm/lapic.h
@@ -14,6 +14,7 @@ struct kvm_timer {
 	u32 timer_mode;
 	u32 timer_mode_mask;
 	u64 tscdeadline;
+	u64 expired_tscdeadline;
 	atomic_t pending;			/* accumulated triggered timers */
 };
 
@@ -170,4 +171,6 @@ static inline bool kvm_apic_has_events(s
 
 bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
 
+void wait_lapic_expire(struct kvm_vcpu *vcpu);
+
 #endif
Index: kvm/arch/x86/kvm/x86.c
===================================================================
--- kvm.orig/arch/x86/kvm/x86.c
+++ kvm/arch/x86/kvm/x86.c
@@ -108,6 +108,10 @@ EXPORT_SYMBOL_GPL(kvm_max_guest_tsc_khz)
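The arithmetic behind the advanced expiration can be tried out in isolation. The sketch below converts a TSC tick delta to nanoseconds for a given TSC frequency in kHz and then pulls the expiry earlier by the advance value. The function names are illustrative, and the clamp to the current time is an assumption added for this example; the patch itself simply subtracts the advance with ktime_sub_ns():

```c
#include <stdint.h>

/* Convert a TSC tick delta to nanoseconds for a TSC running at tsc_khz. */
static uint64_t tsc_delta_to_ns(uint64_t ticks, uint64_t tsc_khz)
{
	return ticks * 1000000ULL / tsc_khz;
}

/*
 * Absolute expiry time in ns, pulled earlier by advance_ns.
 * Clamped so it never lands before now_ns (an addition of this sketch).
 */
static uint64_t advanced_expiry_ns(uint64_t now_ns, uint64_t delta_ns,
				   uint64_t advance_ns)
{
	if (advance_ns >= delta_ns)
		return now_ns;
	return now_ns + delta_ns - advance_ns;
}
```

For example, a deadline 3,000,000 ticks ahead on a 3 GHz TSC is 1 ms away; with an advance of 1500 ns the host timer would fire 1500 ns early and the remaining time would be spent busy-waiting before VM-entry.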
Re: [PATCH] x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit
> In the interest of reduced future confusion, would it make sense to
> drop the duplicate initialization for 3.20?

Yup. It would be great if possible to even unify the two init functions, but I haven't checked what happens in the middle.

Paolo
Re: [PATCH kvm-unit-tests] x86: test_conforming_switch misses es initialization
Applied, thanks.

Paolo

----- Original Message -----
> From: Nadav Amit na...@cs.technion.ac.il
> To: pbonz...@redhat.com
> Cc: kvm@vger.kernel.org, Nadav Amit na...@cs.technion.ac.il
> Sent: Sunday, December 7, 2014 10:39:01 AM
> Subject: [PATCH kvm-unit-tests] x86: test_conforming_switch misses es initialization
>
> test_conforming_switch in the taskswitch2 tests, miss es initialization.
> Fix it.
>
> Signed-off-by: Nadav Amit na...@cs.technion.ac.il
> ---
>  x86/taskswitch2.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/x86/taskswitch2.c b/x86/taskswitch2.c
> index f55843c..db3e41a 100644
> --- a/x86/taskswitch2.c
> +++ b/x86/taskswitch2.c
> @@ -271,7 +271,8 @@ void test_conforming_switch(void)
>  	tss_intr.cs = CONFORM_CS_SEL | 3;
>  	tss_intr.eip = (u32)user_tss;
> -	tss_intr.ds = tss_intr.gs = tss_intr.fs = tss_intr.ss = USER_DS;
> +	tss_intr.ss = USER_DS;
> +	tss_intr.ds = tss_intr.gs = tss_intr.es = tss_intr.fs = tss_intr.ss;
>  	tss_intr.eflags |= 3 << IOPL_SHIFT;
>
>  	set_gdt_entry(CONFORM_CS_SEL, 0, 0x, 0x9f, 0xc0);
>  	asm volatile("lcall $" xstr(TSS_INTR) ", $0xf4f4f4f4");
> --
> 1.9.1
Re: kvm-unit-tests: add tscdeadline-latency test
On 10/12/2014 21:23, Marcelo Tosatti wrote: To test latency between TSC deadline timer interrupt injection. Signed-off-by: Marcelo Tosatti mtosa...@redhat.com Index: kvm-unit-tests/config/config-x86-common.mak === --- kvm-unit-tests.orig/config/config-x86-common.mak 2014-06-27 13:43:43.694257143 -0300 +++ kvm-unit-tests/config/config-x86-common.mak 2014-12-10 16:10:41.715339378 -0200 @@ -69,6 +69,8 @@ $(TEST_DIR)/apic.elf: $(cstart.o) $(TEST_DIR)/apic.o +$(TEST_DIR)/tscdeadline-latency.elf: $(cstart.o) $(TEST_DIR)/tscdeadline-latency.o + $(TEST_DIR)/init.elf: $(cstart.o) $(TEST_DIR)/init.o $(TEST_DIR)/realmode.elf: $(TEST_DIR)/realmode.o Index: kvm-unit-tests/config/config-x86_64.mak === --- kvm-unit-tests.orig/config/config-x86_64.mak 2014-12-10 16:03:20.609681443 -0200 +++ kvm-unit-tests/config/config-x86_64.mak 2014-12-10 16:10:25.172352577 -0200 @@ -9,5 +9,6 @@ $(TEST_DIR)/pcid.flat $(TEST_DIR)/debug.flat tests += $(TEST_DIR)/svm.flat tests += $(TEST_DIR)/vmx.flat +tests += $(TEST_DIR)/tscdeadline-latency.flat include config/config-x86-common.mak Index: kvm-unit-tests/x86/tscdeadline-latency.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ kvm-unit-tests/x86/tscdeadline-latency.c 2014-12-10 18:21:38.151253344 -0200 @@ -0,0 +1,110 @@ +/* + * qemu command line | grep latency | cut -f 2 -d : latency + * + * In octave: + * load latency + * min(list) + * max(list) + * mean(list) + * hist(latency, 50) + */ + +#include libcflat.h +#include apic.h +#include vm.h +#include smp.h +#include desc.h +#include isr.h +#include msr.h + +static void test_lapic_existence(void) +{ +u32 lvr; + +lvr = apic_read(APIC_LVR); +printf(apic version: %x\n, lvr); +report(apic existence, (u16)lvr == 0x14); +} + +#define TSC_DEADLINE_TIMER_MODE (2 17) +#define TSC_DEADLINE_TIMER_VECTOR 0xef +#define MSR_IA32_TSC0x0010 +#define MSR_IA32_TSCDEADLINE0x06e0 + +static int tdt_count; +u64 exptime; +int delta; +#define TABLE_SIZE 1 +u64 table[TABLE_SIZE]; +volatile int table_idx; + +static void 
tsc_deadline_timer_isr(isr_regs_t *regs) +{ +u64 now = rdtsc(); +++tdt_count; + +if (table_idx TABLE_SIZE tdt_count 1) +table[table_idx++] = now - exptime; + +exptime = now+delta; +wrmsr(MSR_IA32_TSCDEADLINE, now+delta); +apic_write(APIC_EOI, 0); +} + +static void start_tsc_deadline_timer(void) +{ +handle_irq(TSC_DEADLINE_TIMER_VECTOR, tsc_deadline_timer_isr); +irq_enable(); + +wrmsr(MSR_IA32_TSCDEADLINE, rdmsr(MSR_IA32_TSC)+delta); +asm volatile (nop); +} + +static int enable_tsc_deadline_timer(void) +{ +uint32_t lvtt; + +if (cpuid(1).c (1 24)) { +lvtt = TSC_DEADLINE_TIMER_MODE | TSC_DEADLINE_TIMER_VECTOR; +apic_write(APIC_LVTT, lvtt); +start_tsc_deadline_timer(); +return 1; +} else { +return 0; +} +} + +static void test_tsc_deadline_timer(void) +{ +if(enable_tsc_deadline_timer()) { +printf(tsc deadline timer enabled\n); +} else { +printf(tsc deadline timer not detected\n); +} +} + +int main() +{ +int i; + +setup_vm(); +smp_init(); +setup_idt(); + +test_lapic_existence(); + +mask_pic_interrupts(); + +delta = 20; +test_tsc_deadline_timer(); +irq_enable(); + +do { +asm volatile(hlt); +} while (table_idx TABLE_SIZE); + +for (i = 0; i TABLE_SIZE; i++) +printf(latency: %d\n, table[i]); + +return report_summary(); +} Index: kvm-unit-tests/x86/unittests.cfg === --- kvm-unit-tests.orig/x86/unittests.cfg 2014-12-10 16:03:20.616681437 -0200 +++ kvm-unit-tests/x86/unittests.cfg 2014-12-10 16:15:23.145114609 -0200 @@ -161,3 +161,8 @@ [debug] file = debug.flat arch = x86_64 + +[tscdeadline_latency] +file = tscdeadline_latency.flat +extra_params = -cpu qemu64,+tsc-deadline +arch = x86_64 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Applied, thanks. Here is a script I use to run it: #! 
/bin/sh
time ./x86/run x86/tscdeadline-latency.flat -cpu host | sed -n 's/^latency: //p' > l.txt
time ./x86/run x86/tscdeadline-latency.flat -append '200 4000' -cpu host | sed -n 's/^latency: //p' > l2.txt
time ./x86/run x86/tscdeadline-latency.flat -append '400 2000' -cpu host | sed -n 's/^latency: //p' > l3.txt
gnuplot << \EOF
hist(x,width)=width*floor(x/width) + binwidth/2.0
binwidth=500
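The `hist()` helper in the gnuplot fragment maps each sample to the centre of a fixed-width bin. A rough C equivalent for non-negative integer latencies might look like the following; `hist_bin` is a hypothetical name, and the sketch assumes the bin width passed in is the same value as `binwidth`.

```c
#include <stdint.h>

/*
 * Map a sample to the centre of its bin, mirroring
 * width*floor(x/width) + width/2 for non-negative integers.
 */
static uint64_t hist_bin(uint64_t x, uint64_t width)
{
	return width * (x / width) + width / 2;
}
```

With a width of 500, every latency from 1000 to 1499 lands in the bin centred at 1250.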
Re: [patch 2/2] KVM: x86: add option to advance tscdeadline hrtimer expiration
On 10/12/2014 21:57, Marcelo Tosatti wrote: For the hrtimer which emulates the tscdeadline timer in the guest, add an option to advance expiration, and busy spin on VM-entry waiting for the actual expiration time to elapse. This allows achieving low latencies in cyclictest (or any scenario which requires strict timing regarding timer expiration). Reduces cyclictest avg latency by 50%. Note: this option requires tuning to find the appropriate value for a particular hardware/guest combination. One method is to measure the average delay between apic_timer_fn and VM-entry. Another method is to start with 1000ns, and increase the value in say 500ns increments until avg cyclictest numbers stop decreasing. What values are you using in practice for the parameter? Signed-off-by: Marcelo Tosatti mtosa...@redhat.com Index: kvm/arch/x86/kvm/lapic.c === --- kvm.orig/arch/x86/kvm/lapic.c +++ kvm/arch/x86/kvm/lapic.c @@ -33,6 +33,7 @@ #include asm/page.h #include asm/current.h #include asm/apicdef.h +#include asm/delay.h #include linux/atomic.h #include linux/jump_label.h #include kvm_cache_regs.h @@ -1073,6 +1074,7 @@ static void apic_timer_expired(struct kv { struct kvm_vcpu *vcpu = apic-vcpu; wait_queue_head_t *q = vcpu-wq; + struct kvm_timer *ktimer = apic-lapic_timer; /* * Note: KVM_REQ_PENDING_TIMER is implicitly checked in @@ -1087,11 +1089,58 @@ static void apic_timer_expired(struct kv if (waitqueue_active(q)) wake_up_interruptible(q); + + if (ktimer-timer_mode_mask == APIC_LVT_TIMER_TSCDEADLINE) + ktimer-expired_tscdeadline = ktimer-tscdeadline; +} + +static bool lapic_timer_int_injected(struct kvm_vcpu *vcpu) +{ + struct kvm_lapic *apic = vcpu-arch.apic; + u32 reg = kvm_apic_get_reg(apic, APIC_LVTT); + + if (kvm_apic_hw_enabled(apic)) { + int vec = reg APIC_VECTOR_MASK; + + if (kvm_x86_ops-test_posted_interrupt) + return kvm_x86_ops-test_posted_interrupt(vcpu, vec); + else { + if (apic_test_vector(vec, apic-regs + APIC_ISR)) + return true; + } One branch here is testing 
IRR, the other is testing ISR. I think testing ISR is right; on APICv, the above test will cause a busy wait during a higher-priority task (or during an interrupt service routine for the timer itself), just because the timer interrupt was delivered. So, on APICv, if the interrupt is in PIR but it has bits 7:4 = PPR[7:4], you have a problem. :( There is no APICv hook that lets you get a vmexit when the PPR becomes low enough. + } + return false; +} + +void wait_lapic_expire(struct kvm_vcpu *vcpu) +{ + struct kvm_lapic *apic = vcpu-arch.apic; + u64 guest_tsc, tsc_deadline; + + if (!kvm_vcpu_has_lapic(vcpu)) + return; + + if (!apic_lvtt_tscdeadline(apic)) + return; This test is wrong, I think. You need to check whether the timer interrupt was a TSC deadline interrupt. Instead, you are checking whether the current mode is TSC-deadline. This can be different if the interrupt could not be delivered immediately after it was received. This is easy to fix: replace the first two tests with apic-lapic_timer.expired_tscdeadline != 0 and... + if (!lapic_timer_int_injected(vcpu)) + return; + tsc_deadline = apic-lapic_timer.expired_tscdeadline; ... set apic-lapic_timer.expired_tscdeadline to 0 here. But I'm not sure how to solve the above problem with APICv. That's a pity. Knowing what values you use in practice for the parameter, would also make it easier to understand the problem. Please report that together with the graphs produced by the unit test you added. 
Paolo + guest_tsc = kvm_x86_ops-read_l1_tsc(vcpu, native_read_tsc()); + + while (guest_tsc tsc_deadline) { + int delay = min(tsc_deadline - guest_tsc, 1000ULL); + + ndelay(delay); + guest_tsc = kvm_x86_ops-read_l1_tsc(vcpu, native_read_tsc()); + } } static void start_apic_timer(struct kvm_lapic *apic) { ktime_t now; + atomic_set(apic-lapic_timer.pending, 0); if (apic_lvtt_period(apic) || apic_lvtt_oneshot(apic)) { @@ -1137,6 +1186,7 @@ static void start_apic_timer(struct kvm_ /* lapic timer in tsc deadline mode */ u64 guest_tsc, tscdeadline = apic-lapic_timer.tscdeadline; u64 ns = 0; + ktime_t expire; struct kvm_vcpu *vcpu = apic-vcpu; unsigned long this_tsc_khz = vcpu-arch.virtual_tsc_khz; unsigned long flags; @@ -1151,8 +1201,10 @@ static void start_apic_timer(struct kvm_ if (likely(tscdeadline guest_tsc)) { ns = (tscdeadline -
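The fix suggested in the review (record the deadline when the timer fires, then consume and clear it at VM-entry) can be sketched as follows. The struct and function names are hypothetical stand-ins rather than the KVM code, and the sketch deliberately ignores the APICv question raised in the review.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct kvm_timer. */
struct timer_state {
	bool     tscdeadline_mode;     /* LVTT is currently in TSC-deadline mode */
	uint64_t tscdeadline;          /* deadline programmed by the guest */
	uint64_t expired_tscdeadline;  /* latched at expiry, 0 if nothing pending */
};

/* Called when the emulated timer fires: latch the deadline that expired. */
static void timer_expired(struct timer_state *t)
{
	if (t->tscdeadline_mode)
		t->expired_tscdeadline = t->tscdeadline;
}

/*
 * Called at VM-entry. Returns the deadline to spin on, or 0 if there is
 * nothing to wait for. The latch is cleared once it has been consumed, so
 * a later mode change cannot affect an interrupt that already fired.
 */
static uint64_t take_expired_deadline(struct timer_state *t, bool injected)
{
	uint64_t deadline = t->expired_tscdeadline;

	if (deadline == 0 || !injected)
		return 0;
	t->expired_tscdeadline = 0;
	return deadline;
}
```

The decision now depends on what kind of timer interrupt actually expired, not on the mode the LVTT register happens to be in when the interrupt is finally delivered.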
RE: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Thursday, December 11, 2014 12:59 AM
>
> On 09/12/2014 03:49, Tian, Kevin wrote:
> > - Now we have XenGT/KVMGT separately maintained, and KVMGT lags behind
> >   XenGT regarding to features and qualities. Likely you'll continue see
> >   stale code (like Xen inst decoder) for some time. In the future we plan
> >   to maintain a single kernel repo for both, so KVMGT can share same
> >   quality as XenGT once KVM in-kernel dm framework is stable.
> >
> > - Regarding to Qemu hacks, KVMGT really doesn't have any different
> >   requirements as what have been discussed for GPU pass-through, e.g.
> >   about ISA bridge. Our implementation is based on an old Qemu repo, and
> >   honestly speaking not cleanly developed, because we know we can
> >   leverage from GPU pass-through support once it's in Qemu. At that time
> >   we'll leverage the same logic with minimal changes to hook KVMGT mgmt.
> >   APIs (e.g. create/destroy a vGPU instance). So we can ignore this area
> >   for now. :-)
>
> Could the virtual device model introduce new registers in order to avoid
> poking at the ISA bridge? I'm not sure that you can leverage from GPU
> pass-through support once it's in Qemu, since the Xen IGD passthrough
> support is being added to a separate machine that is specific to Xen IGD
> passthrough; no ISA bridge hacking will probably be allowed on the -M pc
> and -M q35 machine types.

My point is that KVMGT doesn't introduce new requirements as what's required in IGD passthrough case, because all the hacks you see now is to satisfy guest graphics driver's expectation. I haven't follow up the KVM IGD passthrough progress, but if it doesn't require ISA bridge hacking the same trick can be adopted by KVMGT too.

You may know Allen is working on driver changes to avoid causing those hacks in Qemu side. That effort will benefit us too. So I don't think this is a KVMGT specific issue, and we need a common solution to close this gap instead of hacking vGPU device model alone.

Thanks
Kevin
Re: [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM
On 11/12/2014 01:33, Tian, Kevin wrote:
> My point is that KVMGT doesn't introduce new requirements as what's
> required in IGD passthrough case, because all the hacks you see now
> is to satisfy guest graphics driver's expectation. I haven't follow
> up the KVM IGD passthrough progress, but if it doesn't require ISA
> bridge hacking the same trick can be adopted by KVMGT too.

Right now it did require ISA bridge hacking.

> You may know Allen is working on driver changes to avoid causing
> those hacks in Qemu side. That effort will benefit us too.

That's good to know, thanks!

Paolo
[PATCH] kvm: coalesced_mmio: remove one redundant check inside of coalesced_mmio_in_range()
We already check 'len' above to make sure it isn't negative, so
(addr + len < addr) should never happen.

Signed-off-by: Tiejun Chen tiejun.c...@intel.com
---
 virt/kvm/coalesced_mmio.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
index 00d8642..60f59cd 100644
--- a/virt/kvm/coalesced_mmio.c
+++ b/virt/kvm/coalesced_mmio.c
@@ -30,8 +30,6 @@ static int coalesced_mmio_in_range(struct kvm_coalesced_mmio_dev *dev,
 	 */
 	if (len < 0)
 		return 0;
-	if (addr + len < addr)
-		return 0;
 	if (addr < dev->zone.addr)
 		return 0;
 	if (addr + len > dev->zone.addr + dev->zone.size)
--
1.9.1
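For readers who want to poke at the containment test in isolation, here is a minimal user-space sketch of the checks that remain after this patch. `struct mmio_zone` and `zone_contains` are hypothetical names standing in for the kernel's zone structure and `coalesced_mmio_in_range()`; the types are assumptions of the sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the registered coalesced MMIO zone. */
struct mmio_zone {
	uint64_t addr;
	uint64_t size;
};

/* Return 1 if [addr, addr + len) lies fully inside the zone, else 0. */
static int zone_contains(const struct mmio_zone *zone, uint64_t addr, int len)
{
	if (len < 0)
		return 0;
	if (addr < zone->addr)
		return 0;
	if (addr + (uint64_t)len > zone->addr + zone->size)
		return 0;
	return 1;
}
```

The sketch does not model address wraparound; whether `addr + len` can wrap depends on what values callers can pass for `addr`, which is outside what this example shows.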
Re: [patch 2/2] KVM: x86: add option to advance tscdeadline hrtimer expiration
On Thu, Dec 11, 2014 at 12:37:57AM +0100, Paolo Bonzini wrote: On 10/12/2014 21:57, Marcelo Tosatti wrote: For the hrtimer which emulates the tscdeadline timer in the guest, add an option to advance expiration, and busy spin on VM-entry waiting for the actual expiration time to elapse. This allows achieving low latencies in cyclictest (or any scenario which requires strict timing regarding timer expiration). Reduces cyclictest avg latency by 50%. Note: this option requires tuning to find the appropriate value for a particular hardware/guest combination. One method is to measure the average delay between apic_timer_fn and VM-entry. Another method is to start with 1000ns, and increase the value in say 500ns increments until avg cyclictest numbers stop decreasing. What values are you using in practice for the parameter? 7us. Signed-off-by: Marcelo Tosatti mtosa...@redhat.com Index: kvm/arch/x86/kvm/lapic.c === --- kvm.orig/arch/x86/kvm/lapic.c +++ kvm/arch/x86/kvm/lapic.c @@ -33,6 +33,7 @@ #include asm/page.h #include asm/current.h #include asm/apicdef.h +#include asm/delay.h #include linux/atomic.h #include linux/jump_label.h #include kvm_cache_regs.h @@ -1073,6 +1074,7 @@ static void apic_timer_expired(struct kv { struct kvm_vcpu *vcpu = apic-vcpu; wait_queue_head_t *q = vcpu-wq; + struct kvm_timer *ktimer = apic-lapic_timer; /* * Note: KVM_REQ_PENDING_TIMER is implicitly checked in @@ -1087,11 +1089,58 @@ static void apic_timer_expired(struct kv if (waitqueue_active(q)) wake_up_interruptible(q); + + if (ktimer-timer_mode_mask == APIC_LVT_TIMER_TSCDEADLINE) + ktimer-expired_tscdeadline = ktimer-tscdeadline; +} + +static bool lapic_timer_int_injected(struct kvm_vcpu *vcpu) +{ + struct kvm_lapic *apic = vcpu-arch.apic; + u32 reg = kvm_apic_get_reg(apic, APIC_LVTT); + + if (kvm_apic_hw_enabled(apic)) { + int vec = reg APIC_VECTOR_MASK; + + if (kvm_x86_ops-test_posted_interrupt) + return kvm_x86_ops-test_posted_interrupt(vcpu, vec); + else { + if (apic_test_vector(vec, 
apic-regs + APIC_ISR)) + return true; + } One branch here is testing IRR, the other is testing ISR. I think testing ISR is right; on APICv, the above test will cause a busy wait during a higher-priority task (or during an interrupt service routine for the timer itself), just because the timer interrupt was delivered. Yes. So, on APICv, if the interrupt is in PIR but it has bits 7:4 = PPR[7:4], you have a problem. :( There is no APICv hook that lets you get a vmexit when the PPR becomes low enough. Well, you simply exit earlier and busy spin for VM-exit time. For Linux guests, there is no problem. + } + return false; +} + +void wait_lapic_expire(struct kvm_vcpu *vcpu) +{ + struct kvm_lapic *apic = vcpu-arch.apic; + u64 guest_tsc, tsc_deadline; + + if (!kvm_vcpu_has_lapic(vcpu)) + return; + + if (!apic_lvtt_tscdeadline(apic)) + return; This test is wrong, I think. You need to check whether the timer interrupt was a TSC deadline interrupt. Instead, you are checking whether the current mode is TSC-deadline. This can be different if the interrupt could not be delivered immediately after it was received. This is easy to fix: replace the first two tests with apic-lapic_timer.expired_tscdeadline != 0 and... Yes. + if (!lapic_timer_int_injected(vcpu)) + return; + tsc_deadline = apic-lapic_timer.expired_tscdeadline; ... set apic-lapic_timer.expired_tscdeadline to 0 here. But I'm not sure how to solve the above problem with APICv. That's a pity. Knowing what values you use in practice for the parameter, would also make it easier to understand the problem. Please report that together with the graphs produced by the unit test you added. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
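The second tuning method from the patch description (start at 1000ns and raise the value in 500ns steps until the average stops decreasing) could be automated along these lines. `measure_fn`, `tune_advance_ns` and `fake_measure` are hypothetical; the fake latency curve is invented purely to exercise the loop, and a real run would measure cyclictest averages in the guest.

```c
#include <stdint.h>

/* Hypothetical callback: average latency in ns observed for a given advance. */
typedef uint64_t (*measure_fn)(uint64_t advance_ns);

/*
 * Start at start_ns and raise the advance in step_ns increments while the
 * measured average keeps decreasing. Returns the last value that improved it.
 */
static uint64_t tune_advance_ns(measure_fn measure, uint64_t start_ns,
				uint64_t step_ns, uint64_t max_ns)
{
	uint64_t best = start_ns;
	uint64_t best_avg = measure(start_ns);
	uint64_t cur;

	for (cur = start_ns + step_ns; cur <= max_ns; cur += step_ns) {
		uint64_t avg = measure(cur);

		if (avg >= best_avg)
			break;
		best = cur;
		best_avg = avg;
	}
	return best;
}

/* Invented curve for demonstration only: improvement stops at 7000ns. */
static uint64_t fake_measure(uint64_t advance_ns)
{
	return advance_ns < 7000 ? 10000 - advance_ns : 3000;
}
```

Stopping at the first step that fails to improve keeps the advance as small as possible, which limits the time spent busy-spinning before VM-entry.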
Re: kvm-unit-tests: add tscdeadline-latency test
On Wed, Dec 10, 2014 at 10:49:52PM +0100, Paolo Bonzini wrote: On 10/12/2014 21:23, Marcelo Tosatti wrote: To test latency between TSC deadline timer interrupt injection. Signed-off-by: Marcelo Tosatti mtosa...@redhat.com Index: kvm-unit-tests/config/config-x86-common.mak === --- kvm-unit-tests.orig/config/config-x86-common.mak2014-06-27 13:43:43.694257143 -0300 +++ kvm-unit-tests/config/config-x86-common.mak 2014-12-10 16:10:41.715339378 -0200 @@ -69,6 +69,8 @@ $(TEST_DIR)/apic.elf: $(cstart.o) $(TEST_DIR)/apic.o +$(TEST_DIR)/tscdeadline-latency.elf: $(cstart.o) $(TEST_DIR)/tscdeadline-latency.o + $(TEST_DIR)/init.elf: $(cstart.o) $(TEST_DIR)/init.o $(TEST_DIR)/realmode.elf: $(TEST_DIR)/realmode.o Index: kvm-unit-tests/config/config-x86_64.mak === --- kvm-unit-tests.orig/config/config-x86_64.mak2014-12-10 16:03:20.609681443 -0200 +++ kvm-unit-tests/config/config-x86_64.mak 2014-12-10 16:10:25.172352577 -0200 @@ -9,5 +9,6 @@ $(TEST_DIR)/pcid.flat $(TEST_DIR)/debug.flat tests += $(TEST_DIR)/svm.flat tests += $(TEST_DIR)/vmx.flat +tests += $(TEST_DIR)/tscdeadline-latency.flat include config/config-x86-common.mak Index: kvm-unit-tests/x86/tscdeadline-latency.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ kvm-unit-tests/x86/tscdeadline-latency.c2014-12-10 18:21:38.151253344 -0200 @@ -0,0 +1,110 @@ +/* + * qemu command line | grep latency | cut -f 2 -d : latency + * + * In octave: + * load latency + * min(list) + * max(list) + * mean(list) + * hist(latency, 50) + */ + +#include libcflat.h +#include apic.h +#include vm.h +#include smp.h +#include desc.h +#include isr.h +#include msr.h + +static void test_lapic_existence(void) +{ +u32 lvr; + +lvr = apic_read(APIC_LVR); +printf(apic version: %x\n, lvr); +report(apic existence, (u16)lvr == 0x14); +} + +#define TSC_DEADLINE_TIMER_MODE (2 17) +#define TSC_DEADLINE_TIMER_VECTOR 0xef +#define MSR_IA32_TSC0x0010 +#define MSR_IA32_TSCDEADLINE0x06e0 + +static int tdt_count; +u64 exptime; +int delta; +#define TABLE_SIZE 1 +u64 
table[TABLE_SIZE]; +volatile int table_idx; + +static void tsc_deadline_timer_isr(isr_regs_t *regs) +{ +u64 now = rdtsc(); +++tdt_count; + +if (table_idx TABLE_SIZE tdt_count 1) +table[table_idx++] = now - exptime; + +exptime = now+delta; +wrmsr(MSR_IA32_TSCDEADLINE, now+delta); +apic_write(APIC_EOI, 0); +} + +static void start_tsc_deadline_timer(void) +{ +handle_irq(TSC_DEADLINE_TIMER_VECTOR, tsc_deadline_timer_isr); +irq_enable(); + +wrmsr(MSR_IA32_TSCDEADLINE, rdmsr(MSR_IA32_TSC)+delta); +asm volatile (nop); +} + +static int enable_tsc_deadline_timer(void) +{ +uint32_t lvtt; + +if (cpuid(1).c (1 24)) { +lvtt = TSC_DEADLINE_TIMER_MODE | TSC_DEADLINE_TIMER_VECTOR; +apic_write(APIC_LVTT, lvtt); +start_tsc_deadline_timer(); +return 1; +} else { +return 0; +} +} + +static void test_tsc_deadline_timer(void) +{ +if(enable_tsc_deadline_timer()) { +printf(tsc deadline timer enabled\n); +} else { +printf(tsc deadline timer not detected\n); +} +} + +int main() +{ +int i; + +setup_vm(); +smp_init(); +setup_idt(); + +test_lapic_existence(); + +mask_pic_interrupts(); + +delta = 20; +test_tsc_deadline_timer(); +irq_enable(); + +do { +asm volatile(hlt); +} while (table_idx TABLE_SIZE); + +for (i = 0; i TABLE_SIZE; i++) +printf(latency: %d\n, table[i]); + +return report_summary(); +} Index: kvm-unit-tests/x86/unittests.cfg === --- kvm-unit-tests.orig/x86/unittests.cfg 2014-12-10 16:03:20.616681437 -0200 +++ kvm-unit-tests/x86/unittests.cfg2014-12-10 16:15:23.145114609 -0200 @@ -161,3 +161,8 @@ [debug] file = debug.flat arch = x86_64 + +[tscdeadline_latency] +file = tscdeadline_latency.flat +extra_params = -cpu qemu64,+tsc-deadline +arch = x86_64 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Applied, thanks. Here is a script I use to run it: #! 
/bin/sh
time ./x86/run x86/tscdeadline-latency.flat -cpu host | sed -n 's/^latency: //p' > l.txt
time ./x86/run x86/tscdeadline-latency.flat
[PATCH v3 1/3] KVM: nVMX: Add nested msr load/restore algorithm
Several hypervisors need MSR auto load/restore feature. We read MSRs from VM-entry MSR load area which specified by L1, and load them via kvm_set_msr in the nested entry. When nested exit occurs, we get MSRs via kvm_get_msr, writing them to L1`s MSR store area. After this, we read MSRs from VM-exit MSR load area, and load them via kvm_set_msr. Signed-off-by: Wincy Van fanwenyi0...@gmail.com --- arch/x86/include/uapi/asm/vmx.h | 5 +++ arch/x86/kvm/vmx.c | 68 + arch/x86/kvm/x86.c | 1 + virt/kvm/kvm_main.c | 1 + 4 files changed, 75 insertions(+) diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h index b813bf9..ff2b8e2 100644 --- a/arch/x86/include/uapi/asm/vmx.h +++ b/arch/x86/include/uapi/asm/vmx.h @@ -56,6 +56,7 @@ #define EXIT_REASON_MSR_READ31 #define EXIT_REASON_MSR_WRITE 32 #define EXIT_REASON_INVALID_STATE 33 +#define EXIT_REASON_MSR_LOAD_FAIL 34 #define EXIT_REASON_MWAIT_INSTRUCTION 36 #define EXIT_REASON_MONITOR_INSTRUCTION 39 #define EXIT_REASON_PAUSE_INSTRUCTION 40 @@ -116,10 +117,14 @@ { EXIT_REASON_APIC_WRITE,APIC_WRITE }, \ { EXIT_REASON_EOI_INDUCED, EOI_INDUCED }, \ { EXIT_REASON_INVALID_STATE, INVALID_STATE }, \ + { EXIT_REASON_MSR_LOAD_FAIL, MSR_LOAD_FAIL }, \ { EXIT_REASON_INVD, INVD }, \ { EXIT_REASON_INVVPID, INVVPID }, \ { EXIT_REASON_INVPCID, INVPCID }, \ { EXIT_REASON_XSAVES,XSAVES }, \ { EXIT_REASON_XRSTORS, XRSTORS } +#define VMX_ABORT_SAVE_GUEST_MSR_FAIL1 +#define VMX_ABORT_LOAD_HOST_MSR_FAIL 4 + #endif /* _UAPIVMX_H */ diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 9bcc871..b49d198 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -6143,6 +6143,13 @@ static void nested_vmx_failValid(struct kvm_vcpu *vcpu, */ } +static void nested_vmx_abort(struct kvm_vcpu *vcpu, u32 indicator) +{ + /* TODO: not to reset guest simply here. 
*/ + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu); + pr_warn(kvm: nested vmx abort, indicator %d\n, indicator); +} + static enum hrtimer_restart vmx_preemption_timer_fn(struct hrtimer *timer) { struct vcpu_vmx *vmx = @@ -8286,6 +8293,67 @@ static void vmx_start_preemption_timer(struct kvm_vcpu *vcpu) ns_to_ktime(preemption_timeout), HRTIMER_MODE_REL); } +static inline int nested_vmx_msr_check_common(struct vmx_msr_entry *e) +{ + if (e-index 8 == 0x8 || e-reserved != 0) + return -EINVAL; + return 0; +} + +static inline int nested_vmx_load_msr_check(struct vmx_msr_entry *e) +{ + if (e-index == MSR_FS_BASE || + e-index == MSR_GS_BASE || + nested_vmx_msr_check_common(e)) + return -EINVAL; + return 0; +} + +/* + * Load guest's/host's msr at nested entry/exit. + * return 0 for success, entry index for failure. + */ +static u32 nested_vmx_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count) +{ + u32 i; + struct vmx_msr_entry e; + struct msr_data msr; + + msr.host_initiated = false; + for (i = 0; i count; i++) { + kvm_read_guest(vcpu-kvm, gpa + i * sizeof(e), e, sizeof(e)); + if (nested_vmx_load_msr_check(e)) + goto fail; + msr.index = e.index; + msr.data = e.value; + if (kvm_set_msr(vcpu, msr)) + goto fail; + } + return 0; +fail: + return i + 1; +} + +static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count) +{ + u32 i; + struct vmx_msr_entry e; + + for (i = 0; i count; i++) { + kvm_read_guest(vcpu-kvm, gpa + i * sizeof(e), + e, 2 * sizeof(u32)); + if (nested_vmx_msr_check_common(e)) + return -EINVAL; + if (kvm_get_msr(vcpu, e.index, e.value)) + return -EINVAL; + kvm_write_guest(vcpu-kvm, + gpa + i * sizeof(e) + + offsetof(struct vmx_msr_entry, value), + e.value, sizeof(e.value)); + } + return 0; +} + /* * prepare_vmcs02 is called when the L1 guest hypervisor runs its nested * L2 guest. 
L1 has a vmcs for L2 (vmcs12), and this function merges it diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c259814..af9faed 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2324,6 +2324,7 @@ int kvm_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata) { return kvm_x86_ops-get_msr(vcpu, msr_index, pdata); } +EXPORT_SYMBOL_GPL(kvm_get_msr); static int get_msr_mtrr(struct kvm_vcpu *vcpu, u32 msr,
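The load algorithm described in this patch can be tried out in user space with a small model. In the sketch below the MSR area is an ordinary array instead of guest memory, and `set_msr_fn`, `load_msr_area` and `accept_all` are hypothetical names; only the entry layout, the per-entry checks and the "0 on success, failing entry number otherwise" convention are taken from the patch.

```c
#include <stdint.h>

#define MSR_FS_BASE 0xc0000100u
#define MSR_GS_BASE 0xc0000101u

/* One 16-byte entry of an MSR load/store area. */
struct msr_entry {
	uint32_t index;
	uint32_t reserved;
	uint64_t value;
};

/* Hypothetical stand-in for kvm_set_msr(); nonzero means failure. */
typedef int (*set_msr_fn)(uint32_t index, uint64_t value);

/* Checks shared by load and store: no 0x800-0x8ff index, reserved bits clear. */
static int msr_check_common(const struct msr_entry *e)
{
	if ((e->index >> 8) == 0x8 || e->reserved != 0)
		return -1;
	return 0;
}

/* Loading additionally rejects the FS and GS base MSRs. */
static int msr_check_load(const struct msr_entry *e)
{
	if (e->index == MSR_FS_BASE || e->index == MSR_GS_BASE)
		return -1;
	return msr_check_common(e);
}

/* Returns 0 on success, or the 1-based number of the first failing entry. */
static uint32_t load_msr_area(const struct msr_entry *area, uint32_t count,
			      set_msr_fn set_msr)
{
	uint32_t i;

	for (i = 0; i < count; i++) {
		if (msr_check_load(&area[i]) ||
		    set_msr(area[i].index, area[i].value))
			return i + 1;
	}
	return 0;
}

/* Fake setter for demonstration: accepts every MSR. */
static int accept_all(uint32_t index, uint64_t value)
{
	(void)index;
	(void)value;
	return 0;
}
```

Returning the 1-based entry number lets the caller pass it on unchanged as the qualification of the failed entry, while 0 stays available as the success value.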
[PATCH v3 2/3] KVM: nVMX: Improve nested msr switch checking
This patch improve checks required by Intel Software Developer Manual. - SMM MSRs are not allowed. - microcode MSRs are not allowed. - check x2apic MSRs only when LAPIC is in x2apic mode. - MSR switch areas must be aligned to 16 bytes. - address of first and last byte in MSR switch areas should not set any bits beyond the processor's physical-address width. Also it adds warning messages on failures during MSR switch. These messages are useful for people who debug their VMMs in nVMX. Signed-off-by: Eugene Korenevsky ekorenev...@gmail.com --- arch/x86/include/uapi/asm/msr-index.h | 3 + arch/x86/kvm/vmx.c| 121 ++ 2 files changed, 110 insertions(+), 14 deletions(-) diff --git a/arch/x86/include/uapi/asm/msr-index.h b/arch/x86/include/uapi/asm/msr-index.h index e21331c..3c9c601 100644 --- a/arch/x86/include/uapi/asm/msr-index.h +++ b/arch/x86/include/uapi/asm/msr-index.h @@ -316,6 +316,9 @@ #define MSR_IA32_UCODE_WRITE 0x0079 #define MSR_IA32_UCODE_REV 0x008b +#define MSR_IA32_SMM_MONITOR_CTL 0x009b +#define MSR_IA32_SMBASE0x009e + #define MSR_IA32_PERF_STATUS 0x0198 #define MSR_IA32_PERF_CTL 0x0199 #define MSR_AMD_PSTATE_DEF_BASE0xc0010064 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index b49d198..9061d93 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -8293,18 +8293,78 @@ static void vmx_start_preemption_timer(struct kvm_vcpu *vcpu) ns_to_ktime(preemption_timeout), HRTIMER_MODE_REL); } -static inline int nested_vmx_msr_check_common(struct vmx_msr_entry *e) +static int nested_vmx_check_msr_switch(struct kvm_vcpu *vcpu, + unsigned long count_field, + unsigned long addr_field, + int maxphyaddr) { - if (e-index 8 == 0x8 || e-reserved != 0) + u64 count, addr; + + if (vmcs12_read_any(vcpu, count_field, count) || + vmcs12_read_any(vcpu, addr_field, addr)) { + WARN_ON(1); return -EINVAL; + } + if (!IS_ALIGNED(addr, 16) || addr maxphyaddr || + (addr + count * sizeof(struct vmx_msr_entry) - 1) maxphyaddr) { + pr_warn_ratelimited( + nVMX: invalid MSR switch 
(0x%lx, %d, %llu, 0x%08llx), + addr_field, maxphyaddr, count, addr); + return -EINVAL; + } return 0; } -static inline int nested_vmx_load_msr_check(struct vmx_msr_entry *e) +static int nested_vmx_check_msr_switch_controls(struct kvm_vcpu *vcpu, + struct vmcs12 *vmcs12) +{ + int maxphyaddr; + + if (vmcs12-vm_exit_msr_load_count == 0 + vmcs12-vm_exit_msr_store_count == 0 + vmcs12-vm_entry_msr_load_count == 0) + return 0; /* Fast path */ + maxphyaddr = cpuid_maxphyaddr(vcpu); + if (nested_vmx_check_msr_switch(vcpu, VM_EXIT_MSR_LOAD_COUNT, + VM_EXIT_MSR_LOAD_ADDR, maxphyaddr) || + nested_vmx_check_msr_switch(vcpu, VM_EXIT_MSR_STORE_COUNT, + VM_EXIT_MSR_STORE_ADDR, maxphyaddr) || + nested_vmx_check_msr_switch(vcpu, VM_ENTRY_MSR_LOAD_COUNT, + VM_ENTRY_MSR_LOAD_ADDR, maxphyaddr)) + return -EINVAL; + return 0; +} + +static int nested_vmx_msr_check_common(struct kvm_vcpu *vcpu, + struct vmx_msr_entry *e) +{ + /* x2APIC MSR accesses are not allowed */ + if (apic_x2apic_mode(vcpu-arch.apic) e-index 8 == 0x8) + return -EINVAL; + if (e-index == MSR_IA32_UCODE_WRITE || /* SDM Table 35-2 */ + e-index == MSR_IA32_UCODE_REV) + return -EINVAL; + if (e-reserved != 0) + return -EINVAL; + return 0; +} + +static int nested_vmx_load_msr_check(struct kvm_vcpu *vcpu, +struct vmx_msr_entry *e) { if (e-index == MSR_FS_BASE || e-index == MSR_GS_BASE || - nested_vmx_msr_check_common(e)) + e-index == MSR_IA32_SMM_MONITOR_CTL || /* SMM is not supported */ + nested_vmx_msr_check_common(vcpu, e)) + return -EINVAL; + return 0; +} + +static int nested_vmx_store_msr_check(struct kvm_vcpu *vcpu, + struct vmx_msr_entry *e) +{ + if (e-index == MSR_IA32_SMBASE || /* SMM is not supported */ + nested_vmx_msr_check_common(vcpu, e)) return -EINVAL; return 0; } @@ -8321,13 +8381,27 @@ static u32 nested_vmx_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count) msr.host_initiated = false; for (i = 0; i count; i++) { -
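The alignment and address-width rules listed in the commit message can be expressed as a small predicate. This is a sketch, not the patch's code: `msr_switch_area_valid` is a hypothetical name, the early return for an empty area and the explicit wraparound test are assumptions of the sketch, and `maxphyaddr` is assumed to be between 1 and 63.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_ENTRY_SIZE 16u /* bytes per MSR switch entry */

/*
 * Check one MSR switch area: 16-byte aligned, and neither the first nor
 * the last byte may set address bits at or above maxphyaddr.
 */
static bool msr_switch_area_valid(uint64_t addr, uint64_t count,
				  int maxphyaddr)
{
	uint64_t last;

	if (count == 0)
		return true;	/* assumption: empty areas are not checked */
	if (addr & 0xf)
		return false;
	last = addr + count * MSR_ENTRY_SIZE - 1;
	if (last < addr)
		return false;	/* address computation wrapped around */
	if ((addr >> maxphyaddr) || (last >> maxphyaddr))
		return false;
	return true;
}
```

Checking the last byte as well as the first catches an area that starts inside the physical address space but runs past its end.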
[PATCH v3 3/3] KVM: nVMX: Enable nested msr load/restore feature
On nested entry:
 - check msr switch area.
 - load L2's MSRs. If failed, terminate nested entry and load L1's state.
   If failed on loading L1's MSRs again, do nested vmx abort.

On nested exit:
 - restore L2's MSRs. If failed, do nested vmx abort.
 - load L1's MSRs. If failed, do nested vmx abort.

Signed-off-by: Wincy Van <fanwenyi0...@gmail.com>
Signed-off-by: Eugene Korenevsky <ekorenev...@gmail.com>
---
 arch/x86/kvm/vmx.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9061d93..0d4efaa 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8743,6 +8743,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 	int cpu;
 	struct loaded_vmcs *vmcs02;
 	bool ia32e;
+	u32 msr_entry_idx;
 
 	if (!nested_vmx_check_permission(vcpu) ||
 	    !nested_vmx_check_vmcs12(vcpu))
@@ -8790,11 +8791,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 		return 1;
 	}
 
-	if (vmcs12->vm_entry_msr_load_count > 0 ||
-	    vmcs12->vm_exit_msr_load_count > 0 ||
-	    vmcs12->vm_exit_msr_store_count > 0) {
-		pr_warn_ratelimited("%s: VMCS MSR_{LOAD,STORE} unsupported\n",
-				    __func__);
+	if (nested_vmx_check_msr_switch_controls(vcpu, vmcs12)) {
 		nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
 		return 1;
 	}
@@ -8900,10 +8897,21 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 
 	vmx_segment_cache_clear(vmx);
 
-	vmcs12->launch_state = 1;
-
 	prepare_vmcs02(vcpu, vmcs12);
 
+	msr_entry_idx = nested_vmx_load_msr(vcpu,
+					    vmcs12->vm_entry_msr_load_addr,
+					    vmcs12->vm_entry_msr_load_count);
+	if (msr_entry_idx) {
+		leave_guest_mode(vcpu);
+		vmx_load_vmcs01(vcpu);
+		nested_vmx_entry_failure(vcpu, vmcs12,
+				EXIT_REASON_MSR_LOAD_FAIL, msr_entry_idx);
+		return 1;
+	}
+
+	vmcs12->launch_state = 1;
+
 	if (vmcs12->guest_activity_state == GUEST_ACTIVITY_HLT)
 		return kvm_emulate_halt(vcpu);
 
@@ -9333,6 +9341,10 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
 
 	kvm_set_dr(vcpu, 7, 0x400);
 	vmcs_write64(GUEST_IA32_DEBUGCTL, 0);
+
+	if (nested_vmx_load_msr(vcpu, vmcs12->vm_exit_msr_load_addr,
+				vmcs12->vm_exit_msr_load_count))
+		nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_MSR_FAIL);
 }
 
 /*
@@ -9354,6 +9366,10 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 	prepare_vmcs12(vcpu, vmcs12, exit_reason, exit_intr_info,
 		       exit_qualification);
 
+	if (nested_vmx_store_msr(vcpu, vmcs12->vm_exit_msr_store_addr,
+				 vmcs12->vm_exit_msr_store_count))
+		nested_vmx_abort(vcpu, VMX_ABORT_SAVE_GUEST_MSR_FAIL);
+
 	vmx_load_vmcs01(vcpu);
 
 	if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
-- 
2.0.4
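The entry-side flow from the changelog (load L2's MSRs; on failure fall back to L1's MSRs; if that also fails, abort) can be modelled outside the kernel. What follows is a minimal sketch in plain C, not the patch's implementation. `struct model_vcpu`, `model_set_msr()`, `model_load_msrs()` and `model_nested_entry()` are hypothetical names made up for illustration. The validity check covers only two conditions, non-zero reserved bits and the x2APIC index range `(index >> 8) == 0x8` mentioned earlier in the thread, and it assumes those are the relevant SDM rules. The loader is assumed to return 0 on success and otherwise the 1-based position of the failing entry, which is how `msr_entry_idx` appears to be used in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 16-byte slot of a VMX MSR-load/store area: index, reserved, value. */
struct msr_entry {
	uint32_t index;
	uint32_t reserved;
	uint64_t value;
};

/* Hypothetical stand-in for a vCPU's MSR file; this is not a KVM type. */
#define MODEL_NR_MSRS 4
struct model_vcpu {
	uint32_t msr_index[MODEL_NR_MSRS];
	uint64_t msr_value[MODEL_NR_MSRS];
};

/* Write one MSR; an unknown index models a WRMSR that would fault. */
static int model_set_msr(struct model_vcpu *v, uint32_t index, uint64_t value)
{
	for (size_t i = 0; i < MODEL_NR_MSRS; i++) {
		if (v->msr_index[i] == index) {
			v->msr_value[i] = value;
			return 0;
		}
	}
	return -1;
}

/* Assumed subset of the checks: reserved bits set, or x2APIC MSR range. */
static int model_entry_invalid(const struct msr_entry *e)
{
	return e->reserved != 0 || (e->index >> 8) == 0x8;
}

/* Returns 0 on success, else the 1-based position of the failing entry. */
static uint32_t model_load_msrs(struct model_vcpu *v,
				const struct msr_entry *area, uint32_t count)
{
	for (uint32_t i = 0; i < count; i++) {
		if (model_entry_invalid(&area[i]) ||
		    model_set_msr(v, area[i].index, area[i].value))
			return i + 1;
	}
	return 0;
}

enum entry_result { ENTRY_OK, ENTRY_FAILED, ENTRY_ABORT };

/*
 * Nested entry: try L2's entry-load list. If it fails, report the failing
 * position through *failed_idx and reload L1's MSRs from the exit-load
 * list; a failure there is modelled as a nested VMX abort.
 */
enum entry_result model_nested_entry(struct model_vcpu *v,
		const struct msr_entry *entry_load, uint32_t entry_count,
		const struct msr_entry *exit_load, uint32_t exit_count,
		uint32_t *failed_idx)
{
	assert(failed_idx != NULL);

	*failed_idx = model_load_msrs(v, entry_load, entry_count);
	if (*failed_idx == 0)
		return ENTRY_OK;
	if (model_load_msrs(v, exit_load, exit_count))
		return ENTRY_ABORT;
	return ENTRY_FAILED;
}
```

In this model a partially applied entry-load list is not rolled back entry by entry. The fallback simply reloads L1's list, which mirrors the "load L1's state" step in the changelog.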
RE: [v2 17/25] KVM: kvm-vfio: User API for VT-d Posted-Interrupts
> -----Original Message-----
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Monday, December 08, 2014 1:21 PM
> To: Wu, Feng
> Cc: Eric Auger; t...@linutronix.de; mi...@redhat.com; h...@zytor.com; x...@kernel.org; g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org; j...@8bytes.org; jiang@linux.intel.com; linux-ker...@vger.kernel.org; io...@lists.linux-foundation.org; kvm@vger.kernel.org
> Subject: Re: [v2 17/25] KVM: kvm-vfio: User API for VT-d Posted-Interrupts
>
> On Mon, 2014-12-08 at 04:58 +, Wu, Feng wrote:
> > > -----Original Message-----
> > > From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf Of Eric Auger
> > > Sent: Thursday, December 04, 2014 10:05 PM
> > > To: Wu, Feng; t...@linutronix.de; mi...@redhat.com; h...@zytor.com; x...@kernel.org; g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org; j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com
> > > Cc: linux-ker...@vger.kernel.org; io...@lists.linux-foundation.org; kvm@vger.kernel.org
> > > Subject: Re: [v2 17/25] KVM: kvm-vfio: User API for VT-d Posted-Interrupts
> > >
> > > Hi Feng,
> > >
> > > On 12/03/2014 08:39 AM, Feng Wu wrote:
> > > > This patch adds and documents a new attribute
> > > > KVM_DEV_VFIO_DEVICE_POSTING_IRQ in KVM_DEV_VFIO_DEVICE group.
> > > > This new attribute is used for VT-d Posted-Interrupts.
> > > >
> > > > When guest OS changes the interrupt configuration for an
> > > > assigned device, such as, MSI/MSIx data/address fields,
> > > > QEMU will use this IRQ attribute to tell KVM to update the
> > > > related IRTE according the VT-d Posted-Interrrupts Specification,
> > > > such as, the guest vector should be updated in the related IRTE.
> > > >
> > > > Signed-off-by: Feng Wu <feng...@intel.com>
> > > > ---
> > > >  Documentation/virtual/kvm/devices/vfio.txt |    9 +++++++++
> > > >  include/uapi/linux/kvm.h                   |   10 ++++++++++
> > > >  2 files changed, 19 insertions(+), 0 deletions(-)
> > > >
> > > > diff --git a/Documentation/virtual/kvm/devices/vfio.txt b/Documentation/virtual/kvm/devices/vfio.txt
> > > > index f7aff29..41e12b7 100644
> > > > --- a/Documentation/virtual/kvm/devices/vfio.txt
> > > > +++ b/Documentation/virtual/kvm/devices/vfio.txt
> > > > @@ -42,3 +42,12 @@ activated before VFIO_DEVICE_SET_IRQS has been called to trigger the IRQ
> > > >  or associate an eventfd to it. Unforwarding can only be called while the
> > > >  signaling has been disabled with VFIO_DEVICE_SET_IRQS. If this condition is
> > > >  not satisfied, the command returns an -EBUSY.
> > > > +
> > > > +  KVM_DEV_VFIO_DEVICE_POSTING_IRQ: Use posted interrtups mechanism to post
> > > typo
> > > > +   the IRQ to guests.
> > > > +For this attribute, kvm_device_attr.addr points to a kvm_vfio_dev_irq struct.
> > > > +
> > > > +When guest OS changes the interrupt configuration for an assigned device,
> > > > +such as, MSI/MSIx data/address fields, QEMU will use this IRQ attribute
> > > > +to tell KVM to update the related IRTE according the VT-d Posted-Interrrupts
> > > > +Specification, such as, the guest vector should be updated in the related IRTE.
> > > Out of curiosity, are there any restrictions on the instant at which the
> > > change can be done? I do not get here how you deactivate the posting?
> >
> > The current method is that if the hardware supports interrupt posting, we
> > will use it instead of interrupt remapping, since it has good performance.
> > Why do I need to deactivate interrupt posting? Here is the reply to Alex
> > for the same question:
> >
> > In fact, I don't think we need to stop the posted-interrupts. For setting
> > posted interrupts, we update the related IRTE according to the new format.
> > If the guest reboots, or unloads the drivers, or does some other
> > operation, the msi/msix will be disabled first; in this path, the irq will
> > be disabled and the related IRTE is not used anymore.
>
> Right, and I'm still not sure I agree with that reasoning. We need to
> build the kernel interface to be generic, not tailored for a specific
> userspace. I don't really feel comfortable having something that can't
> be disabled via a similar path to it being enabled. For instance, what
> about a dynamic debug interface where we want to enable tracing and see
> each interrupt injected into the guest. At that point we'd want to
> disable posted interrupts and direct KVM injection and route via QEMU.
> Thanks,
>
> Alex

I don't quite understand why we need to debug the software delivery path for
interrupts when PI is used; in this case, the software injection code will
have no chance to execute. If we don't want to use PI, we can disable it from
the kernel command line.

Thanks,
Feng

> > > > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > > > index a269a42..7d98650 100644
> > > > --- a/include/uapi/linux/kvm.h
> > > > +++ b/include/uapi/linux/kvm.h
> > > > @@ -949,6 +949,7 @@ struct kvm_device_attr {
> > > >  #define KVM_DEV_VFIO_DEVICE 2
> > > >  #define KVM_DEV_VFIO_DEVICE_FORWARD_IRQ 1
> > > >  #define