Got bash 'Resource temporarily unavailable' when testing linux_s3 (kvm)
Anybody can help on linux_s3 (kvm autotest)?

When running the testcase linux_s3 to test kvm, we constantly got 'bash: echo: write error: Resource temporarily unavailable'. This testcase can be found at autotest/client/tests/kvm/tests/linux_s3.py (or autotest/client/tests/kvm/kvm_tests.py in the old version).

The testing command is:
'chvt %s && echo mem > /sys/power/state && chvt %s' % (dst_tty, src_tty)
e.g. 'chvt 1 && echo mem > /sys/power/state && chvt 7'

Is there any chance that the 'echo' command is executed before 'chvt 1' took full effect? (Just my wild guess.)

I'll appreciate your help.

Regards,
-- 
Amos Kong
Quality Engineer
Raycom Office(Beijing), Red Hat Inc.
Phone: +86-10-62608183
--
To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
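As an aside, a minimal sketch (hypothetical helper name; the real logic lives in autotest's linux_s3.py) of how the suspend command string is assembled, showing why the '&&' chaining matters: 'echo mem > /sys/power/state' only runs after the 'chvt' to the destination VT has succeeded.

```python
def build_s3_command(dst_tty, src_tty):
    # '&&' chaining ensures the suspend write happens only after the
    # VT switch succeeded, and the switch back only after resume.
    return "chvt %s && echo mem > /sys/power/state && chvt %s" % (dst_tty, src_tty)

print(build_s3_command(1, 7))
```

If the archived command really lacks the '&&' separators, the three words would be passed as arguments to a single chvt invocation, which would fail outright rather than race.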
avoid soft lockups
Hello,

If I suspend and resume a guest under kvm, or I migrate it from one host to another, I sometimes get a soft lockup. Is there a timer mode to prevent or reduce the likelihood of these?

Thanks

JB
Re: avoid soft lockups
On Fri, Oct 02, 2009 at 01:54:22PM +0200, James Brackinshaw wrote:

Hello, If I suspend and resume a guest under kvm, or I migrate it from one host to another, I sometimes get a soft lockup. Is there a timer mode to prevent or reduce the likelihood of these?

Not yet. For now you can either disable softlockup in the guest, or live with the spurious warnings.
Re: Got bash 'Resource temporarily unavailable' when testing linux_s3 (kvm)
On Fri, Oct 02, 2009 at 05:44:35PM +0800, Amos Kong wrote:

Anybody can help on linux_s3 (kvm autotest)? When running the testcase linux_s3 to test kvm, we constantly got 'bash: echo: write error: Resource temporarily unavailable'. This testcase can be found at autotest/client/tests/kvm/tests/linux_s3.py (or autotest/client/tests/kvm/kvm_tests.py in the old version). The testing command is: 'chvt %s && echo mem > /sys/power/state && chvt %s' % (dst_tty, src_tty) e.g. 'chvt 1 && echo mem > /sys/power/state && chvt 7' Is there any chance that the 'echo' command is executed before 'chvt 1' took full effect? (Just my wild guess.) I'll appreciate your help.

There was a bug in virtio-balloon, fixed in 2.6.30, that prevents proper suspend-to-RAM. Can you share dmesg output after the failure?
Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting
On Wed, Sep 30, 2009 at 01:22:49PM -0300, Marcelo Tosatti wrote:

On Wed, Sep 30, 2009 at 09:01:51AM +0800, Zhai, Edwin wrote: Avi, I modified it according to your comments. The only thing I want to keep is the module params ple_gap/ple_window. Although they are not per-guest, they can be used to find the right value, and to disable PLE for debug purposes. Thanks,

Avi Kivity wrote: On 09/28/2009 11:33 AM, Zhai, Edwin wrote: Avi Kivity wrote:

+#define KVM_VMX_DEFAULT_PLE_GAP    41
+#define KVM_VMX_DEFAULT_PLE_WINDOW 4096
+static int __read_mostly ple_gap = KVM_VMX_DEFAULT_PLE_GAP;
+module_param(ple_gap, int, S_IRUGO);
+
+static int __read_mostly ple_window = KVM_VMX_DEFAULT_PLE_WINDOW;
+module_param(ple_window, int, S_IRUGO);

Shouldn't be __read_mostly since they're read very rarely (__read_mostly should be for variables that are very often read, and rarely written).

In general they are only read, except that an experienced user may try different parameters for perf tuning.

__read_mostly doesn't just mean it's read mostly. It also means it's read often. Otherwise it's just wasting space in hot cachelines.

I'm not even sure they should be parameters.

For different spinlocks in different OSes, and for different workloads, we need different parameters for tuning. It's similar to enable_ept.

No, global parameters don't work for tuning workloads and guests since they cannot be modified on a per-guest basis. enable_ept is only useful for debugging and testing.

+	set_current_state(TASK_INTERRUPTIBLE);
+	schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
+

Please add a tracepoint for this (since it can cause significant change in behaviour),

Isn't trace_kvm_exit(exit_reason, ...) enough? We can tell the PLE vmexit from other vmexits.

Right. I thought of the software spinlock detector, but that's another problem.

I think you can drop the sleep_time parameter, it can be part of the function. Also kvm_vcpu_sleep() is confusing, we also sleep on halt. Please call it kvm_vcpu_on_spin() or something (since that's what the guest is doing). kvm_vcpu_on_spin() should add the vcpu to vcpu->wq (so a new pending interrupt wakes it up immediately).

Updated version (also please send it separately from the vmx.c patch):

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 894a56e..43125dc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -231,6 +231,7 @@ int kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn);
 void mark_page_dirty(struct kvm *kvm, gfn_t gfn);
 
 void kvm_vcpu_block(struct kvm_vcpu *vcpu);
+void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu);
 void kvm_resched(struct kvm_vcpu *vcpu);
 void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
 void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4d0dd39..e788d70 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1479,6 +1479,21 @@ void kvm_resched(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_resched);
 
+void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu)
+{
+	ktime_t expires;
+	DEFINE_WAIT(wait);
+
+	prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
+
+	/* Sleep for 100 us, and hope lock-holder got scheduled */
+	expires = ktime_add_ns(ktime_get(), 100000UL);
+	schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
+
+	finish_wait(&vcpu->wq, &wait);
+}
+EXPORT_SYMBOL_GPL(kvm_vcpu_on_spin);
+
 static int kvm_vcpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
 	struct kvm_vcpu *vcpu = vma->vm_file->private_data;
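To make the ple_gap/ple_window semantics discussed above concrete, here is a toy model (illustration only, not the exact VMX hardware behaviour): successive PAUSE executions closer together than ple_gap cycles are treated as one spin loop, and a PLE VM-exit fires once such a loop has lasted longer than ple_window cycles.

```python
# Toy model of VMX Pause-Loop Exiting. pause_times are TSC-like cycle
# counts at which the guest executed PAUSE.
def ple_exit(pause_times, ple_gap=41, ple_window=4096):
    loop_start = None
    prev = None
    for t in pause_times:
        if prev is None or t - prev > ple_gap:
            loop_start = t          # gap too large: treat as a new loop
        if t - loop_start > ple_window:
            return True             # spun longer than the window: exit
        prev = t
    return False

assert not ple_exit([0, 10000, 20000])   # isolated PAUSEs: no exit
assert ple_exit(range(0, 8000, 20))      # tight spin loop: exit
```

This is why the review argues for per-guest tuning: the right gap/window depends on how long a given guest's spinlocks are typically held.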
Re: INFO: task journal:337 blocked for more than 120 seconds
On 09/30/09 14:11, Shirley Ma wrote:

Anybody found this problem before? I kept hitting this issue for the 2.6.31 guest kernel even with a simple network test.

INFO: task kjournald:337 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald D 0041 0 337 2 0x

My test is totally being blocked.

I'm assuming from the lists you've posted to that this is under KVM? What disk drivers are you using (virtio or emulated)? Can you get a full stack backtrace of kjournald?

Kevin Bowling submitted a RH bug against Xen with apparently the same symptoms (https://bugzilla.redhat.com/show_bug.cgi?id=526627). I'm wondering if there's a core kernel bug here, which is perhaps more easily triggered by the changed timing in a virtual machine.

Thanks,
J
Re: INFO: task journal:337 blocked for more than 120 seconds
On 10/02/09 12:06, Shirley Ma wrote:

On Fri, 2009-10-02 at 11:30 -0700, Jeremy Fitzhardinge wrote: I'm assuming from the lists you've posted to that this is under KVM? What disk drivers are you using (virtio or emulated)? Can you get a full stack backtrace of kjournald?

Yes, it's under KVM; the disk driver is virtio. Since the IO has issues, the stack trace can't be saved to disk. I have the image file attached here.

Ah, thank you. The backtrace does indeed look very similar. (BTW, you could get a serial console with qemu-kvm -nographic -append console=ttyS0 ...)

J
[PATCH][retry 2] Support Pause Filter in AMD processors
From 66741f741da741e58e8162ef7809dd7d6f8e01cf Mon Sep 17 00:00:00 2001
From: Mark Langsdorf mark.langsd...@amd.com
Date: Fri, 2 Oct 2009 10:32:33 -0500
Subject: [PATCH] Support Pause Filter in AMD processors

New AMD processors (Family 0x10 models 8+) support the Pause Filter Feature. This feature creates a new field in the VMCB called Pause Filter Count. If Pause Filter Count is greater than 0 and intercepting PAUSEs is enabled, the processor will increment an internal counter when a PAUSE instruction occurs instead of intercepting. When the internal counter reaches the Pause Filter Count value, a PAUSE intercept will occur.

This feature can be used to detect contended spinlocks, especially when the lock-holding VCPU is not scheduled. Rescheduling another VCPU prevents the VCPU seeking the lock from wasting its quantum by spinning idly.

Experimental results show that most spinlocks are held for less than 1000 PAUSE cycles or more than a few thousand. Default the Pause Filter Counter to 3000 to detect the contended spinlocks. Processor support for this feature is indicated by a CPUID bit.

On a 24-core system running 4 guests each with 16 VCPUs, this patch improved overall performance of each guest's 32-job kernbench by approximately 3-5% when combined with a scheduler algorithm that caused the VCPU to sleep for a brief period. Further performance improvement may be possible with a more sophisticated yield algorithm.
This patch depends on the changes to the kvm code from KVM:VMX: Add support for Pause Loop Exiting:
http://www.mail-archive.com/kvm@vger.kernel.org/msg23089.html

-Mark Langsdorf
Operating System Research Center
AMD

---
 arch/x86/include/asm/svm.h |    3 ++-
 arch/x86/kvm/svm.c         |   16 ++++++++++++++++
 2 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 85574b7..1fecb7e 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -57,7 +57,8 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
 	u16 intercept_dr_write;
 	u32 intercept_exceptions;
 	u64 intercept;
-	u8 reserved_1[44];
+	u8 reserved_1[42];
+	u16 pause_filter_count;
 	u64 iopm_base_pa;
 	u64 msrpm_base_pa;
 	u64 tsc_offset;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 9a4daca..d5d2e03 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -46,6 +46,7 @@ MODULE_LICENSE("GPL");
 #define SVM_FEATURE_NPT  (1 << 0)
 #define SVM_FEATURE_LBRV (1 << 1)
 #define SVM_FEATURE_SVML (1 << 2)
+#define SVM_FEATURE_PAUSE_FILTER (1 << 10)
 
 #define NESTED_EXIT_HOST	0	/* Exit handled on host level */
 #define NESTED_EXIT_DONE	1	/* Exit caused nested vmexit  */
@@ -659,6 +660,11 @@ static void init_vmcb(struct vcpu_svm *svm)
 	svm->nested.vmcb = 0;
 	svm->vcpu.arch.hflags = 0;
 
+	if (svm_has(SVM_FEATURE_PAUSE_FILTER)) {
+		control->pause_filter_count = 3000;
+		control->intercept |= (1ULL << INTERCEPT_PAUSE);
+	}
+
 	enable_gif(svm);
 }
 
@@ -2270,6 +2276,15 @@ static int interrupt_window_interception(struct vcpu_svm *svm)
 	return 1;
 }
 
+static int pause_interception(struct vcpu_svm *svm)
+{
+	static int pause_count = 0;
+
+	kvm_vcpu_on_spin(&(svm->vcpu));
+printk(KERN_ERR "MJLL pause intercepted %d\n", ++pause_count);
+	return 1;
+}
+
 static int (*svm_exit_handlers[])(struct vcpu_svm *svm) = {
 	[SVM_EXIT_READ_CR0]	= emulate_on_interception,
 	[SVM_EXIT_READ_CR3]	= emulate_on_interception,
@@ -2305,6 +2320,7 @@ static int (*svm_exit_handlers[])(struct vcpu_svm *svm) = {
 	[SVM_EXIT_CPUID]	= cpuid_interception,
 	[SVM_EXIT_IRET]		= iret_interception,
 	[SVM_EXIT_INVD]		= emulate_on_interception,
+	[SVM_EXIT_PAUSE]	= pause_interception,
 	[SVM_EXIT_HLT]		= halt_interception,
 	[SVM_EXIT_INVLPG]	= invlpg_interception,
 	[SVM_EXIT_INVLPGA]	= invlpga_interception,
-- 
1.6.0.2
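The Pause Filter Count mechanism described in the changelog can be modelled roughly like this (a sketch only; the actual hardware details, such as whether the counter decrements or reloads, are specified in AMD's architecture manual): with PAUSE intercepts enabled and a non-zero filter count, the CPU counts PAUSEs internally and only raises a #VMEXIT once the count is reached, so short spins stay cheap.

```python
# Toy model of the AMD Pause Filter: count intercepts generated by a run
# of PAUSE instructions, given a filter threshold (default 3000, as in
# the patch above).
def pause_intercepts(num_pauses, pause_filter_count=3000):
    intercepts = 0
    counter = 0
    for _ in range(num_pauses):
        counter += 1
        if counter == pause_filter_count:
            intercepts += 1
            counter = 0          # assume the counter restarts after intercept
    return intercepts

assert pause_intercepts(2999) == 0   # below threshold: no VM-exit
assert pause_intercepts(3000) == 1
assert pause_intercepts(9000) == 3
```

This matches the stated motivation: spinlocks held for under ~1000 PAUSE cycles never cause an exit, while long contended spins do.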
docs on storage pools?
So far I've been using files and/or LVM partitions for my VMs -- basically by using virt-manager, modifying existing XML configs, and just copying my VM files to be reused. I'm wondering how KVM storage pools work -- at first I thought it was something like KVM's version of LVM, where you can just dump all your VMs in one space... but it looks like it really means the different places you want to store your VMs:

- dir: Filesystem Directory
- disk: Physical Disk Device
- fs: Pre-Formatted Block Device
- iscsi: iSCSI Target
- logical: LVM Volume Group
- netfs: Network exported directory

I understand things like LVM and storing VMs in a filesystem directory... but what real difference is there by going through the GUI? I suppose nothing. Maybe I'm overthinking this -- it's just a frontend to where you store your VMs?
[PATCH v2 0/4] KVM: xinterface
(Applies to kvm.git/master:083e9e10)

For details, please read the patch headers.

[ Changelog:

  v2:
     *) We now re-use the vmfd as the binding token, instead of creating a new separate namespace
     *) Added support for switch_to(mm), which is much faster
     *) Added support for memslot-cache for exploiting slot locality
     *) Added support for scatter-gather access
     *) Added support for xioevent interface

  v1:
     *) Initial release
]

This series is included in upstream AlacrityVM and is well tested and known to work properly.

Comments?

Kind Regards,
-Greg

---

Gregory Haskins (4):
      KVM: add scatterlist support to xinterface
      KVM: add io services to xinterface
      KVM: introduce xinterface API for external interaction with guests
      mm: export use_mm() and unuse_mm() to modules

 arch/x86/kvm/Makefile          |    2
 include/linux/kvm_host.h       |    3
 include/linux/kvm_xinterface.h |  165 +++
 kernel/fork.c                  |    1
 mm/mmu_context.c               |    3
 virt/kvm/kvm_main.c            |   24 ++
 virt/kvm/xinterface.c          |  587
 7 files changed, 784 insertions(+), 1 deletions(-)
 create mode 100644 include/linux/kvm_xinterface.h
 create mode 100644 virt/kvm/xinterface.c
[PATCH v2 1/4] mm: export use_mm() and unuse_mm() to modules
We want to use these functions from within KVM, which may be built as a module.

Signed-off-by: Gregory Haskins ghask...@novell.com
---
 mm/mmu_context.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/mm/mmu_context.c b/mm/mmu_context.c
index ded9081..f31ba20 100644
--- a/mm/mmu_context.c
+++ b/mm/mmu_context.c
@@ -6,6 +6,7 @@
 #include <linux/mm.h>
 #include <linux/mmu_context.h>
 #include <linux/sched.h>
+#include <linux/module.h>
 
 #include <asm/mmu_context.h>
 
@@ -37,6 +38,7 @@ void use_mm(struct mm_struct *mm)
 	if (active_mm != mm)
 		mmdrop(active_mm);
 }
+EXPORT_SYMBOL_GPL(use_mm);
 
 /*
  * unuse_mm
@@ -56,3 +58,4 @@ void unuse_mm(struct mm_struct *mm)
 	enter_lazy_tlb(mm, tsk);
 	task_unlock(tsk);
 }
+EXPORT_SYMBOL_GPL(unuse_mm);
[PATCH v2 2/4] KVM: introduce xinterface API for external interaction with guests
What: xinterface is a mechanism that allows kernel modules external to the kvm.ko proper to interface with a running guest. It accomplishes this by creating an abstracted interface which does not expose any private details of the guest or its related KVM structures, and provides a mechanism to find and bind to this interface at run-time.

Why: There are various subsystems that would like to interact with a KVM guest which are ideally suited to exist outside the domain of the kvm.ko core logic. For instance, external pci-passthrough, virtual-bus, and virtio-net modules are currently under development. In order for these modules to successfully interact with the guest, they need, at the very least, various interfaces for signaling IO events, pointer translation, and possibly memory mapping.

The signaling case is covered by the recent introduction of the irqfd/ioeventfd mechanisms. This patch provides a mechanism to cover the other cases. Note that today we only expose pointer-translation related functions, but more could be added at a future date as needs arise.

Example usage: QEMU instantiates a guest, and an external module foo that desires the ability to interface with the guest (say via open(/dev/foo)). QEMU may then pass the kvmfd to foo via an ioctl, such as: ioctl(foofd, FOO_SET_VMID, kvmfd). Upon receipt, the foo module can issue kvm_xinterface_bind(kvmfd) to acquire the proper context. Internally, the struct kvm* and associated struct module* will remain pinned at least until the foo module calls kvm_xinterface_put().
Signed-off-by: Gregory Haskins ghask...@novell.com
---
 arch/x86/kvm/Makefile          |    2
 include/linux/kvm_host.h       |    3
 include/linux/kvm_xinterface.h |  114 +++
 kernel/fork.c                  |    1
 virt/kvm/kvm_main.c            |   24 ++
 virt/kvm/xinterface.c          |  409
 6 files changed, 552 insertions(+), 1 deletions(-)
 create mode 100644 include/linux/kvm_xinterface.h
 create mode 100644 virt/kvm/xinterface.c

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 31a7035..0449d6e 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -7,7 +7,7 @@ CFLAGS_vmx.o := -I.
 
 kvm-y			+= $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
 				coalesced_mmio.o irq_comm.o eventfd.o \
-				assigned-dev.o)
+				assigned-dev.o xinterface.o)
 
 kvm-$(CONFIG_IOMMU_API)	+= $(addprefix ../../../virt/kvm/, iommu.o)
 
 kvm-y			+= x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index b985a29..7cc1afb 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -362,6 +362,9 @@ void kvm_arch_sync_events(struct kvm *kvm);
 int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu);
 void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
 
+struct kvm_xinterface *
+kvm_xinterface_alloc(struct kvm *kvm, struct module *owner);
+
 int kvm_is_mmio_pfn(pfn_t pfn);
 
 struct kvm_irq_ack_notifier {
diff --git a/include/linux/kvm_xinterface.h b/include/linux/kvm_xinterface.h
new file mode 100644
index 000..01f092b
--- /dev/null
+++ b/include/linux/kvm_xinterface.h
@@ -0,0 +1,114 @@
+#ifndef __KVM_XINTERFACE_H
+#define __KVM_XINTERFACE_H
+
+/*
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#include <linux/kref.h>
+#include <linux/module.h>
+#include <linux/file.h>
+
+struct kvm_xinterface;
+struct kvm_xvmap;
+
+struct kvm_xinterface_ops {
+	unsigned long (*copy_to)(struct kvm_xinterface *intf,
+				 unsigned long gpa, const void *src,
+				 unsigned long len);
+	unsigned long (*copy_from)(struct kvm_xinterface *intf, void *dst,
+				   unsigned long gpa, unsigned long len);
+	struct kvm_xvmap* (*vmap)(struct kvm_xinterface *intf,
+				  unsigned long gpa,
+				  unsigned long len);
+	void (*release)(struct kvm_xinterface *);
+};
+
+struct kvm_xinterface {
+	struct module                   *owner;
+	struct kref                      kref;
+	const struct kvm_xinterface_ops *ops;
+};
+
+static inline void
+kvm_xinterface_get(struct kvm_xinterface *intf)
+{
+	kref_get(&intf->kref);
+}
+
+static inline void
+_kvm_xinterface_release(struct kref *kref)
+{
+	struct kvm_xinterface *intf;
+	struct module *owner;
+
+	intf = container_of(kref, struct kvm_xinterface, kref);
+
+	owner = intf->owner;
+	rmb();
+
+	intf->ops->release(intf);
+	module_put(owner);
+}
+
+static inline void
+kvm_xinterface_put(struct kvm_xinterface *intf)
+{
+	kref_put(&intf->kref, _kvm_xinterface_release);
+}
+
+struct kvm_xvmap_ops {
+	void (*release)(struct
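The lifetime rule the patch describes (kvm.ko stays pinned while any external client holds a reference, and ops->release() runs exactly once when the last put drops the count to zero) is the standard kref pattern. A minimal Python stand-in, with illustrative names, just to make the invariant explicit:

```python
class XInterface:
    """Sketch of the kref-style lifetime used by kvm_xinterface."""

    def __init__(self, release):
        self.refs = 1            # creator holds the initial reference
        self.released = False
        self._release = release

    def get(self):
        assert self.refs > 0     # must not resurrect a dead object
        self.refs += 1

    def put(self):
        self.refs -= 1
        if self.refs == 0:
            self.released = True
            self._release(self)  # ops->release() fires exactly once

intf = XInterface(release=lambda i: None)
intf.get()                       # a second client binds
intf.put()
assert not intf.released         # one reference still outstanding
intf.put()
assert intf.released
```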
[PATCH v2 3/4] KVM: add io services to xinterface
We want to add a more efficient way to get PIO signals out of the guest, so we add an xioevent interface. This allows a client to register for notifications when a specific MMIO/PIO address is touched by the guest. This is an alternative interface to ioeventfd, which is performance limited by io-bus scaling and eventfd wait-queue based notification mechanism. This also has the advantage of retaining the full PIO data payload and passing it to the recipient. Signed-off-by: Gregory Haskins ghask...@novell.com --- include/linux/kvm_xinterface.h | 47 ++ virt/kvm/xinterface.c | 106 2 files changed, 153 insertions(+), 0 deletions(-) diff --git a/include/linux/kvm_xinterface.h b/include/linux/kvm_xinterface.h index 01f092b..684b6f8 100644 --- a/include/linux/kvm_xinterface.h +++ b/include/linux/kvm_xinterface.h @@ -12,6 +12,16 @@ struct kvm_xinterface; struct kvm_xvmap; +struct kvm_xioevent; + +enum { + kvm_xioevent_flag_nr_pio, + kvm_xioevent_flag_nr_max, +}; + +#define KVM_XIOEVENT_FLAG_PIO (1 kvm_xioevent_flag_nr_pio) + +#define KVM_XIOEVENT_VALID_FLAG_MASK ((1 kvm_xioevent_flag_nr_max) - 1) struct kvm_xinterface_ops { unsigned long (*copy_to)(struct kvm_xinterface *intf, @@ -22,6 +32,10 @@ struct kvm_xinterface_ops { struct kvm_xvmap* (*vmap)(struct kvm_xinterface *intf, unsigned long gpa, unsigned long len); + struct kvm_xioevent* (*ioevent)(struct kvm_xinterface *intf, + u64 addr, + unsigned long len, + unsigned long flags); void (*release)(struct kvm_xinterface *); }; @@ -109,6 +123,39 @@ kvm_xvmap_put(struct kvm_xvmap *vmap) kref_put(vmap-kref, _kvm_xvmap_release); } +struct kvm_xioevent_ops { + void (*deassign)(struct kvm_xioevent *ioevent); +}; + +struct kvm_xioevent { + const struct kvm_xioevent_ops *ops; + struct kvm_xinterface *intf; + void (*signal)(struct kvm_xioevent *ioevent, const void *val); + void *priv; +}; + +static inline void +kvm_xioevent_init(struct kvm_xioevent *ioevent, + const struct kvm_xioevent_ops *ops, + struct kvm_xinterface *intf) +{ + 
memset(ioevent, 0, sizeof(vmap)); + ioevent-ops = ops; + ioevent-intf = intf; + + kvm_xinterface_get(intf); +} + +static inline void +kvm_xioevent_deassign(struct kvm_xioevent *ioevent) +{ + struct kvm_xinterface *intf = ioevent-intf; + rmb(); + + ioevent-ops-deassign(ioevent); + kvm_xinterface_put(intf); +} + struct kvm_xinterface *kvm_xinterface_bind(int fd); #endif /* __KVM_XINTERFACE_H */ diff --git a/virt/kvm/xinterface.c b/virt/kvm/xinterface.c index 3b586c5..c356835 100644 --- a/virt/kvm/xinterface.c +++ b/virt/kvm/xinterface.c @@ -28,6 +28,8 @@ #include linux/kvm_host.h #include linux/kvm_xinterface.h +#include iodev.h + struct _xinterface { struct kvm *kvm; struct task_struct *task; @@ -42,6 +44,14 @@ struct _xvmap { struct kvm_xvmap vmap; }; +struct _ioevent { + u64 addr; + int length; + struct kvm_io_bus*bus; + struct kvm_io_device dev; + struct kvm_xioevent ioevent; +}; + static struct _xinterface * to_intf(struct kvm_xinterface *intf) { @@ -362,6 +372,101 @@ fail: return ERR_PTR(ret); } +/* MMIO/PIO writes trigger an event if the addr/val match */ +static int +ioevent_write(struct kvm_io_device *dev, gpa_t addr, int len, const void *val) +{ + struct _ioevent *p = container_of(dev, struct _ioevent, dev); + struct kvm_xioevent *ioevent = p-ioevent; + + if (!(addr == p-addr len == p-length)) + return -EOPNOTSUPP; + + if (!ioevent-signal) + return 0; + + ioevent-signal(ioevent, val); + return 0; +} + +static const struct kvm_io_device_ops ioevent_device_ops = { + .write = ioevent_write, +}; + +static void +ioevent_deassign(struct kvm_xioevent *ioevent) +{ + struct _ioevent*p = container_of(ioevent, struct _ioevent, ioevent); + struct _xinterface *_intf = to_intf(ioevent-intf); + struct kvm *kvm = _intf-kvm; + + kvm_io_bus_unregister_dev(kvm, p-bus, p-dev); + kfree(p); +} + +static const struct kvm_xioevent_ops ioevent_intf_ops = { + .deassign = ioevent_deassign, +}; + +static struct kvm_xioevent* +xinterface_ioevent(struct kvm_xinterface *intf, + u64 addr, 
+ unsigned long len, + unsigned long flags) +{ + struct _xinterface *_intf = to_intf(intf); + struct kvm *kvm = _intf-kvm; + int pio = flags
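The dispatch rule in ioevent_write above — a registered xioevent fires its signal callback only when both the guest-written address and the access length match exactly, otherwise the io-bus keeps looking — can be sketched like this (an illustrative model, not kernel code):

```python
def deliver(ioevents, addr, length, val):
    """Return True if some registered ioevent consumed the write."""
    for ev in ioevents:
        if ev["addr"] == addr and ev["len"] == length:
            ev["hits"].append(val)   # the signal callback
            return True
    return False                     # -EOPNOTSUPP in the real io-bus walk

ev = {"addr": 0xc000, "len": 2, "hits": []}
assert deliver([ev], 0xc000, 2, 0x1234)
assert not deliver([ev], 0xc000, 4, 0x1234)   # wrong length: no match
assert ev["hits"] == [0x1234]
```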
[PATCH v2 4/4] KVM: add scatterlist support to xinterface
This allows a scatter-gather approach to IO, which will be useful for building high performance interfaces, like zero-copy and low-latency copy (avoiding multiple calls to copy_to/from). The interface is based on the existing scatterlist infrastructure. The caller is expected to pass in a scatterlist with its dma field populated with valid GPAs. The xinterface will then populate each entry by translating the GPA to a page*. The caller signifies completion by simply performing a put_page() on each page returned in the list. Signed-off-by: Gregory Haskins ghask...@novell.com --- include/linux/kvm_xinterface.h |4 ++ virt/kvm/xinterface.c | 72 2 files changed, 76 insertions(+), 0 deletions(-) diff --git a/include/linux/kvm_xinterface.h b/include/linux/kvm_xinterface.h index 684b6f8..eefb575 100644 --- a/include/linux/kvm_xinterface.h +++ b/include/linux/kvm_xinterface.h @@ -9,6 +9,7 @@ #include linux/kref.h #include linux/module.h #include linux/file.h +#include linux/scatterlist.h struct kvm_xinterface; struct kvm_xvmap; @@ -36,6 +37,9 @@ struct kvm_xinterface_ops { u64 addr, unsigned long len, unsigned long flags); + unsigned long (*sgmap)(struct kvm_xinterface *intf, + struct scatterlist *sgl, int nents, + unsigned long flags); void (*release)(struct kvm_xinterface *); }; diff --git a/virt/kvm/xinterface.c b/virt/kvm/xinterface.c index c356835..16729f6 100644 --- a/virt/kvm/xinterface.c +++ b/virt/kvm/xinterface.c @@ -467,6 +467,77 @@ fail: } +static unsigned long +xinterface_sgmap(struct kvm_xinterface *intf, +struct scatterlist *sgl, int nents, +unsigned long flags) +{ + struct _xinterface *_intf = to_intf(intf); + struct task_struct *p = _intf-task; + struct mm_struct *mm = _intf-mm; + struct kvm *kvm = _intf-kvm; + struct kvm_memory_slot *memslot = NULL; + boolkthread = !current-mm; + int ret; + struct scatterlist *sg; + int i; + + down_read(kvm-slots_lock); + + if (kthread) + use_mm(_intf-mm); + + for_each_sg(sgl, sg, nents, i) { + unsigned long gpa= 
sg_dma_address(sg); + unsigned long len= sg_dma_len(sg); + unsigned long gfn= gpa PAGE_SHIFT; + off_t offset = offset_in_page(gpa); + unsigned long hva; + struct page*pg; + + /* ensure that we do not have more than one page per entry */ + if ((PAGE_ALIGN(len + offset) PAGE_SHIFT) != 1) { + ret = -EINVAL; + break; + } + + /* check for a memslot-cache miss */ + if (!memslot + || gfn memslot-base_gfn + || gfn = memslot-base_gfn + memslot-npages) { + memslot = gfn_to_memslot(kvm, gfn); + if (!memslot) { + ret = -EFAULT; + break; + } + } + + hva = (memslot-userspace_addr + + (gfn - memslot-base_gfn) * PAGE_SIZE); + + if (kthread || current-mm == mm) + ret = get_user_pages_fast(hva, 1, 1, pg); + else + ret = get_user_pages(p, mm, hva, 1, 1, 0, pg, NULL); + + if (ret != 1) { + if (ret = 0) + ret = -EFAULT; + break; + } + + sg_set_page(sg, pg, len, offset); + ret = 0; + } + + if (kthread) + unuse_mm(_intf-mm); + + up_read(kvm-slots_lock); + + return ret; +} + static void xinterface_release(struct kvm_xinterface *intf) { @@ -483,6 +554,7 @@ struct kvm_xinterface_ops _xinterface_ops = { .copy_from = xinterface_copy_from, .vmap= xinterface_vmap, .ioevent = xinterface_ioevent, + .sgmap = xinterface_sgmap, .release = xinterface_release, }; -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
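The GPA bookkeeping in xinterface_sgmap above — splitting each scatterlist entry's guest physical address into a frame number plus page offset, and rejecting entries that span more than one page — reduces to the following arithmetic (Python sketch of the same checks, assuming 4 KiB pages):

```python
PAGE_SHIFT = 12
PAGE_SIZE = 1 << PAGE_SHIFT

def split_gpa(gpa):
    # gfn = gpa >> PAGE_SHIFT; offset = offset_in_page(gpa)
    return gpa >> PAGE_SHIFT, gpa & (PAGE_SIZE - 1)

def fits_one_page(offset, length):
    # mirrors: (PAGE_ALIGN(len + offset) >> PAGE_SHIFT) == 1
    aligned = (length + offset + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)
    return (aligned >> PAGE_SHIFT) == 1

gfn, off = split_gpa(0x12345)
assert (gfn, off) == (0x12, 0x345)
assert fits_one_page(0x345, 0x100)
assert not fits_one_page(0xf00, 0x200)   # crosses a page boundary
```

The memslot-cache check in the loop is the same idea one level up: reuse the previous slot as long as the gfn still falls inside [base_gfn, base_gfn + npages).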
Re: KVM: VMX: flush TLB with INVEPT on cpu migration
On Thu, 2009-10-01 at 19:16 -0300, Marcelo Tosatti wrote:

It is possible that stale EPTP-tagged mappings are used, if a vcpu migrates to a different pcpu. Set KVM_REQ_TLB_FLUSH in vmx_vcpu_load, when switching pcpus, which will invalidate both VPID and EPT mappings on the next vm-entry.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e86f1a6..97f4265 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -708,7 +708,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	if (vcpu->cpu != cpu) {
 		vcpu_clear(vmx);
 		kvm_migrate_timers(vcpu);
-		vpid_sync_vcpu_all(vmx);
+		set_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests);
 		local_irq_disable();
 		list_add(&vmx->local_vcpus_link,
 			 &per_cpu(vcpus_on_cpu, cpu));
--

This patch fixes my ept misconfig problem seen every so often while installing a sles11 guest.

thanks,
RP
[PATCH 1/2] KVM: x86: Refactor guest debug IOCTL handling
Much of so far vendor-specific code for setting up guest debug can
actually be handled by the generic code. This also fixes a minor deficit
in the SVM part /wrt processing KVM_GUESTDBG_ENABLE.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 arch/x86/include/asm/kvm_host.h |    4 ++--
 arch/x86/kvm/svm.c              |   14 ++
 arch/x86/kvm/vmx.c              |   18 +-
 arch/x86/kvm/x86.c              |   28 +---
 4 files changed, 26 insertions(+), 38 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 295c7c4..e7f8708 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -475,8 +475,8 @@ struct kvm_x86_ops {
 	void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu);
 	void (*vcpu_put)(struct kvm_vcpu *vcpu);
 
-	int (*set_guest_debug)(struct kvm_vcpu *vcpu,
-			       struct kvm_guest_debug *dbg);
+	void (*set_guest_debug)(struct kvm_vcpu *vcpu,
+				struct kvm_guest_debug *dbg);
 	int (*get_msr)(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata);
 	int (*set_msr)(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
 	u64 (*get_segment_base)(struct kvm_vcpu *vcpu, int seg);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 02a4269..279a2ae 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1065,26 +1065,16 @@ static void update_db_intercept(struct kvm_vcpu *vcpu)
 		vcpu->guest_debug = 0;
 }
 
-static int svm_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg)
+static void svm_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg)
 {
-	int old_debug = vcpu->guest_debug;
 	struct vcpu_svm *svm = to_svm(vcpu);
 
-	vcpu->guest_debug = dbg->control;
-
-	update_db_intercept(vcpu);
-
 	if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP)
 		svm->vmcb->save.dr7 = dbg->arch.debugreg[7];
 	else
 		svm->vmcb->save.dr7 = vcpu->arch.dr7;
 
-	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
-		svm->vmcb->save.rflags |= X86_EFLAGS_TF | X86_EFLAGS_RF;
-	else if (old_debug & KVM_GUESTDBG_SINGLESTEP)
-		svm->vmcb->save.rflags &= ~(X86_EFLAGS_TF | X86_EFLAGS_RF);
-
-	return 0;
+	update_db_intercept(vcpu);
 }
 
 static void load_host_msrs(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 97f4265..70020e5 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1096,30 +1096,14 @@ static void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
 	}
 }
 
-static int set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg)
+static void set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg)
 {
-	int old_debug = vcpu->guest_debug;
-	unsigned long flags;
-
-	vcpu->guest_debug = dbg->control;
-	if (!(vcpu->guest_debug & KVM_GUESTDBG_ENABLE))
-		vcpu->guest_debug = 0;
-
 	if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP)
 		vmcs_writel(GUEST_DR7, dbg->arch.debugreg[7]);
 	else
 		vmcs_writel(GUEST_DR7, vcpu->arch.dr7);
 
-	flags = vmcs_readl(GUEST_RFLAGS);
-	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
-		flags |= X86_EFLAGS_TF | X86_EFLAGS_RF;
-	else if (old_debug & KVM_GUESTDBG_SINGLESTEP)
-		flags &= ~(X86_EFLAGS_TF | X86_EFLAGS_RF);
-	vmcs_writel(GUEST_RFLAGS, flags);
-
 	update_exception_bitmap(vcpu);
-
-	return 0;
 }
 
 static __init int cpu_has_kvm_support(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ffccb5c..aa5d574 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4470,12 +4470,19 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
 					struct kvm_guest_debug *dbg)
 {
-	int i, r;
+	unsigned long rflags;
+	int old_debug;
+	int i;
 
 	vcpu_load(vcpu);
 
-	if ((dbg->control & (KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW_BP)) ==
-	    (KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW_BP)) {
+	old_debug = vcpu->guest_debug;
+
+	vcpu->guest_debug = dbg->control;
+	if (!(vcpu->guest_debug & KVM_GUESTDBG_ENABLE))
+		vcpu->guest_debug = 0;
+
+	if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) {
 		for (i = 0; i < KVM_NR_DB_REGS; ++i)
 			vcpu->arch.eff_db[i] = dbg->arch.debugreg[i];
 		vcpu->arch.switch_db_regs =
@@ -4486,16 +4493,23 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
 		vcpu->arch.switch_db_regs = (vcpu->arch.dr7 & DR7_BP_EN_MASK);
 	}
 
-	r = kvm_x86_ops->set_guest_debug(vcpu, dbg);
+	rflags = kvm_x86_ops->get_rflags(vcpu);
+	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
+
[PATCH 2/2] KVM: x86: Preserve guest single-stepping on register
Give user space more flexibility /wrt its IOCTL order. So far updating
the rflags via KVM_SET_REGS ignored potentially set single-step flags.
Now they will be kept.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 arch/x86/kvm/x86.c |    7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index aa5d574..9fbb4c8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3853,6 +3853,8 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 
 int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
+	unsigned long rflags;
+
 	vcpu_load(vcpu);
 
 	kvm_register_write(vcpu, VCPU_REGS_RAX, regs->rax);
@@ -3876,8 +3878,11 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 #endif
 
 	kvm_rip_write(vcpu, regs->rip);
-	kvm_x86_ops->set_rflags(vcpu, regs->rflags);
+	rflags = regs->rflags;
+	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
+		rflags |= X86_EFLAGS_TF | X86_EFLAGS_RF;
+	kvm_x86_ops->set_rflags(vcpu, rflags);
 
 	vcpu->arch.exception.pending = false;
Re: [PATCH][retry 2] Support Pause Filter in AMD processors
On Fri, Oct 02, 2009 at 02:49:59PM -0500, Mark Langsdorf wrote:

+static int pause_interception(struct vcpu_svm *svm)
+{
+	static int pause_count = 0;
+
+	kvm_vcpu_on_spin(&(svm->vcpu));
+printk(KERN_ERR "MJLL pause intercepted %d\n", ++pause_count);

Debugging leftover?

+	return 1;
+}
Re: [Qemu-devel] Release plan for 0.12.0
Anthony Liguori wrote:

Hi,

Now that 0.11.0 is behind us, it's time to start thinking about 0.12.0. I'd like to do a few things differently this time around. I don't think the -rc process went very well, as I don't think we got more testing out of it. I'd like to shorten the timeline for 0.12.0 a good bit. The 0.10 stable tree got pretty difficult to maintain toward the end of the cycle. We also had a pretty huge amount of change between 0.10 and 0.11, so I think a shorter cycle is warranted.

I think aiming for early to mid-December would give us roughly a 3-month cycle and would align well with some of the Linux distribution cycles. I'd like to limit things to a single -rc that lasts only about a week. This is enough time to fix most of the obvious issues, I think.

I'd also like to try to enumerate some features for this release. Here's a short list of things I expect to see for this release (target-i386 centric). Please add or comment on items that you'd either like to see in the release or are planning on working on.

 o VMState conversion -- I expect most of the pc target to be completed
 o qdev conversion -- I hope that we'll get most of the pc target completely converted to qdev
 o storage live migration
 o switch to SeaBIOS (need to finish porting features from Bochs)
 o switch to gPXE (need to resolve slirp tftp server issue)
 o KSM integration
 o in-kernel APIC support for KVM
 o guest SMP support for KVM
 o updates to the default pc machine type

Please add to this list and I'll collect it all and post it somewhere.

 o NEC PC-9821 family support on target-i386

With the latest patch, MS-DOS can boot on QEMU. I think I can add support for the NIC (LGY-98) and IDE in 0.12.0, and I hope I can boot FreeBSD/pc98 on it.

PS. I will repost the v3 patch next week; please hold off on reviewing the v2 patch I posted Oct. 1.

Thanks,
TAKEDA, toshiya

Thanks!
--
Regards,

Anthony Liguori
(no subject)
subscribe kvm