[PATCH v5 REPOST 6/6] hw_random: don't init list element we're about to add to list.
From: Rusty Russell <ru...@rustcorp.com.au>

Another interesting anti-pattern.

Signed-off-by: Rusty Russell <ru...@rustcorp.com.au>
---
 drivers/char/hw_random/core.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index a9286bf..4d13ac5 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -489,7 +489,6 @@ int hwrng_register(struct hwrng *rng)
 			goto out_unlock;
 		}
 	}
-	INIT_LIST_HEAD(&rng->list);
 	list_add_tail(&rng->list, &rng_list);

 	if (old_rng && !rng->init) {
--
1.9.3

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
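Why the INIT_LIST_HEAD() call removed above is redundant can be seen in a small userspace model. The sketch below is not the kernel's list.h; the struct layout, init_list_head() and the helper add_uninitialised_node_ok() are stand-ins assumed for illustration. It shows that adding at the tail overwrites both pointers of the new entry, so whatever they held beforehand does not matter.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list is a head that points at itself in both directions. */
static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/*
 * Insert entry just before head, i.e. at the tail of the list.
 * Both of entry's pointers are assigned here, so their previous
 * contents are never read.
 */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	struct list_head *last = head->prev;

	entry->next = head;
	entry->prev = last;
	last->next = entry;
	head->prev = entry;
}

/* Returns 1 if a node that was never initialised links in correctly. */
static int add_uninitialised_node_ok(void)
{
	struct list_head head, node;

	init_list_head(&head);
	node.next = NULL;	/* deliberately not a valid list state */
	node.prev = NULL;
	list_add_tail(&node, &head);

	return head.next == &node && head.prev == &node &&
	       node.next == &head && node.prev == &head;
}
```

Only the list head needs initialising; add_uninitialised_node_ok() returns 1 even though the node started with NULL pointers.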
[PATCH v5 REPOST 3/6] hw_random: use reference counts on each struct hwrng.
From: Rusty Russell <ru...@rustcorp.com.au>

current_rng holds one reference, and we bump it every time we want
to do a read from it.

This means we only hold the rng_mutex to grab or drop a reference,
so accessing /sys/devices/virtual/misc/hw_random/rng_current doesn't
block on read of /dev/hwrng.

Using a kref is overkill (we're always under the rng_mutex), but
a standard pattern.

This also solves the problem that the hwrng_fillfn thread was
accessing current_rng without a lock, which could change (eg. to NULL)
underneath it.

v5: drop redundant kref_init()
v4: decrease last reference for triggering the cleanup
v3: initialize kref (thanks Amos Kong)
v2: fix missing put_rng() on exit path (thanks Amos Kong)

Signed-off-by: Rusty Russell <ru...@rustcorp.com.au>
Signed-off-by: Amos Kong <ak...@redhat.com>
---
 drivers/char/hw_random/core.c | 135 --
 include/linux/hw_random.h     |   2 +
 2 files changed, 94 insertions(+), 43 deletions(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index a0905c8..83516cb 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -42,6 +42,7 @@
 #include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/random.h>
+#include <linux/err.h>
 #include <asm/uaccess.h>


@@ -91,6 +92,60 @@ static void add_early_randomness(struct hwrng *rng)
 		add_device_randomness(bytes, bytes_read);
 }

+static inline void cleanup_rng(struct kref *kref)
+{
+	struct hwrng *rng = container_of(kref, struct hwrng, ref);
+
+	if (rng->cleanup)
+		rng->cleanup(rng);
+}
+
+static void set_current_rng(struct hwrng *rng)
+{
+	BUG_ON(!mutex_is_locked(&rng_mutex));
+	kref_get(&rng->ref);
+	current_rng = rng;
+}
+
+static void drop_current_rng(void)
+{
+	BUG_ON(!mutex_is_locked(&rng_mutex));
+	if (!current_rng)
+		return;
+
+	/* decrease last reference for triggering the cleanup */
+	kref_put(&current_rng->ref, cleanup_rng);
+	current_rng = NULL;
+}
+
+/* Returns ERR_PTR(), NULL or refcounted hwrng */
+static struct hwrng *get_current_rng(void)
+{
+	struct hwrng *rng;
+
+	if (mutex_lock_interruptible(&rng_mutex))
+		return ERR_PTR(-ERESTARTSYS);
+
+	rng = current_rng;
+	if (rng)
+		kref_get(&rng->ref);
+
+	mutex_unlock(&rng_mutex);
+	return rng;
+}
+
+static void put_rng(struct hwrng *rng)
+{
+	/*
+	 * Hold rng_mutex here so we serialize in case they set_current_rng
+	 * on rng again immediately.
+	 */
+	mutex_lock(&rng_mutex);
+	if (rng)
+		kref_put(&rng->ref, cleanup_rng);
+	mutex_unlock(&rng_mutex);
+}
+
 static inline int hwrng_init(struct hwrng *rng)
 {
 	if (rng->init) {
@@ -113,12 +168,6 @@ static inline int hwrng_init(struct hwrng *rng)
 	return 0;
 }

-static inline void hwrng_cleanup(struct hwrng *rng)
-{
-	if (rng && rng->cleanup)
-		rng->cleanup(rng);
-}
-
 static int rng_dev_open(struct inode *inode, struct file *filp)
 {
 	/* enforce read-only access to this chrdev */
@@ -154,21 +203,22 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 	ssize_t ret = 0;
 	int err = 0;
 	int bytes_read, len;
+	struct hwrng *rng;

 	while (size) {
-		if (mutex_lock_interruptible(&rng_mutex)) {
-			err = -ERESTARTSYS;
+		rng = get_current_rng();
+		if (IS_ERR(rng)) {
+			err = PTR_ERR(rng);
 			goto out;
 		}
-
-		if (!current_rng) {
+		if (!rng) {
 			err = -ENODEV;
-			goto out_unlock;
+			goto out;
 		}

 		mutex_lock(&reading_mutex);
 		if (!data_avail) {
-			bytes_read = rng_get_data(current_rng, rng_buffer,
+			bytes_read = rng_get_data(rng, rng_buffer,
 				rng_buffer_size(),
 				!(filp->f_flags & O_NONBLOCK));
 			if (bytes_read < 0) {
@@ -200,8 +250,8 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 			ret += len;
 		}

-		mutex_unlock(&rng_mutex);
 		mutex_unlock(&reading_mutex);
+		put_rng(rng);

 		if (need_resched())
 			schedule_timeout_interruptible(1);
@@ -213,12 +263,11 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 	}
 out:
 	return ret ? : err;
-out_unlock:
-	mutex_unlock(&rng_mutex);
-	goto out;
+
 out_unlock_reading:
 	mutex_unlock(&reading_mutex);
-	goto out_unlock;
+	put_rng(rng);
+	goto out;
 }


@@ -257,8 +306,8 @@ static ssize_t hwrng_attr_current_store(struct device *dev,
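The get/put discipline this patch introduces can be modelled outside the kernel. The following is a minimal sketch under stated assumptions, not the patch's code: a plain int guarded by a pthread mutex stands in for the kref, the names struct rng, set_current(), drop_current(), get_current() and reader_outlives_drop() are hypothetical, and unlike set_current_rng() the setter here takes the lock itself rather than asserting that the caller holds it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct rng {
	int refs;		/* protected by rng_lock */
	int cleaned_up;		/* set when the last reference goes away */
};

static pthread_mutex_t rng_lock = PTHREAD_MUTEX_INITIALIZER;
static struct rng *current_rng;	/* protected by rng_lock */

/* Drop one reference; caller must hold rng_lock. */
static void put_locked(struct rng *r)
{
	assert(r->refs > 0);
	if (--r->refs == 0)
		r->cleaned_up = 1;	/* stands in for the cleanup callback */
}

/* The "current" pointer itself owns one reference. */
static void set_current(struct rng *r)
{
	pthread_mutex_lock(&rng_lock);
	r->refs++;
	current_rng = r;
	pthread_mutex_unlock(&rng_lock);
}

static void drop_current(void)
{
	pthread_mutex_lock(&rng_lock);
	if (current_rng) {
		put_locked(current_rng);
		current_rng = NULL;
	}
	pthread_mutex_unlock(&rng_lock);
}

/* Returns NULL or a device with one extra reference held for the caller. */
static struct rng *get_current(void)
{
	struct rng *r;

	pthread_mutex_lock(&rng_lock);
	r = current_rng;
	if (r)
		r->refs++;
	pthread_mutex_unlock(&rng_lock);
	return r;
}

static void put_rng(struct rng *r)
{
	if (!r)
		return;
	pthread_mutex_lock(&rng_lock);
	put_locked(r);
	pthread_mutex_unlock(&rng_lock);
}

/* A reader's reference keeps the device alive across drop_current(). */
static int reader_outlives_drop(void)
{
	struct rng dev = { 0, 0 };
	struct rng *r;
	int alive_during_read;

	set_current(&dev);		/* refs == 1 */
	r = get_current();		/* refs == 2, reader starts */
	drop_current();			/* refs == 1, device unselected */
	alive_during_read = !dev.cleaned_up;
	put_rng(r);			/* refs == 0, cleanup runs */

	return alive_during_read && dev.cleaned_up && get_current() == NULL;
}
```

The lock is held only while the counter changes, never across the read itself, which is the property the commit message relies on for keeping rng_current readable.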
[PATCH v5 REPOST 2/6] hw_random: move some code out mutex_lock for avoiding underlying deadlock
In the next patch we use reference counting for each struct hwrng, and
changing the reference count also requires taking rng_mutex. If we try
to stop a kthread that is waiting for that lock in order to drop its
reference before we have released the lock ourselves, a deadlock will
occur.

Signed-off-by: Amos Kong <ak...@redhat.com>
---
 drivers/char/hw_random/core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index b1b6042..a0905c8 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -474,12 +474,12 @@ void hwrng_unregister(struct hwrng *rng)
 		}
 	}
 	if (list_empty(&rng_list)) {
+		mutex_unlock(&rng_mutex);
 		unregister_miscdev();
 		if (hwrng_fill)
 			kthread_stop(hwrng_fill);
-	}
-
-	mutex_unlock(&rng_mutex);
+	} else
+		mutex_unlock(&rng_mutex);
 }
 EXPORT_SYMBOL_GPL(hwrng_unregister);

--
1.9.3
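The ordering problem described in this patch is easy to reproduce in userspace. The sketch below is an illustration only, assuming pthreads in place of kthreads; state_lock, worker() and stop_worker_safely() are hypothetical names. The worker needs the lock before it can finish, so the caller has to release the lock before waiting for the worker to exit.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static int worker_ran;		/* protected by state_lock */

/* The worker cannot finish until it has taken state_lock. */
static void *worker(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&state_lock);
	worker_ran = 1;
	pthread_mutex_unlock(&state_lock);
	return NULL;
}

/* Returns 0 if the worker was stopped without deadlocking. */
static int stop_worker_safely(void)
{
	pthread_t tid;

	pthread_mutex_lock(&state_lock);
	if (pthread_create(&tid, NULL, worker, NULL) != 0) {
		pthread_mutex_unlock(&state_lock);
		return -1;
	}

	/*
	 * Release the lock before waiting. Calling pthread_join() here
	 * while still holding state_lock would never return, because
	 * the worker is blocked on that same lock.
	 */
	pthread_mutex_unlock(&state_lock);

	if (pthread_join(tid, NULL) != 0)
		return -1;

	return worker_ran ? 0 : -1;
}
```

Moving the pthread_join() call above the unlock gives the userspace equivalent of calling kthread_stop() with rng_mutex held.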
[PATCH v5 REPOST 1/6] hw_random: place mutex around read functions and buffers.
From: Rusty Russell <ru...@rustcorp.com.au>

There's currently a big lock around everything, and it means that we
can't query sysfs (eg /sys/devices/virtual/misc/hw_random/rng_current)
while the rng is reading. This is a real problem when the rng is slow,
or blocked (eg. virtio_rng with qemu's default /dev/random backend).

This doesn't help (it leaves the current lock untouched), just adds a
lock to protect the read function and the static buffers, in preparation
for transition.

Signed-off-by: Rusty Russell <ru...@rustcorp.com.au>
---
 drivers/char/hw_random/core.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index aa30a25..b1b6042 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -53,7 +53,10 @@
 static struct hwrng *current_rng;
 static struct task_struct *hwrng_fill;
 static LIST_HEAD(rng_list);
+/* Protects rng_list and current_rng */
 static DEFINE_MUTEX(rng_mutex);
+/* Protects rng read functions, data_avail, rng_buffer and rng_fillbuf */
+static DEFINE_MUTEX(reading_mutex);
 static int data_avail;
 static u8 *rng_buffer, *rng_fillbuf;
 static unsigned short current_quality;
@@ -81,7 +84,9 @@ static void add_early_randomness(struct hwrng *rng)
 	unsigned char bytes[16];
 	int bytes_read;

+	mutex_lock(&reading_mutex);
 	bytes_read = rng_get_data(rng, bytes, sizeof(bytes), 1);
+	mutex_unlock(&reading_mutex);
 	if (bytes_read > 0)
 		add_device_randomness(bytes, bytes_read);
 }
@@ -128,6 +133,7 @@ static inline int rng_get_data(struct hwrng *rng, u8 *buffer, size_t size,
 			int wait) {
 	int present;

+	BUG_ON(!mutex_is_locked(&reading_mutex));
 	if (rng->read)
 		return rng->read(rng, (void *)buffer, size, wait);

@@ -160,13 +166,14 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 			goto out_unlock;
 		}

+		mutex_lock(&reading_mutex);
 		if (!data_avail) {
 			bytes_read = rng_get_data(current_rng, rng_buffer,
 				rng_buffer_size(),
 				!(filp->f_flags & O_NONBLOCK));
 			if (bytes_read < 0) {
 				err = bytes_read;
-				goto out_unlock;
+				goto out_unlock_reading;
 			}
 			data_avail = bytes_read;
 		}
@@ -174,7 +181,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 		if (!data_avail) {
 			if (filp->f_flags & O_NONBLOCK) {
 				err = -EAGAIN;
-				goto out_unlock;
+				goto out_unlock_reading;
 			}
 		} else {
 			len = data_avail;
@@ -186,7 +193,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 			if (copy_to_user(buf + ret, rng_buffer + data_avail,
 								len)) {
 				err = -EFAULT;
-				goto out_unlock;
+				goto out_unlock_reading;
 			}

 			size -= len;
@@ -194,6 +201,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 		}

 		mutex_unlock(&rng_mutex);
+		mutex_unlock(&reading_mutex);

 		if (need_resched())
 			schedule_timeout_interruptible(1);
@@ -208,6 +216,9 @@ out:
 out_unlock:
 	mutex_unlock(&rng_mutex);
 	goto out;
+out_unlock_reading:
+	mutex_unlock(&reading_mutex);
+	goto out_unlock;
 }


@@ -348,13 +359,16 @@ static int hwrng_fillfn(void *unused)
 	while (!kthread_should_stop()) {
 		if (!current_rng)
 			break;
+		mutex_lock(&reading_mutex);
 		rc = rng_get_data(current_rng, rng_fillbuf,
 				  rng_buffer_size(), 1);
+		mutex_unlock(&reading_mutex);
 		if (rc <= 0) {
 			pr_warn("hwrng: no data available\n");
 			msleep_interruptible(1);
 			continue;
 		}
+		/* Outside lock, sure, but y'know: randomness. */
 		add_hwgenerator_randomness((void *)rng_fillbuf, rc,
 					   rc * current_quality * 8 >> 10);
 	}
--
1.9.3
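The goal this patch prepares for, one lock for "which device is current" and a separate one for the read path, can be sketched in userspace. This models the end state of the series rather than this patch on its own (here rng_mutex is still held during reads). The names state_lock, reading_lock, query_current() and the device name string are assumptions for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Protects the choice of current device (models rng_mutex). */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
/* Protects the read path and its buffers (models reading_mutex). */
static pthread_mutex_t reading_lock = PTHREAD_MUTEX_INITIALIZER;

static const char *current_name = "virtio_rng.0";	/* hypothetical */

/*
 * Models a sysfs query: it only needs state_lock. Returns NULL if
 * state_lock is busy instead of blocking, so the caller can tell
 * whether the query would have stalled.
 */
static const char *query_current(void)
{
	const char *name;

	if (pthread_mutex_trylock(&state_lock) != 0)
		return NULL;
	name = current_name;
	pthread_mutex_unlock(&state_lock);
	return name;
}

/* Returns 1 if the query succeeds while a read is in progress. */
static int query_works_during_read(void)
{
	const char *name;

	pthread_mutex_lock(&reading_lock);	/* a slow read is running */
	name = query_current();
	pthread_mutex_unlock(&reading_lock);

	return name != NULL;
}
```

With a single big lock the query would have to wait for the read to finish; with the split, query_works_during_read() returns 1.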
[PATCH v5 REPOST 0/6] fix hw_random stuck
When I hotunplug a busy virtio-rng device or try to access hwrng
attributes in a non-SMP guest, it gets stuck.

My hotplug tests:

| test 0:
|   hotunplug rng device from qemu monitor
|
| test 1:
|   guest) # dd if=/dev/hwrng of=/dev/null &
|   hotunplug rng device from qemu monitor
|
| test 2:
|   guest) # dd if=/dev/random of=/dev/null &
|   hotunplug rng device from qemu monitor
|
| test 4:
|   guest) # dd if=/dev/hwrng of=/dev/null &
|   cat /sys/devices/virtual/misc/hw_random/rng_*
|
| test 5:
|   guest) # dd if=/dev/hwrng of=/dev/null
|   cancel dd process after 10 seconds
|   guest) # dd if=/dev/hwrng of=/dev/null &
|   hotunplug rng device from qemu monitor
|
| test 6:
|   use a fifo as rng backend, execute test 0 ~ 5 with no input of fifo

V5: reset cleanup_done flag, drop redundant init of reference count,
    use compiler barrier to prevent reordering.
V4: update patch 4 to fix corruption, decrease last reference for
    triggering the cleanup, fix unregister race pointed out by Herbert
V3: initialize kref to 1
V2: added patch 2 to fix a deadlock, update current patch 3 to fix
    reference counting issue

Amos Kong (1):
  hw_random: move some code out mutex_lock for avoiding underlying deadlock

Rusty Russell (5):
  hw_random: place mutex around read functions and buffers.
  hw_random: use reference counts on each struct hwrng.
  hw_random: fix unregister race.
  hw_random: don't double-check old_rng.
  hw_random: don't init list element we're about to add to list.

 drivers/char/hw_random/core.c | 173 ++
 include/linux/hw_random.h     |   3 +
 2 files changed, 126 insertions(+), 50 deletions(-)

--
1.9.3
[PATCH v5 REPOST 5/6] hw_random: don't double-check old_rng.
From: Rusty Russell <ru...@rustcorp.com.au>

Interesting anti-pattern.

Signed-off-by: Rusty Russell <ru...@rustcorp.com.au>
---
 drivers/char/hw_random/core.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 067270b..a9286bf 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -476,14 +476,13 @@ int hwrng_register(struct hwrng *rng)
 	}

 	old_rng = current_rng;
+	err = 0;
 	if (!old_rng) {
 		err = hwrng_init(rng);
 		if (err)
 			goto out_unlock;
 		set_current_rng(rng);
-	}
-	err = 0;
-	if (!old_rng) {
+
 		err = register_miscdev();
 		if (err) {
 			drop_current_rng();
--
1.9.3
[PATCH v5 REPOST 4/6] hw_random: fix unregister race.
From: Rusty Russell <ru...@rustcorp.com.au>

The previous patch added one potential problem: we can still be
reading from a hwrng when it's unregistered. Add a wait for zero
in the hwrng_unregister path.

v5: reset cleanup_done flag, use compiler barrier to prevent reordering.
v4: add cleanup_done flag to ensure that cleanup is done

Signed-off-by: Rusty Russell <ru...@rustcorp.com.au>
Signed-off-by: Amos Kong <ak...@redhat.com>
---
 drivers/char/hw_random/core.c | 12
 include/linux/hw_random.h     |  1 +
 2 files changed, 13 insertions(+)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 83516cb..067270b 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -60,6 +60,7 @@ static DEFINE_MUTEX(rng_mutex);
 static DEFINE_MUTEX(reading_mutex);
 static int data_avail;
 static u8 *rng_buffer, *rng_fillbuf;
+static DECLARE_WAIT_QUEUE_HEAD(rng_done);
 static unsigned short current_quality;
 static unsigned short default_quality; /* = 0; default to off */

@@ -98,6 +99,11 @@ static inline void cleanup_rng(struct kref *kref)

 	if (rng->cleanup)
 		rng->cleanup(rng);
+
+	/* cleanup_done should be updated after cleanup finishes */
+	smp_wmb();
+	rng->cleanup_done = true;
+	wake_up_all(&rng_done);
 }

 static void set_current_rng(struct hwrng *rng)
@@ -498,6 +504,8 @@ int hwrng_register(struct hwrng *rng)
 		add_early_randomness(rng);
 	}

+	rng->cleanup_done = false;
+
 out_unlock:
 	mutex_unlock(&rng_mutex);
 out:
@@ -529,6 +537,10 @@ void hwrng_unregister(struct hwrng *rng)
 			kthread_stop(hwrng_fill);
 	} else
 		mutex_unlock(&rng_mutex);
+
+	/* Just in case rng is reading right now, wait. */
+	wait_event(rng_done, rng->cleanup_done &&
+		   atomic_read(&rng->ref.refcount) == 0);
 }
 EXPORT_SYMBOL_GPL(hwrng_unregister);

diff --git a/include/linux/hw_random.h b/include/linux/hw_random.h
index c212e71..7832e50 100644
--- a/include/linux/hw_random.h
+++ b/include/linux/hw_random.h
@@ -46,6 +46,7 @@ struct hwrng {
 	/* internal. */
 	struct list_head list;
 	struct kref ref;
+	bool cleanup_done;
 };

 /** Register a new Hardware Random Number Generator driver. */
--
1.9.3
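The "wait until the last reader is gone" step can be approximated in userspace with a condition variable. This is a sketch under assumptions, not the kernel mechanism: pthread_cond_wait() replaces wait_event(), the mutex provides the ordering that smp_wmb() provides in the patch, and struct dev, dev_put(), reader() and unregister_waits_for_reader() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct dev {
	pthread_mutex_t lock;
	pthread_cond_t done;	/* models the rng_done wait queue */
	int refs;		/* protected by lock */
	int cleanup_done;	/* protected by lock */
};

/* Drop a reference; the last one runs cleanup and wakes waiters. */
static void dev_put(struct dev *d)
{
	pthread_mutex_lock(&d->lock);
	if (--d->refs == 0) {
		/* the device's cleanup callback would run here */
		d->cleanup_done = 1;
		pthread_cond_broadcast(&d->done);
	}
	pthread_mutex_unlock(&d->lock);
}

/* A reader that finishes its read and releases its reference. */
static void *reader(void *arg)
{
	dev_put(arg);
	return NULL;
}

/* Returns 0 once the reader has released the device. */
static int unregister_waits_for_reader(void)
{
	struct dev d;
	pthread_t tid;

	pthread_mutex_init(&d.lock, NULL);
	pthread_cond_init(&d.done, NULL);
	d.refs = 1;		/* the reader's reference */
	d.cleanup_done = 0;

	if (pthread_create(&tid, NULL, reader, &d) != 0)
		return -1;

	/* Just in case the reader is still busy, wait for cleanup. */
	pthread_mutex_lock(&d.lock);
	while (!d.cleanup_done)
		pthread_cond_wait(&d.done, &d.lock);
	pthread_mutex_unlock(&d.lock);

	pthread_join(tid, NULL);
	pthread_cond_destroy(&d.done);
	pthread_mutex_destroy(&d.lock);

	return (d.refs == 0 && d.cleanup_done) ? 0 : -1;
}
```

The while loop around pthread_cond_wait() re-checks the flag after every wakeup, which plays the same role as the condition argument of wait_event().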
Re: [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM
On Sa, 2014-12-06 at 12:17 +0800, Jike Song wrote:
> On 12/05/2014 04:50 PM, Gerd Hoffmann wrote:
> > A few comments on the kernel stuff (brief look so far, also
> > compile-tested only, intel gfx on my test machine is too old).
> >
> >  * Noticed the kernel bits don't even compile when configured as
> >    module. Everything (vgt, i915, kvm) must be compiled into the
> >    kernel.
>
> Yes, that's planned to be done along with separating hypervisor-related
> code from vgt.

Good.

> > What are the exact requirements for the device? Must it match the host
> > exactly, to not confuse the guest intel graphics driver? Or would
> > something more recent -- such as the q35 emulation qemu has -- be good
> > enough to make things work (assuming we add support for the
> > graphic-related pci config space registers there)?
>
> I don't know that is exactly needed, we also need to have Windows
> driver considered. However, I'm quite confident that, if things gonna
> work for IGD passthrough, it gonna work for GVT-g.

I'd suggest to focus on q35 emulation. q35 is new enough that a version
with integrated graphics exists, so the gap we have to close is *much*
smaller.

In case guests expect a northbridge matching the chipset generation of
the graphics device (which I'd expect is the case, after digging a bit
in the igd and agpgart linux driver code) I think we should add proper
device emulation for them, i.e. comply q35-pcihost with
sandybridge-pcihost + ivybridge-pcihost + haswell-pcihost instead of
just copying over the pci ids from the host. Most likely all those
variants can share most of the emulation code.

SeaBIOS then can just get support for these three northbridge variants,
so we don't need magic pci id switching hacks at all.

> > The patch also adds a dummy isa bridge at 0x1f. Similar question here:
> > What exactly is needed here? Would things work if we simply use the
> > q35 lpc device here?
>
> Ditto.

Ok. Let's try to just use the q35 emulation + q35 lpc device then
instead of adding a second dummy lpc device.

cheers,
  Gerd
Re: [v2 18/25] KVM: kvm-vfio: implement the VFIO skeleton for VT-d Posted-Interrupts
On 12/08/2014 06:12 AM, Alex Williamson wrote: On Mon, 2014-12-08 at 04:58 +, Wu, Feng wrote: -Original Message- From: Eric Auger [mailto:eric.au...@linaro.org] Sent: Thursday, December 04, 2014 11:36 PM To: Wu, Feng; t...@linutronix.de; mi...@redhat.com; h...@zytor.com; x...@kernel.org; g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org; j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com Cc: linux-ker...@vger.kernel.org; io...@lists.linux-foundation.org; kvm@vger.kernel.org Subject: Re: [v2 18/25] KVM: kvm-vfio: implement the VFIO skeleton for VT-d Posted-Interrupts Hi Feng, On 12/03/2014 08:39 AM, Feng Wu wrote: This patch adds the kvm-vfio interface for VT-d Posted-Interrrupts. When guests updates MSI/MSI-x information for an assigned-device, update QEMU will use KVM_DEV_VFIO_DEVICE_POSTING_IRQ attribute to setup IRTE for VT-d PI. This patch implement this IRQ attribute. s/implement/implements Signed-off-by: Feng Wu feng...@intel.com --- include/linux/kvm_host.h | 19 virt/kvm/vfio.c | 103 ++ 2 files changed, 122 insertions(+), 0 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 5cd4420..8d06678 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1134,6 +1134,25 @@ static inline int kvm_arch_vfio_set_forward(struct kvm_fwd_irq *fwd_irq, } #endif +#ifdef __KVM_HAVE_ARCH_KVM_VFIO_POSTING +/* + * kvm_arch_vfio_update_pi_irte - set IRTE for Posted-Interrupts + * + * @kvm: kvm + * @host_irq: host irq of the interrupt + * @guest_irq: gsi of the interrupt + * returns 0 on success, 0 on failure + */ +int kvm_arch_vfio_update_pi_irte(struct kvm *kvm, unsigned int host_irq, + uint32_t guest_irq); +#else +static int kvm_arch_vfio_update_pi_irte(struct kvm *kvm, unsigned int host_irq, + uint32_t guest_irq) +{ + return 0; +} +#endif + #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT static inline void kvm_vcpu_set_in_spin_loop(struct kvm_vcpu *vcpu, bool val) diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c 
index 6bc7001..5e5515f 100644 --- a/virt/kvm/vfio.c +++ b/virt/kvm/vfio.c @@ -446,6 +446,99 @@ out: return ret; } +static int kvm_vfio_pci_get_irq_count(struct pci_dev *pdev, int irq_type) +{ + if (irq_type == VFIO_PCI_INTX_IRQ_INDEX) { + u8 pin; + + pci_read_config_byte(pdev, PCI_INTERRUPT_PIN, pin); + if (pin) + return 1; + } else if (irq_type == VFIO_PCI_MSI_IRQ_INDEX) + return pci_msi_vec_count(pdev); + else if (irq_type == VFIO_PCI_MSIX_IRQ_INDEX) + return pci_msix_vec_count(pdev); + + return 0; +} for platform case I was asked to move the retrieval of absolute irq number to the architecture specific part. I don't know if it should apply to PCI stuff as well? This explains why I need to pass the VFIO device (or struct device handle) to the arch specific part. Actually we do the same job, we provide a phys/virt IRQ mapping to KVM, right? So to me our architecture specific API should look quite similar? In my patch, QEMU passes IRQ type(MSI/MSIx in my case), VFIO device index, and sub-index via struct kvm_vfio_dev_irq to KVM, then KVM will find the real host irq from the VFIO device index and the IRQ type. Is this something similar with your patch? 
+ +static int kvm_vfio_set_pi(struct kvm_device *kdev, int32_t __user *argp) +{ + struct kvm_vfio_dev_irq pi_info; + uint32_t *gsi; + unsigned long minsz; + struct vfio_device *vdev; + struct msi_desc *entry; + struct device *dev; + struct pci_dev *pdev; + int i, max, ret; + + minsz = offsetofend(struct kvm_vfio_dev_irq, count); + + if (copy_from_user(pi_info, (void __user *)argp, minsz)) + return -EFAULT; + + if (pi_info.argsz minsz || pi_info.index = VFIO_PCI_NUM_IRQS) PCI specific check, same remark as above but I will let Alex further comment on this and possibly invalidate this commeny ;-) + return -EINVAL; + + vdev = kvm_vfio_get_vfio_device(pi_info.fd); + if (IS_ERR(vdev)) + return PTR_ERR(vdev); + + dev = kvm_vfio_external_base_device(vdev); + if (!dev || !dev_is_pci(dev)) { + ret = -EFAULT; + goto put_vfio_device; + } + + pdev = to_pci_dev(dev); + + max = kvm_vfio_pci_get_irq_count(pdev, pi_info.index); + if (max = 0) { + ret = -EFAULT; + goto put_vfio_device; + } + + if (pi_info.argsz - minsz pi_info.count * sizeof(int) || shouldn' we use the actual datatype? I am afraid I don't get this, could you please be more specific? Thanks a lot! We could have a platform that supports 64bit INTs. yes this is what I meant (struct datatype is __u32). Thanks Eric + pi_info.start = max || pi_info.start + pi_info.count max) { + ret =
Re: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM
On Mon, Dec 08, 2014 at 10:55:01AM +0100, Gerd Hoffmann wrote:
> On Sa, 2014-12-06 at 12:17 +0800, Jike Song wrote:
> > I don't know that is exactly needed, we also need to have Windows
> > driver considered. However, I'm quite confident that, if things gonna
> > work for IGD passthrough, it gonna work for GVT-g.
>
> I'd suggest to focus on q35 emulation. q35 is new enough that a version
> with integrated graphics exists, so the gap we have to close is *much*
> smaller.
>
> In case guests expect a northbridge matching the chipset generation of
> the graphics device (which I'd expect is the case, after digging a bit
> in the igd and agpgart linux driver code) I think we should add proper
> device emulation for them, i.e. comply q35-pcihost with
> sandybridge-pcihost + ivybridge-pcihost + haswell-pcihost instead of
> just copying over the pci ids from the host. Most likely all those
> variants can share most of the emulation code.

I don't think i915.ko should care about either the northbridge or the
pch on para-virtualized platforms. We do noodle around in there for the
oddball memory controller setting and for some display stuff, but
neither of those really applies to paravirtualized hw. And if there's
any case like that we should patch it out (like we do with some of the
runtime pm code already).
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: usb audio device troubles
Hi Eric, On 03-12-14 16:39, Eric S. Johansson wrote: On 12/3/2014 3:52 AM, Hans de Goede wrote: Eric are you using usb-host redirection, or Spice's usb network redir ? This little bit of time this morning learning about spice and the network redirection. It worked for about half an hour and then failed in the same way the host redirection failed. The audio device would appear for a while, I would try to use it and then it would disappear. The spice model has some very nice features and that I could, in theory, have a working speech recognition engine somewhere on my air quotescloud/air quotes and then be able to use it via spice on any desktop I happen to be located in front of. it would also work nicely with my original idea of putting a working KVM virtual machine on and an e-sata SSD external drive and be able to bring my working speech recognition environment with me without having cart a laptop. I hope you can see that this could be generalized into a nicely portable accessibility solution where the accessibility environment moves with the disabled user and removes the need to make every machine have user specific accessibility software and configuration. Yes, it does impose a requirement the KVM runs everywhere but, we know that's the future anyway so why fight it :-) Anyway, I think if we can solve this USB audio device problem then I'll be very happy and can make further progress towards my goal. Thank you so very much for the help so far and I hope we can fix this USB problem. To further figure out what is going on when the usb device disconnects we will need some logs. For starters lets look at the spice-client side, before starting virt-manager or virt-viewer do the following in the terminal: export LIBUSB_DEBUG=4 And then start the application from the terminal like e.g. 
this:

  virt-manager &> virt-man.log

Then do what you want to do with the usb headset until it disconnects,
and once it has disconnected quit and attach the generated virt-man.log
file to your next mail.

Regards,

Hans

p.s.

Below are instructions to gather logs on the qemu side. I do not need
those right now, first let's do the client side logs, but since I've
already looked up the instructions I thought it would be good to put
them in this mail:

By default, the libvirt qemu log under:

  /var/log/libvirt/qemu/<vm-name>.log

should contain some minimal usb logging. If native usb is properly set
up and a client connects which supports native usb, one would expect
messages like this to show up there:

  qemu-system-x86_64: usbredirparser: Peer version: spice-gtk 0.21, using 64-bits ids

If a message like the above does not show up then there is a problem
with the vm config, and more detailed debugging is not going to help.

If you're debugging problems with a certain device it may be helpful to
enable more verbose logging of usb traffic inside qemu. To do this, the
vm's libvirt xml file needs to be edited like this:

<domain type='qemu' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  ...
  <devices>
    ...
  </devices>
  <qemu:commandline>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.usbredir0.debug=4'/>
  </qemu:commandline>
</domain>

Note the first line needs to be changed and the qemu:commandline
section is new. If there are more usbredir devices inside the xml
(usually there are) then additional -set device.usbredir1.debug=4, etc.
arguments must be added, ie:

  <qemu:commandline>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.usbredir0.debug=4'/>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.usbredir1.debug=4'/>
  </qemu:commandline>

Note this kind of detailed logging is only useful when a certain device
fails; if no devices work at all there is usually a general
configuration problem.
Re: [RFC PATCH 3/5] KVM: ARM VGIC add kvm_io_bus_ frontend
On Fri, Dec 05, 2014 at 02:10:26PM +0200, Nikolay Nikolaev wrote: On Sat, Nov 29, 2014 at 3:54 PM, Nikolay Nikolaev n.nikol...@virtualopensystems.com wrote: On Sat, Nov 29, 2014 at 1:29 PM, Christoffer Dall christoffer.d...@linaro.org wrote: On Mon, Nov 24, 2014 at 11:26:58PM +0200, Nikolay Nikolaev wrote: In io_mem_abort remove the call to vgic_handle_mmio. The target is to have a single MMIO handling path - that is through the kvm_io_bus_ API. Register a kvm_io_device in kvm_vgic_init on the whole vGIC MMIO region. Both read and write calls are redirected to vgic_io_dev_access where kvm_exit_mmio is composed to pass it to vm_ops.handle_mmio. Signed-off-by: Nikolay Nikolaev n.nikol...@virtualopensystems.com --- arch/arm/kvm/mmio.c|3 -- include/kvm/arm_vgic.h |3 +- virt/kvm/arm/vgic.c| 88 3 files changed, 74 insertions(+), 20 deletions(-) diff --git a/arch/arm/kvm/mmio.c b/arch/arm/kvm/mmio.c index 81230da..1c44a2b 100644 --- a/arch/arm/kvm/mmio.c +++ b/arch/arm/kvm/mmio.c @@ -227,9 +227,6 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run, if (mmio.is_write) mmio_write_buf(mmio.data, mmio.len, data); - if (vgic_handle_mmio(vcpu, run, mmio)) - return 1; - if (handle_kernel_mmio(vcpu, run, mmio)) return 1; diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h index e452ef7..d9b7d2a 100644 --- a/include/kvm/arm_vgic.h +++ b/include/kvm/arm_vgic.h @@ -233,6 +233,7 @@ struct vgic_dist { unsigned long *irq_pending_on_cpu; struct vgic_vm_ops vm_ops; + struct kvm_io_device*io_dev; #endif }; @@ -307,8 +308,6 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num, bool level); void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg); int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu); -bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, - struct kvm_exit_mmio *mmio); #define irqchip_in_kernel(k) (!!((k)-arch.vgic.in_kernel)) #define vgic_initialized(k) ((k)-arch.vgic.ready) diff --git a/virt/kvm/arm/vgic.c 
b/virt/kvm/arm/vgic.c index 1213da5..3da1115 100644 --- a/virt/kvm/arm/vgic.c +++ b/virt/kvm/arm/vgic.c @@ -31,6 +31,9 @@ #include asm/kvm_emulate.h #include asm/kvm_arm.h #include asm/kvm_mmu.h +#include asm/kvm.h + +#include iodev.h /* * How the whole thing works (courtesy of Christoffer Dall): @@ -775,28 +778,81 @@ bool vgic_handle_mmio_range(struct kvm_vcpu *vcpu, struct kvm_run *run, return true; } -/** - * vgic_handle_mmio - handle an in-kernel MMIO access for the GIC emulation - * @vcpu: pointer to the vcpu performing the access - * @run: pointer to the kvm_run structure - * @mmio: pointer to the data describing the access - * - * returns true if the MMIO access has been performed in kernel space, - * and false if it needs to be emulated in user space. - * Calls the actual handling routine for the selected VGIC model. - */ -bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, - struct kvm_exit_mmio *mmio) +static int vgic_io_dev_access(struct kvm_vcpu *vcpu, struct kvm_io_device *this, + gpa_t addr, int len, void *val, bool is_write) { - if (!irqchip_in_kernel(vcpu-kvm)) - return false; + struct kvm_exit_mmio mmio; + bool ret; + + mmio = (struct kvm_exit_mmio) { + .phys_addr = addr, + .len = len, + .is_write = is_write, + }; + + if (is_write) + memcpy(mmio.data, val, len); /* * This will currently call either vgic_v2_handle_mmio() or * vgic_v3_handle_mmio(), which in turn will call * vgic_handle_mmio_range() defined above. */ - return vcpu-kvm-arch.vgic.vm_ops.handle_mmio(vcpu, run, mmio); + ret = vcpu-kvm-arch.vgic.vm_ops.handle_mmio(vcpu, vcpu-run, mmio); + + if (!is_write) + memcpy(val, mmio.data, len); + + return ret ? 
0 : 1; +} + +static int vgic_io_dev_read(struct kvm_vcpu *vcpu, struct kvm_io_device *this, + gpa_t addr, int len, void *val) +{ + return vgic_io_dev_access(vcpu, this, addr, len, val, false); +} + +static int vgic_io_dev_write(struct kvm_vcpu *vcpu, struct kvm_io_device *this, +gpa_t addr, int len, const void *val) +{ + return vgic_io_dev_access(vcpu, this, addr, len, (void *)val, true); +} + +static const struct kvm_io_device_ops vgic_io_dev_ops = { + .read = vgic_io_dev_read, + .write = vgic_io_dev_write, +}; + +static int
Re: [PATCH v2 0/6] Improve PSCI system events and fix reboot bugs
On 3 December 2014 at 21:18, Christoffer Dall <christoffer.d...@linaro.org> wrote:
> Several people have reported problems with rebooting ARM VMs, especially
> on 32-bit ARM. This is mainly due to the same reason we were seeing
> boot errors in the past, namely that the ram, dcache, and icache weren't
> coherent on guest boot with the guest (stage-1) MMU disabled. We solved
> this by ensuring coherency when we fault in pages, but since most memory
> is already mapped after a reboot, we don't do anything.
>
> The solution is to unmap the regular RAM on VCPU init, but we must
> take care to not unmap the GIC or other IO regions, hence the somewhat
> complicated solution.
>
> As part of figuring this out, it became clear that some semantics around
> the KVM_ARM_VCPU_INIT ABI and system event ABI was unclear (what is
> userspace expected to do when it receives a system event). This series
> also clarifies the ABI and changes the kernel functionality to do what
> userspace expects (turn off VCPUs on a system shutdown event).

Userspace ABI documentation clarifications:
Reviewed-by: Peter Maydell <peter.mayd...@linaro.org>

-- PMM
Re: [PATCH v2 1/6] arm/arm64: KVM: Don't clear the VCPU_POWER_OFF flag
On 03/12/14 21:18, Christoffer Dall wrote:
> If a VCPU was originally started with power off (typically to be brought
> up by PSCI in SMP configurations), there is no need to clear the
> POWER_OFF flag in the kernel, as this flag is only tested during the
> init ioctl itself.
>
> Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
> ---
>  arch/arm/kvm/arm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 9e193c8..b160bea 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -661,7 +661,7 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
>  	/*
>  	 * Handle the start in power-off case by marking the VCPU as paused.
>  	 */
> -	if (__test_and_clear_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
> +	if (test_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
>  		vcpu->arch.pause = true;
>
>  	return 0;

Acked-by: Marc Zyngier marc.zyng...@arm.com

M.
--
Jazz is not dead. It just smells funny...
Re: [PATCH v2 2/6] arm/arm64: KVM: Correct KVM_ARM_VCPU_INIT power off option
On 03/12/14 21:18, Christoffer Dall wrote:
> The implementation of KVM_ARM_VCPU_INIT is currently not doing what
> userspace expects, namely making sure that a vcpu which may have been
> turned off using PSCI is returned to its initial state, which would be
> powered on if userspace does not set the KVM_ARM_VCPU_POWER_OFF flag.
>
> Implement the expected functionality and clarify the ABI.
>
> Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
> ---
>  Documentation/virtual/kvm/api.txt | 3 ++-
>  arch/arm/kvm/arm.c                | 2 ++
>  2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 7610eaa..bb82a90 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -2455,7 +2455,8 @@ should be created before this ioctl is invoked.
>
>  Possible features:
>  	- KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state.
> -	  Depends on KVM_CAP_ARM_PSCI.
> +	  Depends on KVM_CAP_ARM_PSCI. If not set, the CPU will be powered on
> +	  and execute guest code when KVM_RUN is called.
>  	- KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
>  	  Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
>  	- KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index b160bea..edc1964 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -663,6 +663,8 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
>  	 */
>  	if (test_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
>  		vcpu->arch.pause = true;
> +	else
> +		vcpu->arch.pause = false;
>
>  	return 0;
>  }

Acked-by: Marc Zyngier marc.zyng...@arm.com

M.
--
Jazz is not dead. It just smells funny...
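For readers following patches 1/6 and 2/6 together, the intended behaviour of the pause flag can be sketched in userspace C. This is only a model under stated assumptions: `struct model_vcpu`, `model_vcpu_init()`, `model_psci_cpu_off()` and `VCPU_FEATURE_POWER_OFF` are hypothetical stand-ins, not kernel symbols, and the PSCI CPU_OFF step is assumed to do nothing more than park the vcpu.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the KVM_ARM_VCPU_POWER_OFF feature bit. */
enum { VCPU_FEATURE_POWER_OFF = 0 };

/* Hypothetical stand-in for the per-vcpu state the patches touch. */
struct model_vcpu {
	unsigned long features;	/* feature bits passed to the init call */
	bool pause;		/* true: the vcpu is parked and must not run */
};

/* Model of the tail of the init path after patches 1/6 and 2/6. */
static void model_vcpu_init(struct model_vcpu *vcpu, unsigned long features)
{
	vcpu->features = features;

	/* Patch 1/6: the flag is only tested, it is no longer cleared. */
	if (vcpu->features & (1UL << VCPU_FEATURE_POWER_OFF))
		vcpu->pause = true;
	else
		vcpu->pause = false;	/* Patch 2/6: re-init powers the vcpu on. */
}

/* Assumed model of a guest PSCI CPU_OFF call: the vcpu parks itself. */
static void model_psci_cpu_off(struct model_vcpu *vcpu)
{
	vcpu->pause = true;
}
```

In this model, a vcpu that was parked through PSCI and is then re-initialized without the power-off bit comes back runnable, which is the behaviour the commit message says userspace expects.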
Re: [PATCH v2 3/6] arm/arm64: KVM: Reset the HCR on each vcpu when resetting the vcpu
On 03/12/14 21:18, Christoffer Dall wrote: When userspace resets the vcpu using KVM_ARM_VCPU_INIT, we should also reset the HCR, because we now modify the HCR dynamically to enable/disable trapping of guest accesses to the VM registers. This is crucial for reboot of VMs working since otherwise we will not be doing the necessary cache maintenance operations when faulting in pages with the guest MMU off. Signed-off-by: Christoffer Dall christoffer.d...@linaro.org --- arch/arm/include/asm/kvm_emulate.h | 5 + arch/arm/kvm/arm.c | 2 ++ arch/arm/kvm/guest.c | 1 - arch/arm64/include/asm/kvm_emulate.h | 5 + arch/arm64/kvm/guest.c | 1 - 5 files changed, 12 insertions(+), 2 deletions(-) diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h index b9db269..66ce176 100644 --- a/arch/arm/include/asm/kvm_emulate.h +++ b/arch/arm/include/asm/kvm_emulate.h @@ -33,6 +33,11 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu); void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr); void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr); +static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu) +{ + vcpu-arch.hcr = HCR_GUEST_MASK; +} + static inline bool vcpu_mode_is_32bit(struct kvm_vcpu *vcpu) { return 1; diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index edc1964..24c9ca4 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -658,6 +658,8 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, if (ret) return ret; + vcpu_reset_hcr(vcpu); + /* * Handle the start in power-off case by marking the VCPU as paused. 
*/ diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c index cc0b787..8c97208 100644 --- a/arch/arm/kvm/guest.c +++ b/arch/arm/kvm/guest.c @@ -38,7 +38,6 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu) { - vcpu-arch.hcr = HCR_GUEST_MASK; return 0; } diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 5674a55..8127e45 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -38,6 +38,11 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu); void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr); void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr); +static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu) +{ + vcpu-arch.hcr_el2 = HCR_GUEST_FLAGS; +} + static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu) { return (unsigned long *)vcpu_gp_regs(vcpu)-regs.pc; diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index 7679469..84d5959 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -38,7 +38,6 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu) { - vcpu-arch.hcr_el2 = HCR_GUEST_FLAGS; return 0; } Acked-by: Marc Zyngier marc.zyng...@arm.com M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 4/6] arm/arm64: KVM: Clarify KVM_ARM_VCPU_INIT ABI
On 03/12/14 21:18, Christoffer Dall wrote: It is not clear that this ioctl can be called multiple times for a given vcpu. Userspace already does this, so clarify the ABI. Also specify that userspace is expected to always make secondary and subsequent calls to the ioctl with the same parameters for the VCPU as the initial call (which userspace also already does). Add code to check that userspace doesn't violate that ABI in the future, and move the kvm_vcpu_set_target() function which is currently duplicated between the 32-bit and 64-bit versions in guest.c to a common static function in arm.c, shared between both architectures. Signed-off-by: Christoffer Dall christoffer.d...@linaro.org --- Documentation/virtual/kvm/api.txt | 5 + arch/arm/include/asm/kvm_host.h | 2 -- arch/arm/kvm/arm.c| 43 +++ arch/arm/kvm/guest.c | 25 --- arch/arm64/include/asm/kvm_host.h | 2 -- arch/arm64/kvm/guest.c| 25 --- 6 files changed, 48 insertions(+), 54 deletions(-) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index bb82a90..81f1b97 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2453,6 +2453,11 @@ return ENOEXEC for that vcpu. Note that because some registers reflect machine topology, all vcpus should be created before this ioctl is invoked. +Userspace can call this function multiple times for a given vcpu, including +after the vcpu has been run. This will reset the vcpu to its initial +state. All calls to this function after the initial call must use the same +target and same set of feature flags, otherwise EINVAL will be returned. + Possible features: - KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state. Depends on KVM_CAP_ARM_PSCI. 
If not set, the CPU will be powered on diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h index 53036e2..254e065 100644 --- a/arch/arm/include/asm/kvm_host.h +++ b/arch/arm/include/asm/kvm_host.h @@ -150,8 +150,6 @@ struct kvm_vcpu_stat { u32 halt_wakeup; }; -int kvm_vcpu_set_target(struct kvm_vcpu *vcpu, - const struct kvm_vcpu_init *init); int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init); unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu); int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices); diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index 24c9ca4..4043769 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -263,6 +263,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) { /* Force users to call KVM_ARM_VCPU_INIT */ vcpu-arch.target = -1; + bitmap_zero(vcpu-arch.features, KVM_VCPU_MAX_FEATURES); /* Set up the timer */ kvm_timer_vcpu_init(vcpu); @@ -649,6 +650,48 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level, return -EINVAL; } +static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu, +const struct kvm_vcpu_init *init) +{ + unsigned int i; + int phys_target = kvm_target_cpu(); + + if (init-target != phys_target) + return -EINVAL; + + /* + * Secondary and subsequent calls to KVM_ARM_VCPU_INIT must + * use the same target. + */ + if (vcpu-arch.target != -1 vcpu-arch.target != init-target) + return -EINVAL; + + /* -ENOENT for unknown features, -EINVAL for invalid combinations. */ + for (i = 0; i sizeof(init-features) * 8; i++) { + bool set = (init-features[i / 32] (1 (i % 32))); + + if (set i = KVM_VCPU_MAX_FEATURES) + return -ENOENT; + + /* + * Secondary and subsequent calls to KVM_ARM_VCPU_INIT must + * use the same feature set. 
+ */ + if (vcpu-arch.target != -1 i KVM_VCPU_MAX_FEATURES + test_bit(i, vcpu-arch.features) != set) + return -EINVAL; + + if (set) + set_bit(i, vcpu-arch.features); + } + + vcpu-arch.target = phys_target; + + /* Now we know what it is, we can reset it. */ + return kvm_reset_vcpu(vcpu); +} + + static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, struct kvm_vcpu_init *init) { diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c index 8c97208..384bab6 100644 --- a/arch/arm/kvm/guest.c +++ b/arch/arm/kvm/guest.c @@ -273,31 +273,6 @@ int __attribute_const__ kvm_target_cpu(void) } } -int kvm_vcpu_set_target(struct kvm_vcpu *vcpu, - const struct kvm_vcpu_init *init) -{ - unsigned int i; - - /* We can only cope with guest==host and only on A15/A7 (for
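The "same target, same feature set" rule that patch 4/6 enforces can be sketched as a small userspace model. Everything named `model_*` or `MODEL_*` is hypothetical, the feature count and array size are arbitrary choices for the sketch, and the reset step at the end of the real function is left out. The error codes follow the ones visible in the quoted hunk.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_FEATURES	3	/* hypothetical number of known features */
#define MODEL_FEATURE_WORDS	7	/* hypothetical size of the feature array */

struct model_init {
	int target;
	uint32_t features[MODEL_FEATURE_WORDS];
};

struct model_vcpu {
	int target;		/* -1 until the first successful init */
	uint32_t features;	/* feature bits recorded by earlier calls */
};

/* Returns 0 on success or a negative errno value, like the quoted patch. */
static int model_set_target(struct model_vcpu *vcpu,
			    const struct model_init *init, int phys_target)
{
	unsigned int i;

	if (init->target != phys_target)
		return -EINVAL;

	/* Later calls must use the same target as the first one. */
	if (vcpu->target != -1 && vcpu->target != init->target)
		return -EINVAL;

	for (i = 0; i < sizeof(init->features) * 8; i++) {
		bool set = init->features[i / 32] & (1u << (i % 32));

		/* Unknown feature bit requested. */
		if (set && i >= MODEL_MAX_FEATURES)
			return -ENOENT;

		/* Later calls must request exactly the same feature bits. */
		if (vcpu->target != -1 && i < MODEL_MAX_FEATURES &&
		    (bool)(vcpu->features & (1u << i)) != set)
			return -EINVAL;

		if (set)
			vcpu->features |= 1u << i;
	}

	vcpu->target = phys_target;
	return 0;
}
```

Using `target == -1` as the "never initialized" marker is what lets a single function handle both the first call and every later call.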
Re: [PATCH v2 5/6] arm/arm64: KVM: Turn off vcpus on PSCI shutdown/reboot
On 03/12/14 21:18, Christoffer Dall wrote: When a vcpu calls SYSTEM_OFF or SYSTEM_RESET with PSCI v0.2, the vcpus should really be turned off for the VM adhering to the suggestions in the PSCI spec, and it's the sane thing to do. Also, clarify the behavior and expectations for exits to user space with the KVM_EXIT_SYSTEM_EVENT case. Signed-off-by: Christoffer Dall christoffer.d...@linaro.org --- Documentation/virtual/kvm/api.txt | 9 + arch/arm/kvm/psci.c | 19 +++ arch/arm64/include/asm/kvm_host.h | 1 + 3 files changed, 29 insertions(+) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 81f1b97..228f9cf 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2957,6 +2957,15 @@ HVC instruction based PSCI call from the vcpu. The 'type' field describes the system-level event type. The 'flags' field describes architecture specific flags for the system-level event. +Valid values for 'type' are: + KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the + VM. Userspace is not obliged to honour this, and if it does honour + this does not need to destroy the VM synchronously (ie it may call + KVM_RUN again before shutdown finally occurs). + KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM. + As with SHUTDOWN, userspace can choose to ignore the request, or + to schedule the reset to occur in the future and may call KVM_RUN again. + /* Fix the size of the union. */ char padding[256]; }; diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c index 09cf377..ae0bb91 100644 --- a/arch/arm/kvm/psci.c +++ b/arch/arm/kvm/psci.c @@ -15,6 +15,7 @@ * along with this program. If not, see http://www.gnu.org/licenses/. 
*/ +#include linux/preempt.h #include linux/kvm_host.h #include linux/wait.h @@ -166,6 +167,24 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu) static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type) { + int i; + struct kvm_vcpu *tmp; + + /* + * The KVM ABI specifies that a system event exit may call KVM_RUN + * again and may perform shutdown/reboot at a later time that when the + * actual request is made. Since we are implementing PSCI and a + * caller of PSCI reboot and shutdown expects that the system shuts + * down or reboots immediately, let's make sure that VCPUs are not run + * after this call is handled and before the VCPUs have been + * re-initialized. + */ + kvm_for_each_vcpu(i, tmp, vcpu-kvm) + tmp-arch.pause = true; + preempt_disable(); + force_vm_exit(cpu_all_mask); + preempt_enable(); + I'm slightly uneasy about this force_vm_exit, as this is something that is directly triggered by the guest. I suppose it is almost impossible to find out which CPUs we're actually using... memset(vcpu-run-system_event, 0, sizeof(vcpu-run-system_event)); vcpu-run-system_event.type = type; vcpu-run-exit_reason = KVM_EXIT_SYSTEM_EVENT; diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 65c6152..0b7dfdb 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -198,6 +198,7 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void); struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void); u64 kvm_call_hyp(void *hypfn, ...); +void force_vm_exit(const cpumask_t *mask); int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run, int exception_index); Other than that, Acked-by: Marc Zyngier marc.zyng...@arm.com M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 6/6] arm/arm64: KVM: Introduce stage2_unmap_vm
On 03/12/14 21:18, Christoffer Dall wrote: Introduce a new function to unmap user RAM regions in the stage2 page tables. This is needed on reboot (or when the guest turns off the MMU) to ensure we fault in pages again and make the dcache, RAM, and icache coherent. Using unmap_stage2_range for the whole guest physical range does not work, because that unmaps IO regions (such as the GIC) which will not be recreated or in the best case faulted in on a page-by-page basis. Call this function on secondary and subsequent calls to the KVM_ARM_VCPU_INIT ioctl so that a reset VCPU will detect the guest Stage-1 MMU is off when faulting in pages and make the caches coherent. Signed-off-by: Christoffer Dall christoffer.d...@linaro.org --- arch/arm/include/asm/kvm_mmu.h | 1 + arch/arm/kvm/arm.c | 7 + arch/arm/kvm/mmu.c | 65 arch/arm64/include/asm/kvm_mmu.h | 1 + 4 files changed, 74 insertions(+) diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h index acb0d57..4654c42 100644 --- a/arch/arm/include/asm/kvm_mmu.h +++ b/arch/arm/include/asm/kvm_mmu.h @@ -52,6 +52,7 @@ int create_hyp_io_mappings(void *from, void *to, phys_addr_t); void free_boot_hyp_pgd(void); void free_hyp_pgds(void); +void stage2_unmap_vm(struct kvm *kvm); int kvm_alloc_stage2_pgd(struct kvm *kvm); void kvm_free_stage2_pgd(struct kvm *kvm); int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index 4043769..da87c07 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -701,6 +701,13 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, if (ret) return ret; + /* + * Ensure a rebooted VM will fault in RAM pages and detect if the + * guest MMU is turned off and flush the caches as needed. 
+ */ + if (vcpu-arch.has_run_once) + stage2_unmap_vm(vcpu-kvm); + vcpu_reset_hcr(vcpu); /* diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c index 57a403a..b1f3c9a 100644 --- a/arch/arm/kvm/mmu.c +++ b/arch/arm/kvm/mmu.c @@ -611,6 +611,71 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size) unmap_range(kvm, kvm-arch.pgd, start, size); } +static void stage2_unmap_memslot(struct kvm *kvm, + struct kvm_memory_slot *memslot) +{ + hva_t hva = memslot-userspace_addr; + phys_addr_t addr = memslot-base_gfn PAGE_SHIFT; + phys_addr_t size = PAGE_SIZE * memslot-npages; + hva_t reg_end = hva + size; + + /* + * A memory region could potentially cover multiple VMAs, and any holes + * between them, so iterate over all of them to find out if we should + * unmap any of them. + * + * ++ + * +---++ ++ + * | : VMA 1 | VMA 2 | |VMA 3 :| + * +---++ ++ + * | memory region| + * ++ + */ + do { + struct vm_area_struct *vma = find_vma(current-mm, hva); + hva_t vm_start, vm_end; + + if (!vma || vma-vm_start = reg_end) + break; + + /* + * Take the intersection of this VMA with the memory region + */ + vm_start = max(hva, vma-vm_start); + vm_end = min(reg_end, vma-vm_end); + + if (!(vma-vm_flags VM_PFNMAP)) { + gpa_t gpa = addr + (vm_start - memslot-userspace_addr); + unmap_stage2_range(kvm, gpa, vm_end - vm_start); + } + hva = vm_end; + } while (hva reg_end); +} + +/** + * stage2_unmap_vm - Unmap Stage-2 RAM mappings + * @kvm: The struct kvm pointer + * + * Go through the memregions and unmap any reguler RAM + * backing memory already mapped to the VM. 
+ */ +void stage2_unmap_vm(struct kvm *kvm) +{ + struct kvm_memslots *slots; + struct kvm_memory_slot *memslot; + int idx; + + idx = srcu_read_lock(kvm-srcu); + spin_lock(kvm-mmu_lock); + + slots = kvm_memslots(kvm); + kvm_for_each_memslot(memslot, slots) + stage2_unmap_memslot(kvm, memslot); + + spin_unlock(kvm-mmu_lock); + srcu_read_unlock(kvm-srcu, idx); +} + /** * kvm_free_stage2_pgd - free all stage-2 tables * @kvm: The KVM struct pointer for the VM. diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index 0caf7a5..061fed7 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -83,6 +83,7 @@ int create_hyp_io_mappings(void *from, void
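The VMA walk in stage2_unmap_memslot() is essentially an interval-intersection loop. A minimal userspace sketch follows, assuming a sorted array of half-open ranges in place of the process address space. `struct model_vma`, `model_find_vma()` and `model_unmap_region()` are hypothetical names, and the function only counts the bytes that would be unmapped instead of touching any page tables.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a VMA: the half-open range [start, end). */
struct model_vma {
	uint64_t start;
	uint64_t end;
	int pfnmap;	/* non-zero models VM_PFNMAP (device/IO mapping) */
};

/*
 * Return the first range whose end lies above addr, or NULL.
 * The array must be sorted and non-overlapping.
 */
static const struct model_vma *model_find_vma(const struct model_vma *vmas,
					      size_t nr, uint64_t addr)
{
	for (size_t i = 0; i < nr; i++)
		if (vmas[i].end > addr)
			return &vmas[i];
	return NULL;
}

/* Count the bytes of [hva, hva + size) backed by non-PFNMAP ranges. */
static uint64_t model_unmap_region(const struct model_vma *vmas, size_t nr,
				   uint64_t hva, uint64_t size)
{
	uint64_t reg_end = hva + size;
	uint64_t unmapped = 0;

	do {
		const struct model_vma *vma = model_find_vma(vmas, nr, hva);
		uint64_t vm_start, vm_end;

		if (!vma || vma->start >= reg_end)
			break;

		/* Intersection of this range with the memory region. */
		vm_start = hva > vma->start ? hva : vma->start;
		vm_end = reg_end < vma->end ? reg_end : vma->end;

		if (!vma->pfnmap)
			unmapped += vm_end - vm_start;

		hva = vm_end;
	} while (hva < reg_end);

	return unmapped;
}
```

Holes between ranges are skipped automatically because the lookup returns the next range above the current address, and IO-style ranges are stepped over without being counted, which mirrors why the patch leaves regions such as the GIC mapped.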
Re: [PATCH v3 7/8] KVM: kvm-vfio: generic forwarding control
On 11/25/2014 08:00 PM, Alex Williamson wrote: On Tue, 2014-11-25 at 19:20 +0100, Eric Auger wrote: On 11/24/2014 09:56 PM, Alex Williamson wrote: On Sun, 2014-11-23 at 19:35 +0100, Eric Auger wrote: This patch introduces a new KVM_DEV_VFIO_DEVICE group. This is a new control channel which enables KVM to cooperate with viable VFIO devices. Functions are introduced to check the validity of a VFIO device file descriptor and to increment/decrement the ref counter of the VFIO device. The patch introduces 2 attributes for this new device group: KVM_DEV_VFIO_DEVICE_FORWARD_IRQ, KVM_DEV_VFIO_DEVICE_UNFORWARD_IRQ. Their purpose is, respectively, to turn a VFIO device IRQ into a forwarded IRQ and to unset the feature. The VFIO device stores a list of registered forwarded IRQs. The reference counter of the device is incremented each time a new IRQ is forwarded. The reference counter is decremented when the IRQ forwarding is unset. The forwarding programming is architecture specific, implemented in the kvm_arch_set_fwd_state function. Architecture specific implementation is enabled when __KVM_HAVE_ARCH_KVM_VFIO_FORWARD is set. When not set those functions are void. Signed-off-by: Eric Auger eric.au...@linaro.org --- v2 -> v3: - add API comments in kvm_host.h - improve the commit message - create a private kvm_vfio_fwd_irq struct - fwd_irq_action replaced by a bool and removal of VFIO_IRQ_CLEANUP. This latter action will be handled in vgic. - add a vfio_device handle argument to kvm_arch_set_fwd_state. The goal is to move platform specific stuff in architecture specific code. - kvm_arch_set_fwd_state renamed into kvm_arch_vfio_set_forward - increment the ref counter each time we do an IRQ forwarding and decrement this latter each time one IRQ forward is unset. Simplifies the whole ref counting.
- simplification of list handling: create, search, removal v1 - v2: - __KVM_HAVE_ARCH_KVM_VFIO renamed into __KVM_HAVE_ARCH_KVM_VFIO_FORWARD - original patch file separated into 2 parts: generic part moved in vfio.c and ARM specific part(kvm_arch_set_fwd_state) --- include/linux/kvm_host.h | 28 ++ virt/kvm/vfio.c | 249 ++- 2 files changed, 274 insertions(+), 3 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index ea53b04..0b9659d 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1076,6 +1076,15 @@ struct kvm_device_ops { unsigned long arg); }; +/* internal self-contained structure describing a forwarded IRQ */ +struct kvm_fwd_irq { + struct kvm *kvm; /* VM to inject the GSI into */ + struct vfio_device *vdev; /* vfio device the IRQ belongs to */ + __u32 index; /* VFIO device IRQ index */ + __u32 subindex; /* VFIO device IRQ subindex */ + __u32 gsi; /* gsi, ie. virtual IRQ number */ +}; + void kvm_device_get(struct kvm_device *dev); void kvm_device_put(struct kvm_device *dev); struct kvm_device *kvm_device_from_filp(struct file *filp); @@ -1085,6 +1094,25 @@ void kvm_unregister_device_ops(u32 type); extern struct kvm_device_ops kvm_mpic_ops; extern struct kvm_device_ops kvm_xics_ops; +#ifdef __KVM_HAVE_ARCH_KVM_VFIO_FORWARD +/** + * kvm_arch_vfio_set_forward - changes the forwarded state of an IRQ + * + * @fwd_irq: handle to the forwarded irq struct + * @forward: true means forwarded, false means not forwarded + * returns 0 on success, 0 on failure + */ +int kvm_arch_vfio_set_forward(struct kvm_fwd_irq *fwd_irq, +bool forward); We could add a struct device* to the args list or into struct kvm_fwd_irq so that arch code doesn't need to touch the vdev. arch code has no business dealing with references to the vfio_device. 
Hi Alex, Currently I can't put struct device* into the kvm_fwd_irq struct since I need to release the vfio_device with vfio_device_put_external_user(struct vfio_device *vdev), typically in kvm_vfio_clean_fwd_irq. So I need to store the pointers to the vfio_device somewhere. I see 2 solutions: change the prototype of vfio_device_put_external_user(struct vfio_device *vdev) and pass a struct device* (??), or change the prototype of kvm_arch_vfio_set_forward into kvm_arch_vfio_set_forward(struct kvm *kvm, struct device *dev, int index, [int subindex], int gsi, bool forward) or use index/start/count, but losing the benefit of having a self-contained internal struct. The latter is sort of what I was assuming. I think the interface between VFIO and KVM-VFIO is good, we just don't need to expose VFIO-isms out to the arch KVM code. KVM-VFIO should be the barrier layer. In that spirit, maybe it should be kvm_arch_set_forward() and the KVM-VFIO code should do the processing of index/subindex sort of like how Feng did for PCI devices. Hi Alex, In Feng's series, host irq is retrieved in the generic part while in mine it is retrieved in arch
Re: [PATCH v2 5/6] arm/arm64: KVM: Turn off vcpus on PSCI shutdown/reboot
On Mon, Dec 08, 2014 at 12:04:53PM +0000, Marc Zyngier wrote:
> On 03/12/14 21:18, Christoffer Dall wrote:
> > When a vcpu calls SYSTEM_OFF or SYSTEM_RESET with PSCI v0.2, the vcpus
> > should really be turned off for the VM adhering to the suggestions in
> > the PSCI spec, and it's the sane thing to do.
> >
> > Also, clarify the behavior and expectations for exits to user space with
> > the KVM_EXIT_SYSTEM_EVENT case.
> >
> > Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
> > ---
> >  Documentation/virtual/kvm/api.txt | 9 +
> >  arch/arm/kvm/psci.c               | 19 +++
> >  arch/arm64/include/asm/kvm_host.h | 1 +
> >  3 files changed, 29 insertions(+)
> >
> > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> > index 81f1b97..228f9cf 100644
> > --- a/Documentation/virtual/kvm/api.txt
> > +++ b/Documentation/virtual/kvm/api.txt
> > @@ -2957,6 +2957,15 @@ HVC instruction based PSCI call from the vcpu. The 'type' field describes
> >  the system-level event type. The 'flags' field describes architecture
> >  specific flags for the system-level event.
> >
> > +Valid values for 'type' are:
> > +  KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
> > +   VM. Userspace is not obliged to honour this, and if it does honour
> > +   this does not need to destroy the VM synchronously (ie it may call
> > +   KVM_RUN again before shutdown finally occurs).
> > +  KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
> > +   As with SHUTDOWN, userspace can choose to ignore the request, or
> > +   to schedule the reset to occur in the future and may call KVM_RUN again.
> > +
> >  		/* Fix the size of the union. */
> >  		char padding[256];
> >  	};
> > diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
> > index 09cf377..ae0bb91 100644
> > --- a/arch/arm/kvm/psci.c
> > +++ b/arch/arm/kvm/psci.c
> > @@ -15,6 +15,7 @@
> >   * along with this program. If not, see <http://www.gnu.org/licenses/>.
> >   */
> >
> > +#include <linux/preempt.h>
> >  #include <linux/kvm_host.h>
> >  #include <linux/wait.h>
> >
> > @@ -166,6 +167,24 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
> >
> >  static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type)
> >  {
> > +	int i;
> > +	struct kvm_vcpu *tmp;
> > +
> > +	/*
> > +	 * The KVM ABI specifies that a system event exit may call KVM_RUN
> > +	 * again and may perform shutdown/reboot at a later time than when the
> > +	 * actual request is made. Since we are implementing PSCI and a
> > +	 * caller of PSCI reboot and shutdown expects that the system shuts
> > +	 * down or reboots immediately, let's make sure that VCPUs are not run
> > +	 * after this call is handled and before the VCPUs have been
> > +	 * re-initialized.
> > +	 */
> > +	kvm_for_each_vcpu(i, tmp, vcpu->kvm)
> > +		tmp->arch.pause = true;
> > +	preempt_disable();
> > +	force_vm_exit(cpu_all_mask);
> > +	preempt_enable();
> > +
>
> I'm slightly uneasy about this force_vm_exit, as this is something that
> is directly triggered by the guest. I suppose it is almost impossible to
> find out which CPUs we're actually using...

Ah, you mean we should only IPI the CPUs that are actually running a
VCPU belonging to this VM?

I guess I could replace it with:

	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
		tmp->arch.pause = true;
		kvm_vcpu_kick(tmp);
	}

or a slightly more optimized half-open-coded-kvm_vcpu_kick:

	me = get_cpu();
	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
		tmp->arch.pause = true;
		if (tmp->cpu != me &&
		    (unsigned)tmp->cpu < nr_cpu_ids &&
		    cpu_online(tmp->cpu) &&
		    kvm_arch_vcpu_should_kick(tmp))
			smp_send_reschedule(tmp->cpu);
	}

which should save us waking up vcpu threads that are parked on
waitqueues. Not sure it's worth it, maybe it is for 100s of vcpu
systems?

Can we actually replace force_vm_exit() with the more optimized
open-coded version? That messes with VMID allocation so it really needs
a lot of testing though...

Preferences?

-Christoffer
Re: [PATCH v2 5/6] arm/arm64: KVM: Turn off vcpus on PSCI shutdown/reboot
On 08/12/14 12:58, Christoffer Dall wrote:
> On Mon, Dec 08, 2014 at 12:04:53PM +0000, Marc Zyngier wrote:
> > On 03/12/14 21:18, Christoffer Dall wrote:
> > > When a vcpu calls SYSTEM_OFF or SYSTEM_RESET with PSCI v0.2, the vcpus
> > > should really be turned off for the VM adhering to the suggestions in
> > > the PSCI spec, and it's the sane thing to do.
> > >
> > > Also, clarify the behavior and expectations for exits to user space with
> > > the KVM_EXIT_SYSTEM_EVENT case.
> > >
> > > Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
> > > ---
> > >  Documentation/virtual/kvm/api.txt | 9 +
> > >  arch/arm/kvm/psci.c               | 19 +++
> > >  arch/arm64/include/asm/kvm_host.h | 1 +
> > >  3 files changed, 29 insertions(+)
> > >
> > > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> > > index 81f1b97..228f9cf 100644
> > > --- a/Documentation/virtual/kvm/api.txt
> > > +++ b/Documentation/virtual/kvm/api.txt
> > > @@ -2957,6 +2957,15 @@ HVC instruction based PSCI call from the vcpu. The 'type' field describes
> > >  the system-level event type. The 'flags' field describes architecture
> > >  specific flags for the system-level event.
> > >
> > > +Valid values for 'type' are:
> > > +  KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
> > > +   VM. Userspace is not obliged to honour this, and if it does honour
> > > +   this does not need to destroy the VM synchronously (ie it may call
> > > +   KVM_RUN again before shutdown finally occurs).
> > > +  KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
> > > +   As with SHUTDOWN, userspace can choose to ignore the request, or
> > > +   to schedule the reset to occur in the future and may call KVM_RUN again.
> > > +
> > >  		/* Fix the size of the union. */
> > >  		char padding[256];
> > >  	};
> > > diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
> > > index 09cf377..ae0bb91 100644
> > > --- a/arch/arm/kvm/psci.c
> > > +++ b/arch/arm/kvm/psci.c
> > > @@ -15,6 +15,7 @@
> > >   * along with this program. If not, see <http://www.gnu.org/licenses/>.
> > >   */
> > >
> > > +#include <linux/preempt.h>
> > >  #include <linux/kvm_host.h>
> > >  #include <linux/wait.h>
> > >
> > > @@ -166,6 +167,24 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
> > >
> > >  static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type)
> > >  {
> > > +	int i;
> > > +	struct kvm_vcpu *tmp;
> > > +
> > > +	/*
> > > +	 * The KVM ABI specifies that a system event exit may call KVM_RUN
> > > +	 * again and may perform shutdown/reboot at a later time than when the
> > > +	 * actual request is made. Since we are implementing PSCI and a
> > > +	 * caller of PSCI reboot and shutdown expects that the system shuts
> > > +	 * down or reboots immediately, let's make sure that VCPUs are not run
> > > +	 * after this call is handled and before the VCPUs have been
> > > +	 * re-initialized.
> > > +	 */
> > > +	kvm_for_each_vcpu(i, tmp, vcpu->kvm)
> > > +		tmp->arch.pause = true;
> > > +	preempt_disable();
> > > +	force_vm_exit(cpu_all_mask);
> > > +	preempt_enable();
> > > +
> >
> > I'm slightly uneasy about this force_vm_exit, as this is something that
> > is directly triggered by the guest. I suppose it is almost impossible to
> > find out which CPUs we're actually using...
>
> Ah, you mean we should only IPI the CPUs that are actually running a
> VCPU belonging to this VM?
>
> I guess I could replace it with:
>
> 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> 		tmp->arch.pause = true;
> 		kvm_vcpu_kick(tmp);
> 	}

Ah, that's even simpler than I thought. Yeah, looks good to me.

> or a slightly more optimized half-open-coded-kvm_vcpu_kick:
>
> 	me = get_cpu();
> 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> 		tmp->arch.pause = true;
> 		if (tmp->cpu != me &&
> 		    (unsigned)tmp->cpu < nr_cpu_ids &&
> 		    cpu_online(tmp->cpu) &&
> 		    kvm_arch_vcpu_should_kick(tmp))
> 			smp_send_reschedule(tmp->cpu);
> 	}
>
> which should save us waking up vcpu threads that are parked on
> waitqueues. Not sure it's worth it, maybe it is for 100s of vcpu
> systems?

Probably not worth it at the moment.

> Can we actually replace force_vm_exit() with the more optimized
> open-coded version? That messes with VMID allocation so it really needs
> a lot of testing though...

VMID reallocation almost never occurs, and that's a system-wide event,
not triggered by a guest. I'd rather not mess with that just yet.

> Preferences?

I think your first version is very nice, provided that it doesn't
introduce any unforeseen regression.

Thanks,

M.
--
Jazz is not dead. It just smells funny...
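The approach the thread converges on (mark every vcpu of the VM as paused and kick only those vcpus, instead of forcing an exit on all host CPUs) can be sketched as a userspace model. `struct model_vcpu`, `model_kick()` and `model_pause_vm()` are hypothetical names, and the `in_guest` flag is an assumption standing in for whatever the real kick path checks before sending an IPI.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a vcpu, not the kernel's struct. */
struct model_vcpu {
	bool pause;	/* set: the vcpu must not re-enter the guest */
	bool in_guest;	/* set: the vcpu is currently executing guest code */
};

/*
 * Stand-in for a kick: force the vcpu out of the guest if it is in it.
 * Returns true when an exit actually had to be forced.
 */
static bool model_kick(struct model_vcpu *vcpu)
{
	if (!vcpu->in_guest)
		return false;
	vcpu->in_guest = false;
	return true;
}

/* Pause every vcpu of one VM; returns how many had to be forced out. */
static size_t model_pause_vm(struct model_vcpu *vcpus, size_t nr)
{
	size_t forced = 0;

	for (size_t i = 0; i < nr; i++) {
		/* Set the flag first so a kicked vcpu sees it on its way out. */
		vcpus[i].pause = true;
		if (model_kick(&vcpus[i]))
			forced++;
	}
	return forced;
}
```

The point of the model is the scope: only the vcpus of the VM that made the PSCI call are touched, so a guest-triggered shutdown does not disturb host CPUs running unrelated work.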
RE: [v2 00/25] Add VT-d Posted-Interrupts support
Ping...

Thanks,
Feng

> -----Original Message-----
> From: Wu, Feng
> Sent: Wednesday, December 03, 2014 3:39 PM
> To: t...@linutronix.de; mi...@redhat.com; h...@zytor.com; x...@kernel.org;
> g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org;
> j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com
> Cc: linux-ker...@vger.kernel.org; io...@lists.linux-foundation.org;
> kvm@vger.kernel.org; Wu, Feng
> Subject: [v2 00/25] Add VT-d Posted-Interrupts support
>
> VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
> With VT-d Posted-Interrupts enabled, external interrupts from
> direct-assigned devices can be delivered to guests without VMM
> intervention when the guest is running in non-root mode.
>
> You can find the VT-d Posted-Interrupts spec at the following URL:
> http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html
>
> v1->v2:
> * Use VFIO framework to enable this feature, the VFIO part of this series
>   is based on Eric's patch [PATCH v3 0/8] KVM-VFIO IRQ forward control
> * Rebase this patchset on
>   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git, then revise
>   some irq logic based on the new hierarchy irqdomain patches provided by
>   Jiang Liu jiang@linux.intel.com
>
> This patch series is made of the following groups:
> 1-6: Some preparation changes in iommu and irq component, this is based
>      on the new hierarchy irqdomain logic.
> 7-9, 25: IOMMU changes for VT-d Posted-Interrupts, such as feature
>      detection and the command line parameter.
> 10-16, 21-24: Changes related to KVM itself.
> 17-19: Changes in VFIO component, this part was previously sent out as
>      [RFC PATCH v2 0/2] kvm-vfio: implement the vfio skeleton for VT-d
>      Posted-Interrupts
> 20: x86 irq related changes
>
> Feng Wu (25):
>   genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU
>   iommu: Add new member capability to struct irq_remap_ops
>   iommu, x86: Define new irte structure for VT-d Posted-Interrupts
>   iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip
>   x86, irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller
>   iommu, x86: No need to migrating irq for VT-d Posted-Interrupts
>   iommu, x86: Add cap_pi_support() to detect VT-d PI capability
>   iommu, x86: Add intel_irq_remapping_capability() for Intel
>   iommu, x86: define irq_remapping_cap()
>   KVM: change struct pi_desc for VT-d Posted-Interrupts
>   KVM: Add some helper functions for Posted-Interrupts
>   KVM: Initialize VT-d Posted-Interrupts Descriptor
>   KVM: Define a new interface kvm_find_dest_vcpu() for VT-d PI
>   KVM: Get Posted-Interrupts descriptor address from struct kvm_vcpu
>   KVM: Make struct kvm_irq_routing_table accessible
>   KVM: make kvm_set_msi_irq() public
>   KVM: kvm-vfio: User API for VT-d Posted-Interrupts
>   KVM: kvm-vfio: implement the VFIO skeleton for VT-d Posted-Interrupts
>   KVM: x86: kvm-vfio: VT-d posted-interrupts setup
>   x86, irq: Define a global vector for VT-d Posted-Interrupts
>   KVM: Update Posted-Interrupts descriptor during vCPU scheduling
>   KVM: Change NDST field after vCPU scheduling
>   KVM: Add the handler for Wake-up Vector
>   KVM: Suppress posted-interrupt when 'SN' is set
>   iommu/vt-d: Add a command line parameter for VT-d posted-interrupts
>
>  Documentation/kernel-parameters.txt        |  1 +
>  Documentation/virtual/kvm/devices/vfio.txt |  9 +
>  arch/x86/include/asm/entry_arch.h          |  2 +
>  arch/x86/include/asm/hardirq.h             |  1 +
>  arch/x86/include/asm/hw_irq.h              |  2 +
>  arch/x86/include/asm/irq_remapping.h       | 11 ++
>  arch/x86/include/asm/irq_vectors.h         |  1 +
>  arch/x86/include/asm/kvm_host.h            | 14 ++
>  arch/x86/kernel/apic/msi.c                 |  1 +
>  arch/x86/kernel/entry_64.S                 |  2 +
arch/x86/kernel/irq.c | 27 +++ arch/x86/kernel/irqinit.c |2 + arch/x86/kvm/Makefile |2 +- arch/x86/kvm/kvm_vfio_x86.c| 68 arch/x86/kvm/vmx.c | 251 +++- arch/x86/kvm/x86.c | 38 - drivers/iommu/intel_irq_remapping.c| 64 +++ drivers/iommu/irq_remapping.c | 24 +++- drivers/iommu/irq_remapping.h |8 + include/linux/dmar.h | 32 include/linux/intel-iommu.h|1 + include/linux/irq.h|7 + include/linux/kvm_host.h | 43 + include/uapi/linux/kvm.h | 10 + kernel/irq/chip.c | 14 ++ kernel/irq/manage.c| 20 +++ virt/kvm/irq_comm.c| 43 +- virt/kvm/irqchip.c | 11 -- virt/kvm/kvm_main.c| 14 ++ virt/kvm/vfio.c| 103
[Bug 89221] Heavy workload bewteen two KVM guest using vxlan tunnel stuck
https://bugzilla.kernel.org/show_bug.cgi?id=89221

Alan a...@lxorguk.ukuu.org.uk changed:

What      |Removed                         |Added
CC        |                                |a...@lxorguk.ukuu.org.uk
Component |IPV4                            |kvm
Assignee  |shemminger@linux-foundation.org |virtualization_kvm@kernel-bugs.osdl.org
Product   |Networking                      |Virtualization

-- You are receiving this mail because: You are watching the assignee of the bug.
Re: [PATCH] x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit
On Fri, Dec 05, 2014 at 07:03:28PM -0800, Andy Lutomirski wrote: paravirt_enabled has the following effects: - Disables the F00F bug workaround warning. There is no F00F bug workaround any more because Linux's standard IDT handling already works around the F00F bug, but the warning still exists. This is only cosmetic, and, in any event, there is no such thing as KVM on a CPU with the F00F bug. - Disables 32-bit APM BIOS detection. On a KVM paravirt system, there should be no APM BIOS anyway. - Disables tboot. I think that the tboot code should check the CPUID hypervisor bit directly if it matters. - paravirt_enabled disables espfix32. espfix32 should *not* be disabled under KVM paravirt. The last point is the purpose of this patch. It fixes a leak of the high 16 bits of the kernel stack address on 32-bit KVM paravirt guests. While I'm at it, this removes pv_info setup from kvmclock. That code seems to serve no purpose. Cc: sta...@vger.kernel.org Signed-off-by: Andy Lutomirski l...@amacapital.net Suggested-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- arch/x86/kernel/kvm.c | 9 - arch/x86/kernel/kvmclock.c | 2 -- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index f6945bef2cd1..94f643484300 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -283,7 +283,14 @@ NOKPROBE_SYMBOL(do_async_page_fault); static void __init paravirt_ops_setup(void) { pv_info.name = KVM; - pv_info.paravirt_enabled = 1; + + /* + * KVM isn't paravirt in the sense of paravirt_enabled. A KVM + * guest kernel works like a bare metal kernel with additional + * features, and paravirt_enabled is about features that are + * missing. 
+ */ + pv_info.paravirt_enabled = 0; if (kvm_para_has_feature(KVM_FEATURE_NOP_IO_DELAY)) pv_cpu_ops.io_delay = kvm_io_delay; diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c index d9156ceecdff..d4d9a8ad7893 100644 --- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -263,8 +263,6 @@ void __init kvmclock_init(void) #endif kvm_get_preset_lpj(); clocksource_register_hz(kvm_clock, NSEC_PER_SEC); - pv_info.paravirt_enabled = 1; - pv_info.name = KVM; if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT)) pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT); -- 1.9.3 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
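As a rough illustration of the leak the commit message refers to (a sketch of the arithmetic only, not kernel code; the helper name and the sample addresses are made up for this note): when the kernel returns with IRET to a 16-bit stack segment, the CPU restores only the low 16 bits of ESP, so without espfix32 the upper half that user space observes is left over from the kernel stack pointer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: the ESP value user space
 * would observe after an IRET to a 16-bit stack segment when no espfix
 * workaround is active. Only the low 16 bits are reloaded; the high
 * 16 bits still come from the kernel stack pointer.
 */
static uint32_t esp_seen_by_user(uint32_t kernel_esp, uint16_t user_sp)
{
    return (kernel_esp & 0xffff0000u) | user_sp;
}
```

With espfix32 active the high half no longer reflects the real kernel stack address, which is why the patch keeps it enabled under KVM.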
Re: [PATCH] x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit
On Mon, Dec 8, 2014 at 7:45 AM, Konrad Rzeszutek Wilk konrad.w...@oracle.com wrote: On Fri, Dec 05, 2014 at 07:03:28PM -0800, Andy Lutomirski wrote: paravirt_enabled has the following effects: - Disables the F00F bug workaround warning. There is no F00F bug workaround any more because Linux's standard IDT handling already works around the F00F bug, but the warning still exists. This is only cosmetic, and, in any event, there is no such thing as KVM on a CPU with the F00F bug. - Disables 32-bit APM BIOS detection. On a KVM paravirt system, there should be no APM BIOS anyway. - Disables tboot. I think that the tboot code should check the CPUID hypervisor bit directly if it matters. - paravirt_enabled disables espfix32. espfix32 should *not* be disabled under KVM paravirt. The last point is the purpose of this patch. It fixes a leak of the high 16 bits of the kernel stack address on 32-bit KVM paravirt guests. While I'm at it, this removes pv_info setup from kvmclock. That code seems to serve no purpose. Cc: sta...@vger.kernel.org Signed-off-by: Andy Lutomirski l...@amacapital.net Suggested-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Sorry, meant to add that but forgot. Too many patches late last week :( --- arch/x86/kernel/kvm.c | 9 - arch/x86/kernel/kvmclock.c | 2 -- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index f6945bef2cd1..94f643484300 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -283,7 +283,14 @@ NOKPROBE_SYMBOL(do_async_page_fault); static void __init paravirt_ops_setup(void) { pv_info.name = KVM; - pv_info.paravirt_enabled = 1; + + /* + * KVM isn't paravirt in the sense of paravirt_enabled. A KVM + * guest kernel works like a bare metal kernel with additional + * features, and paravirt_enabled is about features that are + * missing. 
+ */ + pv_info.paravirt_enabled = 0; if (kvm_para_has_feature(KVM_FEATURE_NOP_IO_DELAY)) pv_cpu_ops.io_delay = kvm_io_delay; diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c index d9156ceecdff..d4d9a8ad7893 100644 --- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -263,8 +263,6 @@ void __init kvmclock_init(void) #endif kvm_get_preset_lpj(); clocksource_register_hz(kvm_clock, NSEC_PER_SEC); - pv_info.paravirt_enabled = 1; - pv_info.name = KVM; if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT)) pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT); -- 1.9.3 -- Andy Lutomirski AMA Capital Management, LLC -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 7/8] KVM: kvm-vfio: generic forwarding control
On Mon, 2014-12-08 at 13:22 +0100, Eric Auger wrote: On 11/25/2014 08:00 PM, Alex Williamson wrote: On Tue, 2014-11-25 at 19:20 +0100, Eric Auger wrote: On 11/24/2014 09:56 PM, Alex Williamson wrote: On Sun, 2014-11-23 at 19:35 +0100, Eric Auger wrote: This patch introduces a new KVM_DEV_VFIO_DEVICE group. This is a new control channel which enables KVM to cooperate with viable VFIO devices. Functions are introduced to check the validity of a VFIO device file descriptor, increment/decrement the ref counter of the VFIO device. The patch introduces 2 attributes for this new device group: KVM_DEV_VFIO_DEVICE_FORWARD_IRQ, KVM_DEV_VFIO_DEVICE_UNFORWARD_IRQ. Their purpose is to turn a VFIO device IRQ into a forwarded IRQ and unset respectively unset the feature. The VFIO device stores a list of registered forwarded IRQs. The reference counter of the device is incremented each time a new IRQ is forwarded. Reference counter is decremented when the IRQ forwarding is unset. The forwarding programmming is architecture specific, implemented in kvm_arch_set_fwd_state function. Architecture specific implementation is enabled when __KVM_HAVE_ARCH_KVM_VFIO_FORWARD is set. When not set those functions are void. Signed-off-by: Eric Auger eric.au...@linaro.org --- v2 - v3: - add API comments in kvm_host.h - improve the commit message - create a private kvm_vfio_fwd_irq struct - fwd_irq_action replaced by a bool and removal of VFIO_IRQ_CLEANUP. This latter action will be handled in vgic. - add a vfio_device handle argument to kvm_arch_set_fwd_state. The goal is to move platform specific stuff in architecture specific code. - kvm_arch_set_fwd_state renamed into kvm_arch_vfio_set_forward - increment the ref counter each time we do an IRQ forwarding and decrement this latter each time one IRQ forward is unset. Simplifies the whole ref counting. 
- simplification of list handling: create, search, removal v1 - v2: - __KVM_HAVE_ARCH_KVM_VFIO renamed into __KVM_HAVE_ARCH_KVM_VFIO_FORWARD - original patch file separated into 2 parts: generic part moved in vfio.c and ARM specific part(kvm_arch_set_fwd_state) --- include/linux/kvm_host.h | 28 ++ virt/kvm/vfio.c | 249 ++- 2 files changed, 274 insertions(+), 3 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index ea53b04..0b9659d 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1076,6 +1076,15 @@ struct kvm_device_ops { unsigned long arg); }; +/* internal self-contained structure describing a forwarded IRQ */ +struct kvm_fwd_irq { +struct kvm *kvm; /* VM to inject the GSI into */ +struct vfio_device *vdev; /* vfio device the IRQ belongs to */ +__u32 index; /* VFIO device IRQ index */ +__u32 subindex; /* VFIO device IRQ subindex */ +__u32 gsi; /* gsi, ie. virtual IRQ number */ +}; + void kvm_device_get(struct kvm_device *dev); void kvm_device_put(struct kvm_device *dev); struct kvm_device *kvm_device_from_filp(struct file *filp); @@ -1085,6 +1094,25 @@ void kvm_unregister_device_ops(u32 type); extern struct kvm_device_ops kvm_mpic_ops; extern struct kvm_device_ops kvm_xics_ops; +#ifdef __KVM_HAVE_ARCH_KVM_VFIO_FORWARD +/** + * kvm_arch_vfio_set_forward - changes the forwarded state of an IRQ + * + * @fwd_irq: handle to the forwarded irq struct + * @forward: true means forwarded, false means not forwarded + * returns 0 on success, 0 on failure + */ +int kvm_arch_vfio_set_forward(struct kvm_fwd_irq *fwd_irq, + bool forward); We could add a struct device* to the args list or into struct kvm_fwd_irq so that arch code doesn't need to touch the vdev. arch code has no business dealing with references to the vfio_device. 
Hi Alex, Currently I can't put struct device* into the kvm_fwd_irq struct since I need to release the vfio_device with vfio_device_put_external_user(struct vfio_device *vdev), typically in kvm_vfio_clean_fwd_irq. So I need to store the pointers to the vfio_device somewhere. I see 2 solutions: change the prototype of vfio_device_put_external_user(struct vfio_device *vdev) and pass a struct device* (??) or change the prototype of kvm_arch_vfio_set_forward into kvm_arch_vfio_set_forward(struct kvm *kvm, struct device *dev, int index, [int subindex], int gsi, bool forward) or using index/start/count but losing the benefit of having a self-contained internal struct. The latter is sort of what I was assuming. I think the interface between VFIO and KVM-VFIO is good, we just don't need to expose VFIO-isms out to the arch KVM code. KVM-VFIO should be the barrier layer. In that spirit, maybe it should be kvm_arch_set_forward() and
Re: [PATCH v3 7/8] KVM: kvm-vfio: generic forwarding control
On 12/08/2014 05:54 PM, Alex Williamson wrote: On Mon, 2014-12-08 at 13:22 +0100, Eric Auger wrote: On 11/25/2014 08:00 PM, Alex Williamson wrote: On Tue, 2014-11-25 at 19:20 +0100, Eric Auger wrote: On 11/24/2014 09:56 PM, Alex Williamson wrote: On Sun, 2014-11-23 at 19:35 +0100, Eric Auger wrote: This patch introduces a new KVM_DEV_VFIO_DEVICE group. This is a new control channel which enables KVM to cooperate with viable VFIO devices. Functions are introduced to check the validity of a VFIO device file descriptor, increment/decrement the ref counter of the VFIO device. The patch introduces 2 attributes for this new device group: KVM_DEV_VFIO_DEVICE_FORWARD_IRQ, KVM_DEV_VFIO_DEVICE_UNFORWARD_IRQ. Their purpose is to turn a VFIO device IRQ into a forwarded IRQ and unset respectively unset the feature. The VFIO device stores a list of registered forwarded IRQs. The reference counter of the device is incremented each time a new IRQ is forwarded. Reference counter is decremented when the IRQ forwarding is unset. The forwarding programmming is architecture specific, implemented in kvm_arch_set_fwd_state function. Architecture specific implementation is enabled when __KVM_HAVE_ARCH_KVM_VFIO_FORWARD is set. When not set those functions are void. Signed-off-by: Eric Auger eric.au...@linaro.org --- v2 - v3: - add API comments in kvm_host.h - improve the commit message - create a private kvm_vfio_fwd_irq struct - fwd_irq_action replaced by a bool and removal of VFIO_IRQ_CLEANUP. This latter action will be handled in vgic. - add a vfio_device handle argument to kvm_arch_set_fwd_state. The goal is to move platform specific stuff in architecture specific code. - kvm_arch_set_fwd_state renamed into kvm_arch_vfio_set_forward - increment the ref counter each time we do an IRQ forwarding and decrement this latter each time one IRQ forward is unset. Simplifies the whole ref counting. 
- simplification of list handling: create, search, removal v1 - v2: - __KVM_HAVE_ARCH_KVM_VFIO renamed into __KVM_HAVE_ARCH_KVM_VFIO_FORWARD - original patch file separated into 2 parts: generic part moved in vfio.c and ARM specific part(kvm_arch_set_fwd_state) --- include/linux/kvm_host.h | 28 ++ virt/kvm/vfio.c | 249 ++- 2 files changed, 274 insertions(+), 3 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index ea53b04..0b9659d 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1076,6 +1076,15 @@ struct kvm_device_ops { unsigned long arg); }; +/* internal self-contained structure describing a forwarded IRQ */ +struct kvm_fwd_irq { +struct kvm *kvm; /* VM to inject the GSI into */ +struct vfio_device *vdev; /* vfio device the IRQ belongs to */ +__u32 index; /* VFIO device IRQ index */ +__u32 subindex; /* VFIO device IRQ subindex */ +__u32 gsi; /* gsi, ie. virtual IRQ number */ +}; + void kvm_device_get(struct kvm_device *dev); void kvm_device_put(struct kvm_device *dev); struct kvm_device *kvm_device_from_filp(struct file *filp); @@ -1085,6 +1094,25 @@ void kvm_unregister_device_ops(u32 type); extern struct kvm_device_ops kvm_mpic_ops; extern struct kvm_device_ops kvm_xics_ops; +#ifdef __KVM_HAVE_ARCH_KVM_VFIO_FORWARD +/** + * kvm_arch_vfio_set_forward - changes the forwarded state of an IRQ + * + * @fwd_irq: handle to the forwarded irq struct + * @forward: true means forwarded, false means not forwarded + * returns 0 on success, 0 on failure + */ +int kvm_arch_vfio_set_forward(struct kvm_fwd_irq *fwd_irq, + bool forward); We could add a struct device* to the args list or into struct kvm_fwd_irq so that arch code doesn't need to touch the vdev. arch code has no business dealing with references to the vfio_device. 
Hi Alex, Currently I can't put struct device* into the kvm_fwd_irq struct since I need to release the vfio_device with vfio_device_put_external_user(struct vfio_device *vdev), typically in kvm_vfio_clean_fwd_irq. So I need to store the pointers to the vfio_device somewhere. I see 2 solutions: change the prototype of vfio_device_put_external_user(struct vfio_device *vdev) and pass a struct device* (??) or change the prototype of kvm_arch_vfio_set_forward into kvm_arch_vfio_set_forward(struct kvm *kvm, struct device *dev, int index, [int subindex], int gsi, bool forward) or using index/start/count but losing the benefit of having a self-contained internal struct. The latter is sort of what I was assuming. I think the interface between VFIO and KVM-VFIO is good, we just don't need to expose VFIO-isms out to the arch KVM code. KVM-VFIO should be the barrier layer. In that spirit, maybe it should be kvm_arch_set_forward() and the KVM-VFIO code should do the processing of index/subindex sort of
Re: usb audio device troubles
On 12/08/2014 05:43 AM, Hans de Goede wrote: Hi Eric, On 03-12-14 16:39, Eric S. Johansson wrote: On 12/3/2014 3:52 AM, Hans de Goede wrote: Eric are you using usb-host redirection, or Spice's usb network redir ? I spent a little bit of time this morning learning about spice and the network redirection. It worked for about half an hour and then failed in the same way the host redirection failed. The audio device would appear for a while, I would try to use it and then it would disappear. The spice model has some very nice features in that I could, in theory, have a working speech recognition engine somewhere on my "cloud" and then be able to use it via spice on any desktop I happen to be located in front of. It would also work nicely with my original idea of putting a working KVM virtual machine on an e-sata SSD external drive and being able to bring my working speech recognition environment with me without having to cart a laptop. I hope you can see that this could be generalized into a nicely portable accessibility solution where the accessibility environment moves with the disabled user and removes the need to make every machine have user specific accessibility software and configuration. Yes, it does impose a requirement that KVM runs everywhere, but we know that's the future anyway so why fight it :-) Anyway, I think if we can solve this USB audio device problem then I'll be very happy and can make further progress towards my goal. Thank you so very much for the help so far and I hope we can fix this USB problem. To further figure out what is going on when the usb device disconnects we will need some logs. For starters let's look at the spice-client side, before starting virt-manager or virt-viewer do the following in the terminal: export LIBUSB_DEBUG=4 And then start the application from the terminal like e.g. 
this: virt-manager virt-man.log You'll need to use virt-manager --no-fork or virt-manager --debug to actually see any output - Cole
[PATCH v3 2/2] x86, arm64, platform, xen, kconfig: add xen defconfig helper
From: Luis R. Rodriguez mcg...@suse.com This lets you build a kernel which can support xen dom0 or xen guests by just using: make xenconfig on both x86 and arm64 kernels. This also splits out the options which are available currently to be built with x86 and 'make ARCH=arm64' under a shared config. Technically xen supports a dom0 kernel and also a guest kernel configuration but upon review with the xen team since we don't have many dom0 options its best to just combine these two into one. Cc: Josh Triplett j...@joshtriplett.org Cc: Borislav Petkov b...@suse.de Cc: Pekka Enberg penb...@kernel.org Cc: David Rientjes rient...@google.com Cc: Michal Marek mma...@suse.cz Cc: Randy Dunlap rdun...@infradead.org Cc: penb...@kernel.org Cc: levinsasha...@gmail.com Cc: mtosa...@redhat.com Cc: fengguang...@intel.com Cc: David Vrabel david.vra...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: xen-de...@lists.xenproject.org Signed-off-by: Luis R. 
Rodriguez mcg...@suse.com --- arch/x86/configs/xen.config | 6 ++ kernel/configs/xen.config | 32 scripts/kconfig/Makefile| 5 + 3 files changed, 43 insertions(+) create mode 100644 arch/x86/configs/xen.config create mode 100644 kernel/configs/xen.config diff --git a/arch/x86/configs/xen.config b/arch/x86/configs/xen.config new file mode 100644 index 000..b97e893 --- /dev/null +++ b/arch/x86/configs/xen.config @@ -0,0 +1,6 @@ +# x86 xen specific config options +CONFIG_XEN_PVHVM=y +CONFIG_XEN_MAX_DOMAIN_MEMORY=500 +CONFIG_XEN_SAVE_RESTORE=y +# CONFIG_XEN_DEBUG_FS is not set +CONFIG_XEN_PVH=y diff --git a/kernel/configs/xen.config b/kernel/configs/xen.config new file mode 100644 index 000..0d0eb6d --- /dev/null +++ b/kernel/configs/xen.config @@ -0,0 +1,32 @@ +# generic config +CONFIG_XEN=y +CONFIG_XEN_DOM0=y +CONFIG_PCI_XEN=y +CONFIG_XEN_PCIDEV_FRONTEND=m +CONFIG_XEN_BLKDEV_FRONTEND=m +CONFIG_XEN_BLKDEV_BACKEND=m +CONFIG_XEN_NETDEV_FRONTEND=m +CONFIG_XEN_NETDEV_BACKEND=m +CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y +CONFIG_HVC_XEN=y +CONFIG_HVC_XEN_FRONTEND=y +CONFIG_TCG_XEN=m +CONFIG_XEN_WDT=m +CONFIG_XEN_FBDEV_FRONTEND=y +CONFIG_XEN_BALLOON=y +CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y +CONFIG_XEN_SCRUB_PAGES=y +CONFIG_XEN_DEV_EVTCHN=m +CONFIG_XEN_BACKEND=y +CONFIG_XENFS=m +CONFIG_XEN_COMPAT_XENFS=y +CONFIG_XEN_SYS_HYPERVISOR=y +CONFIG_XEN_XENBUS_FRONTEND=y +CONFIG_XEN_GNTDEV=m +CONFIG_XEN_GRANT_DEV_ALLOC=m +CONFIG_SWIOTLB_XEN=y +CONFIG_XEN_PCIDEV_BACKEND=m +CONFIG_XEN_PRIVCMD=m +CONFIG_XEN_ACPI_PROCESSOR=m +CONFIG_XEN_MCE_LOG=y +CONFIG_XEN_HAVE_PVMMU=y diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile index ff612b0..f4a8f89 100644 --- a/scripts/kconfig/Makefile +++ b/scripts/kconfig/Makefile @@ -117,6 +117,10 @@ PHONY += kvmconfig kvmconfig: $(call mergeconfig,kvm_guest) +PHONY += xenconfig +xenconfig: + $(call mergeconfig,xen) + PHONY += tinyconfig tinyconfig: allnoconfig $(call mergeconfig,tiny) @@ -142,6 +146,7 @@ help: @echo ' listnewconfig - List new options' 
@echo ' olddefconfig- Same as silentoldconfig but sets new symbols to their default value' @echo ' kvmconfig - Enable additional options for kvm guest kernel support' + @echo ' xenconfig - Enable additional options for xen dom0 and guest kernel support' @echo ' tinyconfig - Configure the tiniest possible kernel' # lxdialog stuff -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 1/2] x86, platform, kconfig: clarify kvmconfig is for kvm
From: Luis R. Rodriguez mcg...@suse.com We'll be adding options for xen as well. Cc: Josh Triplett j...@joshtriplett.org Cc: Borislav Petkov b...@suse.de Cc: Pekka Enberg penb...@kernel.org Cc: David Rientjes rient...@google.com Cc: Michal Marek mma...@suse.cz Cc: Randy Dunlap rdun...@infradead.org Cc: penb...@kernel.org Cc: levinsasha...@gmail.com Cc: mtosa...@redhat.com Cc: fengguang...@intel.com Cc: David Vrabel david.vra...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: xen-de...@lists.xenproject.org Acked-by: David Rientjes rient...@google.com Signed-off-by: Luis R. Rodriguez mcg...@suse.com --- scripts/kconfig/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile index 9645c07..ff612b0 100644 --- a/scripts/kconfig/Makefile +++ b/scripts/kconfig/Makefile @@ -141,7 +141,7 @@ help: @echo ' randconfig - New config with random answer to all options' @echo ' listnewconfig - List new options' @echo ' olddefconfig- Same as silentoldconfig but sets new symbols to their default value' - @echo ' kvmconfig - Enable additional options for guest kernel support' + @echo ' kvmconfig - Enable additional options for kvm guest kernel support' @echo ' tinyconfig - Configure the tiniest possible kernel' # lxdialog stuff -- 2.1.1
[PATCH v3 0/2]: x86/arm64: add xenconfig
From: Luis R. Rodriguez mcg...@suse.com This is based on some old set I had lying around. The virtconfig changes I had proposed a while ago got merged and reused for tinyconfig, this adapts my original set to use the new mergeconfig. Not sure whose tree this should go through, last time these were lost in space and only the non-xen things got cherry picked later, whose tree should this go through? Luis R. Rodriguez (2): x86, platform, xen, kconfig: clarify kvmconfig is for kvm x86, arm, platform, xen, kconfig: add xen defconfig helper arch/x86/configs/xen.config | 6 ++ kernel/configs/xen.config | 32 scripts/kconfig/Makefile| 7 ++- 3 files changed, 44 insertions(+), 1 deletion(-) create mode 100644 arch/x86/configs/xen.config create mode 100644 kernel/configs/xen.config -- 2.1.1
Re: [PATCH v14 3/7] KVM: x86: switch to kvm_get_dirty_log_protect
Hi Paolo, I took a closer look at Christoffer's comment: the _log description in x86.c is a repeat of the _protect description in kvm_main.c. I'm wondering if the description below would be acceptable, or perhaps you had a reason for leaving it as is. For the ARM variant I would use the same wording. Please advise. Thanks.

/**
 * kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
 * @kvm: kvm instance
 * @log: slot id and address to which we copy the log
 *
 * Steps 1-4 below provide general overview of dirty page logging. See
 * kvm_get_dirty_log_protect() function description for additional details.
 *
 * We call kvm_get_dirty_log_protect() to handle steps 1-3, upon return we
 * always flush the TLB (step 4) even if 'protect' failed and dirty bitmap
 * may be corrupt. Regardless of previous outcome KVM logging API does not
 * preclude user space subsequent dirty log read. Flushing TLB ensures writes
 * will be marked dirty for next log read.
 *
 * 1. Take a snapshot of the bit and clear it if needed.
 * 2. Write protect the corresponding page.
 * 3. Copy the snapshot to the userspace.
 * 4. Flush TLB's if needed.
 */

On 11/22/2014 11:19 AM, Christoffer Dall wrote: On Thu, Nov 13, 2014 at 05:57:44PM -0800, Mario Smarduch wrote: From: Paolo Bonzini pbonz...@redhat.com We now have a generic function that does most of the work of kvm_vm_ioctl_get_dirty_log, now use it. 
Signed-off-by: Mario Smarduch m.smard...@samsung.com --- arch/x86/include/asm/kvm_host.h |3 -- arch/x86/kvm/Kconfig|1 + arch/x86/kvm/mmu.c |4 +-- arch/x86/kvm/x86.c | 64 ++- 4 files changed, 12 insertions(+), 60 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 7c492ed..934dc24 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -805,9 +805,6 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask, void kvm_mmu_reset_context(struct kvm_vcpu *vcpu); void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot); -void kvm_mmu_write_protect_pt_masked(struct kvm *kvm, - struct kvm_memory_slot *slot, - gfn_t gfn_offset, unsigned long mask); void kvm_mmu_zap_all(struct kvm *kvm); void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm); unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm); diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index f9d16ff..d073594 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -39,6 +39,7 @@ config KVM select PERF_EVENTS select HAVE_KVM_MSI select HAVE_KVM_CPU_RELAX_INTERCEPT +select KVM_GENERIC_DIRTYLOG_READ_PROTECT select KVM_VFIO ---help--- Support hosting fully virtualized guest machines using hardware diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 9314678..bf6b82c 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -1224,7 +1224,7 @@ static bool __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp, } /** - * kvm_mmu_write_protect_pt_masked - write protect selected PT level pages + * kvm_arch_mmu_write_protect_pt_masked - write protect selected PT level pages * @kvm: kvm instance * @slot: slot to protect * @gfn_offset: start of the BITS_PER_LONG pages we care about @@ -1233,7 +1233,7 @@ static bool __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp, * Used when we do not need to care about huge page mappings: e.g. during dirty * logging we do not have any such mappings. 
*/ -void kvm_mmu_write_protect_pt_masked(struct kvm *kvm, +void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t gfn_offset, unsigned long mask) { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 8f1e22d..9f8ae9a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3606,77 +3606,31 @@ static int kvm_vm_ioctl_reinject(struct kvm *kvm, * * 1. Take a snapshot of the bit and clear it if needed. * 2. Write protect the corresponding page. - * 3. Flush TLB's if needed. - * 4. Copy the snapshot to the userspace. + * 3. Copy the snapshot to the userspace. + * 4. Flush TLB's if needed. * - * Between 2 and 3, the guest may write to the page using the remaining TLB - * entry. This is not a problem because the page will be reported dirty at - * step 4 using the snapshot taken before and step 3 ensures that successive - * writes will be logged for the next call. + * Between 2 and 4, the guest may write to the page using the remaining TLB + * entry. This is not a problem because the page is reported dirty using + * the snapshot taken before and step 4 ensures that writes done after + * exiting to userspace will be logged for the next call.
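The four steps in the comment above can be sketched in user-space C roughly as follows. This is a minimal illustration under stated assumptions, not the kernel implementation: get_dirty_log_sketch(), write_protect_masked() and flush_tlbs() are hypothetical names, the two hooks only count calls, and a plain array stands in for the buffer copied to user space.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the arch hooks; here they only count calls. */
static int protect_calls;
static int flush_calls;

static void write_protect_masked(size_t word, unsigned long mask)
{
    (void)word;
    (void)mask;
    protect_calls++;
}

static void flush_tlbs(void)
{
    flush_calls++;
}

/*
 * Sketch of the ordering discussed in the thread: snapshot and clear,
 * write protect, hand the snapshot to the caller, and flush last.
 */
static void get_dirty_log_sketch(atomic_ulong *dirty, unsigned long *out,
                                 size_t nwords)
{
    int need_flush = 0;

    for (size_t i = 0; i < nwords; i++) {
        /* Step 1: take the snapshot and clear the word atomically. */
        unsigned long mask = atomic_exchange(&dirty[i], 0UL);

        /* Step 3 (simplified): the snapshot goes to the caller's buffer. */
        out[i] = mask;

        if (mask) {
            /* Step 2: write protect the pages that were dirty. */
            write_protect_masked(i, mask);
            need_flush = 1;
        }
    }

    /* Step 4: flush once at the end, only if something was protected. */
    if (need_flush)
        flush_tlbs();
}
```

A guest write that lands between step 2 and step 4 through a stale TLB entry is already covered by the snapshot, and the final flush makes sure later writes fault and get logged for the next call.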
Re: [PATCH v3 0/2]: x86/arm64: add xenconfig
On Mon, Dec 08, 2014 at 03:04:58PM -0800, Luis R. Rodriguez wrote: From: Luis R. Rodriguez mcg...@suse.com This is based on some old set I had lying around. The virtconfig changes I had proposed a while ago got merged and reused for tinyconfig, this adapts my original set to use the new mergeconfig. Not sure whose tree this should go through, last time these were lost in space and only the non-xen things got cherry picked later, whose tree should this go through? Luis R. Rodriguez (2): x86, platform, xen, kconfig: clarify kvmconfig is for kvm x86, arm, platform, xen, kconfig: add xen defconfig helper For both: Reviewed-by: Josh Triplett j...@joshtriplett.org arch/x86/configs/xen.config | 6 ++ kernel/configs/xen.config | 32 scripts/kconfig/Makefile| 7 ++- 3 files changed, 44 insertions(+), 1 deletion(-) create mode 100644 arch/x86/configs/xen.config create mode 100644 kernel/configs/xen.config -- 2.1.1
Re: [PATCH RFC] KVM: x86: nested: support for MSR loading/storing
> Thus it would be good to have kvm-unit-tests for all what is checked
> here.

I'll try to implement unit tests a bit later. The fixed patch, based on
the comments and criticism from this thread, follows.

> Indeed. Better have function that accepts the field index and that has
> some translation table to derive the name for printing the message.

Is having a translation table for such a simple case overkill? Maybe
printing the VMCS field indexes would be enough?

> Is white-listing a safe approach for this? Wouldn't it be better to
> save/restore only a known set of MSRs, possibly adding an option to
> ignore others (with a warning) instead of returning an error (similar
> to what we do when L1 tries to write to an unknown MSR)?

MSRs are written via the kvm_set_msr() path, and the MSR set is
restricted there. If a write to an ignored or unhandled MSR is attempted,
the whole MSR loading transaction fails. This check
(vmx_msr_switch_is_protected_msr) adds a stricter check on
microcode-related MSRs (these model-specific checks are specified in SDM
26.4, 27.4 and 35). kvm_set_msr() simply ignores write attempts to these
MSRs (MSR_IA32_UCODE_WRITE, MSR_IA32_UCODE_REV) and returns 0 (success).
Instead we should fail here, according to the SDM.

> Same comment as above.

kvm_{get/set}_msr() are safe, aren't they?

>> +	msr = nested_vmx_load_msrs(vcpu, vmcs12->vm_entry_msr_load_count,
>> +				   vmcs12->vm_entry_msr_load_addr);
>> +	if (msr)
>> +		nested_vmx_entry_failure(vcpu, vmcs12,
>> +				EXIT_REASON_MSR_LOAD_FAILURE, msr);
>
> Don't you have to terminate the nested run attempt here as well -
> return 1?

Fixed.

--
Eugene
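To make the stricter check discussed in this reply concrete, here is a
rough, self-contained C sketch. It is an assumption-laden illustration, not
the code from the patch: msr_switch_is_protected() and load_msrs() are
hypothetical names, only the two microcode MSRs named in the mail are
treated as protected, and no MSR is actually written. It mirrors the
convention visible in the quoted hunk, where a non-zero return value
identifies the failing entry and is passed on with
EXIT_REASON_MSR_LOAD_FAILURE; the 1-based numbering is a choice made for
this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_UCODE_WRITE 0x00000079u
#define MSR_IA32_UCODE_REV   0x0000008bu

/* Layout of one 16-byte entry in an MSR load/store area. */
struct msr_entry {
	uint32_t index;
	uint32_t reserved;
	uint64_t value;
};

/* Hypothetical helper: MSRs that must make the whole load fail. */
static bool msr_switch_is_protected(uint32_t index)
{
	switch (index) {
	case MSR_IA32_UCODE_WRITE:
	case MSR_IA32_UCODE_REV:
		return true;
	default:
		return false;
	}
}

/*
 * Walk the load area. Returns 0 on success, or the 1-based number of the
 * first entry that is rejected, so the caller can report it.
 */
static unsigned int load_msrs(const struct msr_entry *e, unsigned int count)
{
	unsigned int i;

	for (i = 0; i < count; i++) {
		if (e[i].reserved != 0)
			return i + 1;
		if (msr_switch_is_protected(e[i].index))
			return i + 1;
		/* A real implementation would write e[i].value to the MSR here. */
	}
	return 0;
}
```

The design point is that a silently ignored write (return 0) is not good
enough for the switch area: the entry has to be rejected explicitly so the
nested entry can fail.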
[PATCH] KVM: x86: nVMX: support for MSR loading/storing
Several hypervisors use MSR loading/storing to run guests.
This patch implements emulation of this feature and allows these
hypervisors to work in L1.

The following is emulated:
- Loading MSRs on VM-entries
- Saving MSRs on VM-exits
- Loading MSRs on VM-exits

Actions taken on loading MSRs:
- MSR load area is verified
- For each MSR entry from load area:
    - MSR load entry is read and verified
    - MSR value is safely written
Actions taken on storing MSRs:
- MSR store area is verified
- For each MSR entry from store area:
    - MSR entry is read and verified
    - MSR value is safely read using MSR index from MSR entry
    - MSR value is written to MSR entry

The code performs checks required by Intel Software Developer Manual.

This patch is partially based on Wincy Wan's work.

Signed-off-by: Eugene Korenevsky ekorenev...@gmail.com
---
 arch/x86/include/asm/vmx.h            |   6 +
 arch/x86/include/uapi/asm/msr-index.h |   3 +
 arch/x86/include/uapi/asm/vmx.h       |   2 +
 arch/x86/kvm/vmx.c                    | 210 --
 virt/kvm/kvm_main.c                   |   1 +
 5 files changed, 215 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 45afaee..8bdb247 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -457,6 +457,12 @@ struct vmx_msr_entry {
 #define ENTRY_FAIL_VMCS_LINK_PTR	4
 
 /*
+ * VMX Abort codes
+ */
+#define VMX_ABORT_MSR_STORE_FAILURE	1
+#define VMX_ABORT_MSR_LOAD_FAILURE	4
+
+/*
  * VM-instruction error numbers
  */
 enum vm_instruction_error_number {
diff --git a/arch/x86/include/uapi/asm/msr-index.h b/arch/x86/include/uapi/asm/msr-index.h
index e21331c..3c9c601 100644
--- a/arch/x86/include/uapi/asm/msr-index.h
+++ b/arch/x86/include/uapi/asm/msr-index.h
@@ -316,6 +316,9 @@
 #define MSR_IA32_UCODE_WRITE		0x00000079
 #define MSR_IA32_UCODE_REV		0x0000008b
 
+#define MSR_IA32_SMM_MONITOR_CTL	0x0000009b
+#define MSR_IA32_SMBASE			0x0000009e
+
 #define MSR_IA32_PERF_STATUS		0x00000198
 #define MSR_IA32_PERF_CTL		0x00000199
 #define MSR_AMD_PSTATE_DEF_BASE		0xc0010064
diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h
index b813bf9..52ad8e2 100644
--- a/arch/x86/include/uapi/asm/vmx.h
+++ b/arch/x86/include/uapi/asm/vmx.h
@@ -56,6 +56,7 @@
 #define EXIT_REASON_MSR_READ            31
 #define EXIT_REASON_MSR_WRITE           32
 #define EXIT_REASON_INVALID_STATE       33
+#define EXIT_REASON_MSR_LOAD_FAILURE    34
 #define EXIT_REASON_MWAIT_INSTRUCTION   36
 #define EXIT_REASON_MONITOR_INSTRUCTION 39
 #define EXIT_REASON_PAUSE_INSTRUCTION   40
@@ -116,6 +117,7 @@
 	{ EXIT_REASON_APIC_WRITE,            "APIC_WRITE" }, \
 	{ EXIT_REASON_EOI_INDUCED,           "EOI_INDUCED" }, \
 	{ EXIT_REASON_INVALID_STATE,         "INVALID_STATE" }, \
+	{ EXIT_REASON_MSR_LOAD_FAILURE,      "MSR_LOAD_FAILURE" }, \
 	{ EXIT_REASON_INVD,                  "INVD" }, \
 	{ EXIT_REASON_INVVPID,               "INVVPID" }, \
 	{ EXIT_REASON_INVPCID,               "INVPCID" }, \
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9bcc871..86dc7db 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8571,6 +8571,168 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 	kvm_register_write(vcpu, VCPU_REGS_RIP, vmcs12->guest_rip);
 }
 
+static bool vmx_msr_switch_area_verify(struct kvm_vcpu *vcpu,
+				       unsigned long count_field,
+				       unsigned long addr_field,
+				       int maxphyaddr)
+{
+	u64 count, addr;
+
+	BUG_ON(vmcs12_read_any(vcpu, count_field, &count));
+	BUG_ON(vmcs12_read_any(vcpu, addr_field, &addr));
+	if (!IS_ALIGNED(addr, 16))
+		goto fail;
+	if (addr >> maxphyaddr)
+		goto fail;
+	if ((addr + count * sizeof(struct vmx_msr_entry) - 1) >> maxphyaddr)
+		goto fail;
+	return true;
+fail:
+	pr_warn_ratelimited(
+		"nVMX: invalid MSR switch (0x%lx, 0x%lx, %d, %llu, 0x%08llx)",
+		count_field, addr_field, maxphyaddr, count, addr);
+	return false;
+}
+
+static bool nested_vmx_msr_switch_verify(struct kvm_vcpu *vcpu,
+					 struct vmcs12 *vmcs12)
+{
+	int maxphyaddr;
+
+	if (vmcs12->vm_exit_msr_load_count == 0 &&
+	    vmcs12->vm_exit_msr_store_count == 0 &&
+	    vmcs12->vm_entry_msr_load_count == 0)
+		return true; /* Fast path */
+	maxphyaddr = cpuid_maxphyaddr(vcpu);
+	return vmx_msr_switch_area_verify(vcpu, VM_EXIT_MSR_LOAD_COUNT,
+					  VM_EXIT_MSR_LOAD_ADDR, maxphyaddr) &&
+	       vmx_msr_switch_area_verify(vcpu, VM_EXIT_MSR_STORE_COUNT,
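The area verification in the hunk above boils down to two properties: the
area address is 16-byte aligned, and both the first and the last byte of
the area fit within the guest's physical address width. Below is a minimal
user-space sketch of the same arithmetic, under stated assumptions:
msr_area_ok() and MSR_ENTRY_SIZE are hypothetical names, the 16-byte entry
size is taken from the layout of struct vmx_msr_entry, and, unlike the
patch, an empty area (count == 0) is accepted up front. Overflow of the
64-bit sum is not handled here.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_ENTRY_SIZE 16u /* one u32 index, one u32 reserved, one u64 value */

/*
 * Hypothetical check of an MSR switch area: addr is the guest-physical
 * address of the area, count the number of entries, maxphyaddr the
 * physical address width in bits (assumed to be less than 64).
 */
static bool msr_area_ok(uint64_t addr, uint64_t count, unsigned int maxphyaddr)
{
	uint64_t last;

	if (count == 0)
		return true;          /* assumption: nothing to check */
	if (addr & (MSR_ENTRY_SIZE - 1))
		return false;         /* not 16-byte aligned */
	if (addr >> maxphyaddr)
		return false;         /* first byte beyond the address width */
	last = addr + count * MSR_ENTRY_SIZE - 1;
	if (last >> maxphyaddr)
		return false;         /* last byte beyond the address width */
	return true;
}
```

For example, with a 36-bit address width an area of one entry ending
exactly at the top of the address space passes, while two entries starting
at the same address do not.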
Re: [Qemu-devel] [RFC V2 10/10] cpus: reclaim allocated vCPU objects
+cc Gleb, KVM guys,

On 12/09/2014 12:38 AM, Peter Maydell wrote:
> On 8 December 2014 at 15:38, Igor Mammedov imamm...@redhat.com wrote:
>> On Mon, 8 Dec 2014 10:50:21 +0000
>> Peter Maydell peter.mayd...@linaro.org wrote:
>>> Why can't the kernel handle our just destroying the vcpu and later
>>> recreating it if necessary? That seems the more logical approach
>>> than trying to keep fds hanging around in userspace for reuse.
>> It's somewhat complex approach and it was suggested on KVM list to go
>> parking route. for more details see thread
>> https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
>
> If the kernel can't cope with userspace creating and destroying
> vCPUs dynamically then that seems like a kernel bug to me.

Yes, it's a flaw.

> It seems better to me to fix that directly rather than make non-x86
> architectures change things around to help with working around
> that bug...

Agree. But as we discussed before:

CPU array is accessed locklessly in a lot of places, so it will have to
be RCUified. There was an attempt to do so 2 years or so ago, but it
didn't go anywhere. Adding locks is too big a price to pay for the
ability to free a little bit of memory by destroying a vcpu.

We worry about regressions if we add locks in a lot of places.
I'm not very familiar with non-x86 architectures, so I'm not sure how far
we need to go to get vcpu hot-unplug working with the parking route.

Gleb,
Is anyone still working on RCUifying the CPU array access?
Is there any plan for this issue, or do we just leave it as it is?

Thanks,
Gu

> thanks
> -- PMM
Windows 7 stable running setting
Hi All,

I want to recap the previous discussion: I had BSODs because the
following settings were missing. After applying them, Windows 7 runs a
lot more stably than before, but a BSOD still appears once in a while.
Could anyone share KVM options that keep Windows 7 running stably for
months?

My missing settings:

hv_relaxed
hv_vapic
hv_spinlocks, retries 8191

I read a mention somewhere of something related to hv_reftime, but Google
didn't show any patch related to it. Any idea?

--
Thomas Lau
Director of Infrastructure
Tetrion Capital Limited
Direct: +852-3976-8903
Mobile: +852-9323-9670
Address: 20/F, IFC 1, Central district, Hong Kong
RE: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM
Here is some background of this KVMGT release:

- the major purpose is for early experiment of this technique in KVM, and
  throw out issues about adding in-kernel device model (or mediated
  pass-through framework) in KVM.

- KVMGT shares 90% code as XenGT, regarding to vGPU device model. The
  only difference is the in-kernel dm interface. The vGPU device model
  will be split and integrated in i915 driver. It will register to
  in-kernel dm framework provided either by Xen or KVM at boot time.
  Upstreaming of vGPU device model is already in progress, with valuable
  comments received from i915 community. However the refactoring mostly
  happen in XenGT repo now

- Now we have XenGT/KVMGT separately maintained, and KVMGT lags behind
  XenGT regarding to features and qualities. Likely you'll continue see
  stale code (like Xen inst decoder) for some time. In the future we plan
  to maintain a single kernel repo for both, so KVMGT can share same
  quality as XenGT once KVM in-kernel dm framework is stable.

- Regarding to Qemu hacks, KVMGT really doesn't have any different
  requirements as what have been discussed for GPU pass-through, e.g.
  about ISA bridge. Our implementation is based on an old Qemu repo, and
  honestly speaking not cleanly developed, because we know we can
  leverage from GPU pass-through support once it's in Qemu. At that time
  we'll leverage the same logic with minimal changes to hook KVMGT mgmt.
  APIs (e.g. create/destroy a vGPU instance). So we can ignore this area
  for now. :-)

Thanks
Kevin

> From: Paolo Bonzini
> Sent: Friday, December 05, 2014 9:04 PM
>
> On 05/12/2014 09:50, Gerd Hoffmann wrote:
>> A few comments on the kernel stuff (brief look so far, also
>> compile-tested only, intel gfx on my test machine is too old).
>>
>>  * Noticed the kernel bits don't even compile when configured as
>>    module. Everything (vgt, i915, kvm) must be compiled into the
>>    kernel.
>
> I'll add that the patch is basically impossible to review with all the
> XenGT bits still in. For example, the x86 emulator seems to be
> unnecessary for KVMGT, but I am not 100% sure.
>
> I would like a clear understanding of why/how Andrew Barnes was able to
> do i915 passthrough (GVT-d) without hacking the ISA bridge, and why this
> does not apply to GVT-g.
>
> Paolo
>
>>  * Design approach still seems to be i915 on vgt not the other way
>>    around.
>>
>> Qemu/SeaBIOS bits:
>>
>> I've seen the host bridge changes identity from i440fx to
>> copy-pci-ids-from-host. Guess the reason for this is that seabios uses
>> this device to figure whenever it is running on i440fx or q35. Correct?
>>
>> What are the exact requirements for the device? Must it match the host
>> exactly, to not confuse the guest intel graphics driver? Or would
>> something more recent -- such as the q35 emulation qemu has -- be good
>> enough to make things work (assuming we add support for the
>> graphic-related pci config space registers there)?
>>
>> The patch also adds a dummy isa bridge at 0x1f. Similar question here:
>> What exactly is needed here? Would things work if we simply use the q35
>> lpc device here?
>>
>> more to come after I've read the paper linked above ...
>>
>> cheers,
>>   Gerd
>>
>> ___
>> Intel-gfx mailing list
>> intel-...@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
RE: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM
> From: Daniel Vetter
> Sent: Monday, December 08, 2014 6:21 PM
>
> On Mon, Dec 08, 2014 at 10:55:01AM +0100, Gerd Hoffmann wrote:
>> On Sa, 2014-12-06 at 12:17 +0800, Jike Song wrote:
>>> I don't know that is exactly needed, we also need to have Windows
>>> driver considered. However, I'm quite confident that, if things gonna
>>> work for IGD passthrough, it gonna work for GVT-g.
>>
>> I'd suggest to focus on q35 emulation. q35 is new enough that a version
>> with integrated graphics exists, so the gap we have to close is *much*
>> smaller.
>>
>> In case guests expect a northbridge matching the chipset generation of
>> the graphics device (which I'd expect is the case, after digging a bit
>> in the igd and agpgart linux driver code) I think we should add proper
>> device emulation for them, i.e. comply q35-pcihost with
>> sandybridge-pcihost + ivybridge-pcihost + haswell-pcihost instead of
>> just copying over the pci ids from the host. Most likely all those
>> variants can share most of the emulation code.
>
> I don't think i915.ko should care about either northbridge nor pch on
> para-virtualized platforms. We do noodle around in there for the oddball
> memory controller setting and for some display stuff. But neither of
> that really applies to paravirtualized hw. And if there's any case like
> that we should patch it out (like we do with some of the runtime pm code
> already).

Agree. Now Allen is working on how to avoid those tricky platform
stickiness in Windows gfx driver. We should do same thing in Linux part
too.

Thanks
Kevin
Windows 7 0x0000005C
Hi All,

When I try to shut down Windows 7 on KVM, it shows a 0x0000005C BSOD.
Does anyone know why this happens?

--
Thomas Lau
Director of Infrastructure
Tetrion Capital Limited
Direct: +852-3976-8903
Mobile: +852-9323-9670
Address: 20/F, IFC 1, Central district, Hong Kong
Re: Windows 7 VM BSOD
Hi Vadim,

I want to quote back to your original post back in early 2014:

https://www.mail-archive.com/kvm@vger.kernel.org/msg99782.html

According to
http://msdn.microsoft.com/en-us/library/windows/hardware/ff559069(v=vs.85).aspx
the 0x5C means HAL_INITIALIZATION_FAILED

Problem matched exactly, which I am using CPU IvyBridge-EP and I got
same BSOD as well. Are we missing some hyperv feature?

On Wed, Dec 3, 2014 at 7:29 PM, Vadim Rozenfeld vroze...@redhat.com wrote:
> If you run WS2008(R2) or Win7 - always turn on relaxed timing.
> Otherwise it's just a matter of time when you hit 101 BSOD.
> Bugcheck 78 is quite rare one. What is your setup, and how easy it's
> reproducible?
>
> Best regards,
> Vadim.
>
> On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote:
>> it works on your side meaning that you had such issue but afterwards
>> it's all fixed by apply hv_relaxed ?
>>
>> On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=893857
>>>>
>>>> In fact I am doing testing now, but are we fixing one problem and
>>>> introduce other problem?!
>>>
>>> I'm not sure about this, but it works on my side,
>>> I think BSOD(error:0x0078) has been fixed,
>>> please show your environment.
>>>
>>> Thanks,
>>> Zhang Haoyu
>>>
>>>> On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote:
>>>>> Hi,
>>>>>
>>>>> How do I know if my qemu-kvm version support this?
>>>>>
>>>>> On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM
>>>>>>> installation was fine, but it does random reboot by itself, the
>>>>>>> error code is 0x0101, does anyone know how to fix this?
>>>>>>
>>>>>> Could you try hv_relaxed, like -cpu kvm64,hv_relaxed.
>>>>>>
>>>>>> Thanks,
>>>>>> Zhang Haoyu

--
Thomas Lau
Director of Infrastructure
Tetrion Capital Limited
Direct: +852-3976-8903
Mobile: +852-9323-9670
Address: 20/F, IFC 1, Central district, Hong Kong
Re: Windows 7 VM BSOD
On Tue, 2014-12-09 at 11:54 +0800, Thomas Lau wrote:
> Hi Vadim,
>
> I want to quote back to your original post back in early 2014:
>
> https://www.mail-archive.com/kvm@vger.kernel.org/msg99782.html
>
> According to
> http://msdn.microsoft.com/en-us/library/windows/hardware/ff559069(v=vs.85).aspx
> the 0x5C means HAL_INITIALIZATION_FAILED
>
> Problem matched exactly, which I am using CPU IvyBridge-EP and I got
> same BSOD as well.

Some CPU flags (feature bits) should be missing.
Can you try changing cpu type?

Best regards,
Vadim.

> Are we missing some hyperv feature?
Re: Windows 7 VM BSOD
I changed the CPU type to Westmere, and it boots up with a 0x0000005C BSOD.

On Tue, Dec 9, 2014 at 3:10 PM, Vadim Rozenfeld vroze...@redhat.com wrote:
> On Tue, 2014-12-09 at 11:54 +0800, Thomas Lau wrote:
>> Problem matched exactly, which I am using CPU IvyBridge-EP and I got
>> same BSOD as well.
>
> Some CPU flags (feature bits) should be missing.
> Can you try changing cpu type?
>
> Best regards,
> Vadim.

--
Thomas Lau
Director of Infrastructure
Tetrion Capital Limited
Direct: +852-3976-8903
Mobile: +852-9323-9670
Address: 20/F, IFC 1, Central district, Hong Kong