[PATCH] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
From: Avi Kivity a...@redhat.com Conflicts: arch/x86/include/asm/kvm.h Signed-off-by: Avi Kivity a...@redhat.com -- To unsubscribe from this list: send the line unsubscribe kvm-commits in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] kvm: qemu: fix host_cpuid() on i386
From: Avi Kivity a...@redhat.com

The addition of the ecx parameter broke cpuid on i386 as the constraints
changed.

Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/qemu/target-i386/helper.c b/qemu/target-i386/helper.c
index 08e26bf..6f20e9d 100644
--- a/qemu/target-i386/helper.c
+++ b/qemu/target-i386/helper.c
@@ -1425,10 +1425,10 @@ static void host_cpuid(uint32_t function, uint32_t count,
 #else
     asm volatile("pusha \n\t"
                  "cpuid \n\t"
-                 "mov %%eax, 0(%1) \n\t"
-                 "mov %%ebx, 4(%1) \n\t"
-                 "mov %%ecx, 8(%1) \n\t"
-                 "mov %%edx, 12(%1) \n\t"
+                 "mov %%eax, 0(%2) \n\t"
+                 "mov %%ebx, 4(%2) \n\t"
+                 "mov %%ecx, 8(%2) \n\t"
+                 "mov %%edx, 12(%2) \n\t"
                  "popa"
                  : : "a"(function), "c"(count), "S"(vec)
                  : "memory", "cc");
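The breakage fixed by this patch comes from positional operands: adding the count input shifted vec from %1 to %2 while the template still said %1. A minimal sketch, assuming a GCC-compatible compiler, of the same call written with a named operand so that adding an input cannot renumber it. The names host_cpuid_sketch and HOST_CPUID_AVAILABLE are hypothetical and not taken from the qemu tree; on non-x86 hosts the sketch only zeroes the output.

```c
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#define HOST_CPUID_AVAILABLE 1
#else
#define HOST_CPUID_AVAILABLE 0
#endif

/* Hypothetical stand-in for host_cpuid(): fills vec[0..3] with eax, ebx, ecx, edx. */
static void host_cpuid_sketch(uint32_t function, uint32_t count, uint32_t *vec)
{
#if defined(__x86_64__)
    /* Enough registers are free here to bind all four outputs directly. */
    __asm__ volatile("cpuid"
                     : "=a"(vec[0]), "=b"(vec[1]), "=c"(vec[2]), "=d"(vec[3])
                     : "0"(function), "2"(count)
                     : "cc");
#elif defined(__i386__)
    /* %[vec] names the operand, so its position in the input list no longer matters. */
    __asm__ volatile("pusha \n\t"
                     "cpuid \n\t"
                     "mov %%eax, 0(%[vec]) \n\t"
                     "mov %%ebx, 4(%[vec]) \n\t"
                     "mov %%ecx, 8(%[vec]) \n\t"
                     "mov %%edx, 12(%[vec]) \n\t"
                     "popa"
                     : : "a"(function), "c"(count), [vec] "S"(vec)
                     : "memory", "cc");
#else
    /* No cpuid instruction on this host: report all zeroes. */
    (void)function;
    (void)count;
    memset(vec, 0, 4 * sizeof(*vec));
#endif
}
```

Calling host_cpuid_sketch(0, 0, vec) on an x86 host should leave the first four bytes of the vendor string in vec[1].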
Re: Current KVM head crashes on startup
On (Wed) Feb 18 2009 [13:21:26], Amit Shah wrote:
> On (Tue) Feb 17 2009 [12:47:10], Brian Kress wrote:
> > When I try to run KVM built off the current head, it crashes with a
> > Segmentation fault. KVM-84 does not. Seems to be dealing with the CPUID
> > changes:
> >
> > 0x081a5c70 in host_cpuid () at /home/kressb/kvm/src/qemu/target-i386/helper.c:1426
> > 1426        asm volatile("pusha \n\t"
>
> This looks like some kind of stack corruption on 32-bit:
>
> 1472        if (kvm_enabled())
> (gdb)
> 1473        host_cpuid(0, 0, NULL, ebx, ecx, edx);
> (gdb)
> Program received signal SIGSEGV, Segmentation fault.
> 0x081a2d60 in host_cpuid (function=10, count=1231384169, eax=0x0, ebx=0xadfc1914, ecx=0xadfc1910, edx=0xadfc190c) at /home/amit/src/kvm-userspace/qemu/target-i386/helper.c:1426
> 1426        asm volatile("pusha \n\t"
>
> I don't see this on 64-bit. Investigating.

Avi, what's the reason for doing this in the host_cpuid code? As I see it, the first version should work for both 64-bit and 32-bit code.

#ifdef __x86_64__
    asm volatile("cpuid"
                 : "=a"(vec[0]), "=b"(vec[1]),
                   "=c"(vec[2]), "=d"(vec[3])
                 : "0"(function), "c"(count) : "cc");
#else
    asm volatile("pusha \n\t"
                 "cpuid \n\t"
                 "mov %%eax, 0(%1) \n\t"
                 "mov %%ebx, 4(%1) \n\t"
                 "mov %%ecx, 8(%1) \n\t"
                 "mov %%edx, 12(%1) \n\t"
                 "popa"
                 : : "a"(function), "c"(count), "S"(vec)
                 : "memory", "cc");
#endif

Amit
--
To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Current KVM head crashes on startup
Amit Shah wrote:
> On (Wed) Feb 18 2009 [13:21:26], Amit Shah wrote:
> > On (Tue) Feb 17 2009 [12:47:10], Brian Kress wrote:
> > > When I try to run KVM built off the current head, it crashes with a
> > > Segmentation fault. KVM-84 does not. Seems to be dealing with the CPUID
> > > changes:
> > >
> > > 0x081a5c70 in host_cpuid () at /home/kressb/kvm/src/qemu/target-i386/helper.c:1426
> > > 1426        asm volatile("pusha \n\t"
> >
> > This looks like some kind of stack corruption on 32-bit:
> >
> > 1472        if (kvm_enabled())
> > (gdb)
> > 1473        host_cpuid(0, 0, NULL, ebx, ecx, edx);
> > (gdb)
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x081a2d60 in host_cpuid (function=10, count=1231384169, eax=0x0, ebx=0xadfc1914, ecx=0xadfc1910, edx=0xadfc190c) at /home/amit/src/kvm-userspace/qemu/target-i386/helper.c:1426
> > 1426        asm volatile("pusha \n\t"
> >
> > I don't see this on 64-bit. Investigating.
>
> Avi, what's the reason for doing this in the host_cpuid code? As I see it, the first version should work for both 64-bit and 32-bit code.
>
> #ifdef __x86_64__
>     asm volatile("cpuid"
>                  : "=a"(vec[0]), "=b"(vec[1]),
>                    "=c"(vec[2]), "=d"(vec[3])
>                  : "0"(function), "c"(count) : "cc");
> #else
>     asm volatile("pusha \n\t"
>                  "cpuid \n\t"
>                  "mov %%eax, 0(%1) \n\t"
>                  "mov %%ebx, 4(%1) \n\t"
>                  "mov %%ecx, 8(%1) \n\t"
>                  "mov %%edx, 12(%1) \n\t"
>                  "popa"
>                  : : "a"(function), "c"(count), "S"(vec)
>                  : "memory", "cc");
> #endif

The first version generates too much register pressure for some compilers on i386, leading to compilation failures.

The second version is surely wrong, though? Counting from zero, the vec parameter would be %2, not %1.

(copied Anthony)

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: Current KVM head crashes on startup
On (Wed) Feb 18 2009 [08:49:33], Avi Kivity wrote:
> Amit Shah wrote:
> > On (Wed) Feb 18 2009 [13:21:26], Amit Shah wrote:
> > > On (Tue) Feb 17 2009 [12:47:10], Brian Kress wrote:
> > > > When I try to run KVM built off the current head, it crashes with a
> > > > Segmentation fault. KVM-84 does not. Seems to be dealing with the CPUID
> > > > changes:
> > > >
> > > > 0x081a5c70 in host_cpuid () at /home/kressb/kvm/src/qemu/target-i386/helper.c:1426
> > > > 1426        asm volatile("pusha \n\t"
> > >
> > > This looks like some kind of stack corruption on 32-bit:
> > >
> > > 1472        if (kvm_enabled())
> > > (gdb)
> > > 1473        host_cpuid(0, 0, NULL, ebx, ecx, edx);
> > > (gdb)
> > > Program received signal SIGSEGV, Segmentation fault.
> > > 0x081a2d60 in host_cpuid (function=10, count=1231384169, eax=0x0, ebx=0xadfc1914, ecx=0xadfc1910, edx=0xadfc190c) at /home/amit/src/kvm-userspace/qemu/target-i386/helper.c:1426
> > > 1426        asm volatile("pusha \n\t"
> > >
> > > I don't see this on 64-bit. Investigating.
> >
> > Avi, what's the reason for doing this in the host_cpuid code? As I see it, the first version should work for both 64-bit and 32-bit code.
> >
> > #ifdef __x86_64__
> >     asm volatile("cpuid"
> >                  : "=a"(vec[0]), "=b"(vec[1]),
> >                    "=c"(vec[2]), "=d"(vec[3])
> >                  : "0"(function), "c"(count) : "cc");
> > #else
> >     asm volatile("pusha \n\t"
> >                  "cpuid \n\t"
> >                  "mov %%eax, 0(%1) \n\t"
> >                  "mov %%ebx, 4(%1) \n\t"
> >                  "mov %%ecx, 8(%1) \n\t"
> >                  "mov %%edx, 12(%1) \n\t"
> >                  "popa"
> >                  : : "a"(function), "c"(count), "S"(vec)
> >                  : "memory", "cc");
> > #endif
>
> The first version generates too much register pressure for some compilers on i386, leading to compilation failures. The second version

Is it still valid? I tried with gcc-4.1.2 and that worked fine with the first version. Should we just use that version instead?

> is surely wrong, though? Counting from zero, the vec parameter would be %2, not %1.

Looks like I missed out updating that when I introduced 'count'. Fixing that fixes the problem.

Amit
[PATCH 0/3 v3] MSI-X enabling
Updated the patchset following Marcelo's and Avi's comments. Please also review the MSI/MSI-X userspace patch. Thanks!

--
regards
Yang, Sheng
[PATCH 3/3] KVM: Enable MSI-X for KVM assigned device
This patch finally enables MSI-X.

What we need for MSI-X:

1. Intercept one page in the MMIO region of the device, so that we can get the MSI-X table the guest wants and set up the real one. This is now done by the guest and passed to the kernel using the ioctls KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY.

2. Information about the incoming interrupt. One device can now have more than one interrupt, and they are all handled by one workqueue structure, so we need to identify them. The previous patch adds gsi_msg_pending_bitmap to do this.

3. A mapping from host IRQ to guest gsi, as well as from guest gsi to the real MSI/MSI-X message address/data. We use the same entry number for the host and the guest here, so that it is easy to find the correlated guest gsi.

What we lack for now:

1. The PCI spec says nothing can exist alongside the MSI-X table in the same page of the MMIO region, except the pending bits. The patch ignores the pending bits as a first step (so they are always 0 - no pending).

2. The PCI spec allows the MSI-X table to be changed dynamically. That means the OS can enable MSI-X, then mask one MSI-X entry, modify it, and unmask it. The patch doesn't support this, and Linux doesn't work this way either.

3. The patch doesn't implement MSI-X mask-all or masking of a single entry. I will implement the former in drivers/pci/msi.c later. For a single entry, userspace should be responsible for handling it.

Signed-off-by: Sheng Yang sh...@linux.intel.com --- include/linux/kvm.h |8 virt/kvm/kvm_main.c | 107 --- 2 files changed, 109 insertions(+), 6 deletions(-) diff --git a/include/linux/kvm.h b/include/linux/kvm.h index a2dfbe0..78480d0 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -440,6 +440,9 @@ struct kvm_irq_routing { }; #endif +#if defined(CONFIG_X86) +#define KVM_CAP_DEVICE_MSIX 26 +#endif /* * ioctls for VM fds @@ -597,6 +600,11 @@ struct kvm_assigned_irq { #define KVM_DEV_IRQ_ASSIGN_MSI_ACTION KVM_DEV_IRQ_ASSIGN_ENABLE_MSI #define KVM_DEV_IRQ_ASSIGN_ENABLE_MSI (1 0) +#define KVM_DEV_IRQ_ASSIGN_MSIX_ACTION (KVM_DEV_IRQ_ASSIGN_ENABLE_MSIX |\ + KVM_DEV_IRQ_ASSIGN_MASK_MSIX) +#define KVM_DEV_IRQ_ASSIGN_ENABLE_MSIX (1 1) +#define KVM_DEV_IRQ_ASSIGN_MASK_MSIX(1 2) + struct kvm_assigned_msix_nr { __u32 assigned_dev_id; __u16 entry_nr; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 4010802..d3acb37 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -280,13 +280,33 @@ static void kvm_free_assigned_irq(struct kvm *kvm, * now, the kvm state is still legal for probably we also have to wait * interrupt_work done. 
*/ - disable_irq_nosync(assigned_dev-host_irq); - cancel_work_sync(assigned_dev-interrupt_work); + if (assigned_dev-irq_requested_type KVM_ASSIGNED_DEV_MSIX) { + int i; + for (i = 0; i assigned_dev-entries_nr; i++) + disable_irq_nosync(assigned_dev- + host_msix_entries[i].vector); + + cancel_work_sync(assigned_dev-interrupt_work); + + for (i = 0; i assigned_dev-entries_nr; i++) + free_irq(assigned_dev-host_msix_entries[i].vector, +(void *)assigned_dev); + + assigned_dev-entries_nr = 0; + kfree(assigned_dev-host_msix_entries); + kfree(assigned_dev-guest_msix_entries); + pci_disable_msix(assigned_dev-dev); + } else { + /* Deal with MSI and INTx */ + disable_irq_nosync(assigned_dev-host_irq); + cancel_work_sync(assigned_dev-interrupt_work); - free_irq(assigned_dev-host_irq, (void *)assigned_dev); + free_irq(assigned_dev-host_irq, (void *)assigned_dev); - if (assigned_dev-irq_requested_type KVM_ASSIGNED_DEV_HOST_MSI) - pci_disable_msi(assigned_dev-dev); + if (assigned_dev-irq_requested_type + KVM_ASSIGNED_DEV_HOST_MSI) + pci_disable_msi(assigned_dev-dev); + } assigned_dev-irq_requested_type = 0; } @@ -415,6 +435,69 @@ static int assigned_device_update_msi(struct kvm *kvm, adev-irq_requested_type |= KVM_ASSIGNED_DEV_HOST_MSI; return 0; } + +static int assigned_device_update_msix(struct kvm *kvm, + struct kvm_assigned_dev_kernel *adev, + struct kvm_assigned_irq *airq) +{ + /* TODO Deal with KVM_DEV_IRQ_ASSIGNED_MASK_MSIX */ + int i, r; + + adev-ack_notifier.gsi = -1; + + if (irqchip_in_kernel(kvm)) { + if (airq-flags KVM_DEV_IRQ_ASSIGN_MASK_MSIX) { + printk(KERN_WARNING + kvm: unsupported mask MSI-X, flags 0x%x!\n, + airq-flags); + return 0; +
[PATCH 2/3] KVM: Add gsi_msg_pending_bitmap for MSI-X
We have to handle more than one interrupt with one handler for MSI-X. So we need a bitmap to track the triggered interrupts. Signed-off-by: Sheng Yang sh...@linux.intel.com --- include/linux/kvm_host.h |5 +- virt/kvm/kvm_main.c | 102 - 2 files changed, 102 insertions(+), 5 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index b105ada..6e354af 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -145,6 +145,8 @@ struct kvm { #ifdef CONFIG_HAVE_KVM_IRQCHIP struct list_head irq_routing; /* of kvm_kernel_irq_routing_entry */ struct hlist_head mask_notifier_list; +#define KVM_MAX_IRQ_ROUTES 1024 + DECLARE_BITMAP(irq_routes_pending_bitmap, KVM_MAX_IRQ_ROUTES); #endif #ifdef KVM_ARCH_WANT_MMU_NOTIFIER @@ -336,6 +338,7 @@ struct kvm_assigned_dev_kernel { #define KVM_ASSIGNED_DEV_GUEST_MSI (1 1) #define KVM_ASSIGNED_DEV_HOST_INTX (1 8) #define KVM_ASSIGNED_DEV_HOST_MSI (1 9) +#define KVM_ASSIGNED_DEV_MSIX ((1 2) | (1 10)) unsigned long irq_requested_type; int irq_source_id; int flags; @@ -503,8 +506,6 @@ static inline int mmu_notifier_retry(struct kvm_vcpu *vcpu, unsigned long mmu_se #ifdef CONFIG_HAVE_KVM_IRQCHIP -#define KVM_MAX_IRQ_ROUTES 1024 - int kvm_setup_default_irq_routing(struct kvm *kvm); int kvm_set_irq_routing(struct kvm *kvm, const struct kvm_irq_routing_entry *entries, diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index bb4aa73..4010802 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -95,25 +95,113 @@ static struct kvm_assigned_dev_kernel *kvm_find_assigned_dev(struct list_head *h return NULL; } +static int find_host_irq_from_gsi(struct kvm_assigned_dev_kernel *assigned_dev, + u32 gsi) +{ + int i, entry, irq; + struct msix_entry *host_msix_entries, *guest_msix_entries; + + host_msix_entries = assigned_dev-host_msix_entries; + guest_msix_entries = assigned_dev-guest_msix_entries; + + entry = -1; + irq = 0; + for (i = 0; i assigned_dev-entries_nr; i++) + if (gsi == (guest_msix_entries + 
i)-vector) { + entry = (guest_msix_entries + i)-entry; + break; + } + if (entry 0) { + printk(KERN_WARNING Fail to find correlated MSI-X entry!\n); + return 0; + } + for (i = 0; i assigned_dev-entries_nr; i++) + if (entry == (host_msix_entries + i)-entry) { + irq = (host_msix_entries + i)-vector; + break; + } + if (irq == 0) { + printk(KERN_WARNING Fail to find correlated MSI-X irq!\n); + return 0; + } + + return irq; +} + +static int find_gsi_from_host_irq(struct kvm_assigned_dev_kernel *assigned_dev, + int irq) +{ + int i, entry, gsi; + struct msix_entry *host_msix_entries, *guest_msix_entries; + + host_msix_entries = assigned_dev-host_msix_entries; + guest_msix_entries = assigned_dev-guest_msix_entries; + + entry = -1; + gsi = -1; + for (i = 0; i assigned_dev-entries_nr; i++) + if (irq == (host_msix_entries + i)-vector) { + entry = (host_msix_entries + i)-entry; + break; + } + if (entry 0) { + printk(KERN_WARNING Fail to find correlated MSI-X entry!\n); + return 0; + } + for (i = 0; i assigned_dev-entries_nr; i++) + if (entry == (guest_msix_entries + i)-entry) { + gsi = (guest_msix_entries + i)-vector; + break; + } + if (gsi 0) { + printk(KERN_WARNING Fail to find correlated MSI-X gsi!\n); + return 0; + } + + return gsi; +} + static void kvm_assigned_dev_interrupt_work_handler(struct work_struct *work) { struct kvm_assigned_dev_kernel *assigned_dev; + struct kvm *kvm; + u32 gsi; + int irq; assigned_dev = container_of(work, struct kvm_assigned_dev_kernel, interrupt_work); + kvm = assigned_dev-kvm; /* This is taken to safely inject irq inside the guest. When * the interrupt injection (or the ioapic code) uses a * finer-grained lock, update this */ - mutex_lock(assigned_dev-kvm-lock); - kvm_set_irq(assigned_dev-kvm, assigned_dev-irq_source_id, - assigned_dev-guest_irq, 1); + mutex_lock(kvm-lock); +handle_irq: + if (assigned_dev-irq_requested_type KVM_ASSIGNED_DEV_MSIX) { + gsi = find_first_bit(kvm-irq_routes_pending_bitmap, +KVM_MAX_IRQ_ROUTES); +
[PATCH 1/3] KVM: Ioctls for init MSI-X entry
Introduce KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY two ioctls. This two ioctls are used by userspace to specific guest device MSI-X entry number and correlate MSI-X entry with GSI during the initialization stage. MSI-X should be well initialzed before enabling. Don't support change MSI-X entry number for now. Signed-off-by: Sheng Yang sh...@linux.intel.com --- include/linux/kvm.h | 16 +++ include/linux/kvm_host.h |3 + virt/kvm/kvm_main.c | 103 ++ 3 files changed, 122 insertions(+), 0 deletions(-) diff --git a/include/linux/kvm.h b/include/linux/kvm.h index 2163b3d..a2dfbe0 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -475,6 +475,8 @@ struct kvm_irq_routing { #define KVM_ASSIGN_IRQ _IOR(KVMIO, 0x70, \ struct kvm_assigned_irq) #define KVM_REINJECT_CONTROL _IO(KVMIO, 0x71) +#define KVM_SET_MSIX_NR _IOR(KVMIO, 0x72, struct kvm_assigned_msix_nr) +#define KVM_SET_MSIX_ENTRY _IOR(KVMIO, 0x73, struct kvm_assigned_msix_entry) /* * ioctls for vcpu fds @@ -595,4 +597,18 @@ struct kvm_assigned_irq { #define KVM_DEV_IRQ_ASSIGN_MSI_ACTION KVM_DEV_IRQ_ASSIGN_ENABLE_MSI #define KVM_DEV_IRQ_ASSIGN_ENABLE_MSI (1 0) +struct kvm_assigned_msix_nr { + __u32 assigned_dev_id; + __u16 entry_nr; + __u16 padding; +}; + +#define KVM_MAX_MSIX_PER_DEV 512 +struct kvm_assigned_msix_entry { + __u32 assigned_dev_id; + __u32 gsi; + __u16 entry; /* The index of entry in the MSI-X table */ + __u16 padding[3]; +}; + #endif diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 7c7096d..b105ada 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -326,9 +326,12 @@ struct kvm_assigned_dev_kernel { int assigned_dev_id; int host_busnr; int host_devfn; + unsigned int entries_nr; int host_irq; bool host_irq_disabled; + struct msix_entry *host_msix_entries; int guest_irq; + struct msix_entry *guest_msix_entries; #define KVM_ASSIGNED_DEV_GUEST_INTX(1 0) #define KVM_ASSIGNED_DEV_GUEST_MSI (1 1) #define KVM_ASSIGNED_DEV_HOST_INTX (1 8) diff --git 
a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 266bdaf..bb4aa73 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1593,6 +1593,87 @@ static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset) return 0; } +static int kvm_vm_ioctl_set_msix_nr(struct kvm *kvm, + struct kvm_assigned_msix_nr *entry_nr) +{ + int r = 0; + struct kvm_assigned_dev_kernel *adev; + + mutex_lock(kvm-lock); + + adev = kvm_find_assigned_dev(kvm-arch.assigned_dev_head, + entry_nr-assigned_dev_id); + if (!adev) { + r = -EINVAL; + goto msix_nr_out; + } + + if (adev-entries_nr == 0) { + adev-entries_nr = entry_nr-entry_nr; + if (adev-entries_nr == 0 || + adev-entries_nr = KVM_MAX_MSIX_PER_DEV) + goto msix_nr_out; + + adev-host_msix_entries = kzalloc(sizeof(struct msix_entry) * + entry_nr-entry_nr, + GFP_KERNEL); + if (!adev-host_msix_entries) { + printk(KERN_ERR no memory for host msix entries!\n); + r = -ENOMEM; + goto msix_nr_out; + } + adev-guest_msix_entries = kzalloc(sizeof(struct msix_entry) * + entry_nr-entry_nr, + GFP_KERNEL); + if (!adev-guest_msix_entries) { + printk(KERN_ERR no memory for host msix entries!\n); + kfree(adev-host_msix_entries); + r = -ENOMEM; + goto msix_nr_out; + } + } else + printk(KERN_WARNING kvm: not allow recall set msix nr!\n); +msix_nr_out: + mutex_unlock(kvm-lock); + return r; +} + +static int kvm_vm_ioctl_set_msix_entry(struct kvm *kvm, + struct kvm_assigned_msix_entry *entry) +{ + int r = 0, i; + struct kvm_assigned_dev_kernel *adev; + + mutex_lock(kvm-lock); + + adev = kvm_find_assigned_dev(kvm-arch.assigned_dev_head, + entry-assigned_dev_id); + + if (!adev) { + r = -EINVAL; + goto msix_entry_out; + } + + for (i = 0; i adev-entries_nr; i++) + if (adev-guest_msix_entries[i].vector == 0 || + adev-guest_msix_entries[i].entry == entry-entry) { + adev-guest_msix_entries[i].entry = entry-entry; +
Re: Current KVM head crashes on startup
Amit Shah wrote:
> > The first version generates too much register pressure for some compilers on i386, leading to compilation failures. The second version
>
> Is it still valid? I tried with gcc-4.1.2 and that worked fine with the first version. Should we just use that version instead?

I don't see why it would change, unless you can destroy all copies of the compilers that fail with it.
Re: [PATCH 1/3] KVM: Ioctls for init MSI-X entry
Sheng Yang wrote:
> Introduce KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY two ioctls.
>
> This two ioctls are used by userspace to specific guest device MSI-X entry number and correlate MSI-X entry with GSI during the initialization stage.
>
> MSI-X should be well initialzed before enabling.
>
> Don't support change MSI-X entry number for now.

Sorry, this has been reviewed quite a bit but I found a few issues:

> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> index 2163b3d..a2dfbe0 100644
> --- a/include/linux/kvm.h
> +++ b/include/linux/kvm.h
> @@ -475,6 +475,8 @@ struct kvm_irq_routing {
>  #define KVM_ASSIGN_IRQ _IOR(KVMIO, 0x70, \
>  			    struct kvm_assigned_irq)
>  #define KVM_REINJECT_CONTROL _IO(KVMIO, 0x71)
> +#define KVM_SET_MSIX_NR _IOR(KVMIO, 0x72, struct kvm_assigned_msix_nr)
> +#define KVM_SET_MSIX_ENTRY _IOR(KVMIO, 0x73, struct kvm_assigned_msix_entry)

KVM_SET_ASSIGNED_... so it's associated with device assignment, not generic.

Should be _IOW, not _IOR. Looks like KVM_ASSIGN_IRQ is broken...

> +static int kvm_vm_ioctl_set_msix_nr(struct kvm *kvm,
> +				    struct kvm_assigned_msix_nr *entry_nr)
> +{
> +	int r = 0;
> +	struct kvm_assigned_dev_kernel *adev;
> +
> +	mutex_lock(&kvm->lock);
> +
> +	adev = kvm_find_assigned_dev(&kvm->arch.assigned_dev_head,
> +				      entry_nr->assigned_dev_id);
> +	if (!adev) {
> +		r = -EINVAL;
> +		goto msix_nr_out;
> +	}
> +
> +	if (adev->entries_nr == 0) {
> +		adev->entries_nr = entry_nr->entry_nr;
> +		if (adev->entries_nr == 0 ||
> +		    adev->entries_nr >= KVM_MAX_MSIX_PER_DEV)
> +			goto msix_nr_out;

r == 0 here, needs a meaningful error number.

> +
> +		adev->host_msix_entries = kzalloc(sizeof(struct msix_entry) *
> +						entry_nr->entry_nr,
> +						GFP_KERNEL);
> +		if (!adev->host_msix_entries) {
> +			printk(KERN_ERR "no memory for host msix entries!\n");
> +			r = -ENOMEM;

Drop the printk, -ENOMEM is enough.

> +			goto msix_nr_out;
> +		}
> +		adev->guest_msix_entries = kzalloc(sizeof(struct msix_entry) *
> +						entry_nr->entry_nr,
> +						GFP_KERNEL);
> +		if (!adev->guest_msix_entries) {
> +			printk(KERN_ERR "no memory for host msix entries!\n");

Ditto.

> +			kfree(adev->host_msix_entries);
> +			r = -ENOMEM;
> +			goto msix_nr_out;
> +		}
> +	} else
> +		printk(KERN_WARNING "kvm: not allow recall set msix nr!\n");

Drop printk, add error.

> +msix_nr_out:
> +	mutex_unlock(&kvm->lock);
> +	return r;
> +}
> +
> +static int kvm_vm_ioctl_set_msix_entry(struct kvm *kvm,
> +				       struct kvm_assigned_msix_entry *entry)
> +{
> +	int r = 0, i;
> +	struct kvm_assigned_dev_kernel *adev;
> +
> +	mutex_lock(&kvm->lock);
> +
> +	adev = kvm_find_assigned_dev(&kvm->arch.assigned_dev_head,
> +				      entry->assigned_dev_id);
> +
> +	if (!adev) {
> +		r = -EINVAL;
> +		goto msix_entry_out;
> +	}
> +
> +	for (i = 0; i < adev->entries_nr; i++)
> +		if (adev->guest_msix_entries[i].vector == 0 ||
> +		    adev->guest_msix_entries[i].entry == entry->entry) {
> +			adev->guest_msix_entries[i].entry = entry->entry;
> +			adev->guest_msix_entries[i].vector = entry->gsi;
> +			adev->host_msix_entries[i].entry = entry->entry;
> +			break;
> +		}
> +	if (i == adev->entries_nr) {
> +		printk(KERN_ERR "kvm: Too much entries for MSI-X!\n");

Drop.

> +		r = -ENOSPC;
> +		goto msix_entry_out;
> +	}
> +
> +msix_entry_out:
> +	mutex_unlock(&kvm->lock);
> +
> +	return r;
> +}
> +
>  static long kvm_vcpu_ioctl(struct file *filp,
>  			   unsigned int ioctl, unsigned long arg)
>  {
> @@ -1917,7 +1998,29 @@ static long kvm_vm_ioctl(struct file *filp,
>  		vfree(entries);
>  		break;
>  	}
> +#ifdef KVM_CAP_DEVICE_MSIX
> +	case KVM_SET_MSIX_NR: {
> +		struct kvm_assigned_msix_nr entry_nr;
> +		r = -EFAULT;
> +		if (copy_from_user(&entry_nr, argp, sizeof entry_nr))
> +			goto out;
> +		r = kvm_vm_ioctl_set_msix_nr(kvm, &entry_nr);
> +		if (r)
> +			goto out;
> +		break;
> +	}
> +	case KVM_SET_MSIX_ENTRY: {
> +		struct kvm_assigned_msix_entry entry;
> +		r = -EFAULT;
> +		if (copy_from_user(&entry, argp, sizeof entry))
> +			goto out;
> +
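The review comments on kvm_vm_ioctl_set_msix_nr ask for an error code on every failure path in place of the printk calls. A minimal userspace sketch of that shape follows. It is not the kernel code: the struct and function names are hypothetical, calloc stands in for kzalloc, and the choice of -EINVAL for a bad count and -EBUSY for a repeated call is an assumption, since the review does not name the codes.

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_MSIX_PER_DEV 512   /* mirrors KVM_MAX_MSIX_PER_DEV from the patch */

/* Hypothetical stand-ins for struct msix_entry and the assigned-device record. */
struct msix_entry_sketch {
    unsigned vector;
    unsigned short entry;
};

struct assigned_dev_sketch {
    unsigned entries_nr;
    struct msix_entry_sketch *host_entries;
    struct msix_entry_sketch *guest_entries;
};

/* Returns 0 on success or a negative errno value; never fails silently. */
static int set_msix_nr_sketch(struct assigned_dev_sketch *adev, unsigned nr)
{
    struct msix_entry_sketch *host, *guest;

    if (adev->entries_nr != 0)
        return -EBUSY;              /* assumed code for "already set" */
    if (nr == 0 || nr >= MAX_MSIX_PER_DEV)
        return -EINVAL;             /* assumed code for a bad entry count */

    host = calloc(nr, sizeof(*host));
    if (!host)
        return -ENOMEM;
    guest = calloc(nr, sizeof(*guest));
    if (!guest) {
        free(host);
        return -ENOMEM;
    }

    /* Commit only after every step succeeded, so a failure leaves adev untouched. */
    adev->host_entries = host;
    adev->guest_entries = guest;
    adev->entries_nr = nr;
    return 0;
}
```

Assigning entries_nr last is a design choice of this sketch: in the quoted patch the count is stored before it is validated, so a rejected value would stay in place.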
Re: [PATCH 3/3] KVM: Enable MSI-X for KVM assigned device
Sheng Yang wrote: This patch finally enable MSI-X. What we need for MSI-X: 1. Intercept one page in MMIO region of device. So that we can get guest desired MSI-X table and set up the real one. Now this have been done by guest, and transfer to kernel using ioctl KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY. 2. Information for incoming interrupt. Now one device can have more than one interrupt, and they are all handled by one workqueue structure. So we need to identify them. The previous patch enable gsi_msg_pending_bitmap get this done. 3. Mapping from host IRQ to guest gsi as well as guest gsi to real MSI/MSI-X message address/data. We used same entry number for the host and guest here, so that it's easy to find the correlated guest gsi. What we lack for now: 1. The PCI spec said nothing can existed with MSI-X table in the same page of MMIO region, except pending bits. The patch ignore pending bits as the first step (so they are always 0 - no pending). 2. The PCI spec allowed to change MSI-X table dynamically. That means, the OS can enable MSI-X, then mask one MSI-X entry, modify it, and unmask it. The patch didn't support this, and Linux also don't work in this way. 3. The patch didn't implement MSI-X mask all and mask single entry. I would implement the former in driver/pci/msi.c later. And for single entry, userspace should have reposibility to handle it. Signed-off-by: Sheng Yang sh...@linux.intel.com --- include/linux/kvm.h |8 virt/kvm/kvm_main.c | 107 --- 2 files changed, 109 insertions(+), 6 deletions(-) diff --git a/include/linux/kvm.h b/include/linux/kvm.h index a2dfbe0..78480d0 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -440,6 +440,9 @@ struct kvm_irq_routing { }; #endif +#if defined(CONFIG_X86) +#define KVM_CAP_DEVICE_MSIX 26 +#endif We switched to a different way of depending on CONFIG_X86, see the other KVM_CAP defines. 
struct kvm_assigned_msix_nr { __u32 assigned_dev_id; __u16 entry_nr; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 4010802..d3acb37 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -280,13 +280,33 @@ static void kvm_free_assigned_irq(struct kvm *kvm, * now, the kvm state is still legal for probably we also have to wait * interrupt_work done. */ - disable_irq_nosync(assigned_dev-host_irq); - cancel_work_sync(assigned_dev-interrupt_work); + if (assigned_dev-irq_requested_type KVM_ASSIGNED_DEV_MSIX) { + int i; + for (i = 0; i assigned_dev-entries_nr; i++) + disable_irq_nosync(assigned_dev- + host_msix_entries[i].vector); + + cancel_work_sync(assigned_dev-interrupt_work); + + for (i = 0; i assigned_dev-entries_nr; i++) + free_irq(assigned_dev-host_msix_entries[i].vector, +(void *)assigned_dev); + + assigned_dev-entries_nr = 0; + kfree(assigned_dev-host_msix_entries); + kfree(assigned_dev-guest_msix_entries); + pci_disable_msix(assigned_dev-dev); + } else { + /* Deal with MSI and INTx */ + disable_irq_nosync(assigned_dev-host_irq); + cancel_work_sync(assigned_dev-interrupt_work); How about always have an array? That will also allow us to deal with INTx where x=B,C,D. Currently for MSI and INTx the array will hold just one active element. - free_irq(assigned_dev-host_irq, (void *)assigned_dev); + free_irq(assigned_dev-host_irq, (void *)assigned_dev); - if (assigned_dev-irq_requested_type KVM_ASSIGNED_DEV_HOST_MSI) - pci_disable_msi(assigned_dev-dev); + if (assigned_dev-irq_requested_type + KVM_ASSIGNED_DEV_HOST_MSI) + pci_disable_msi(assigned_dev-dev); + } All those flags and bits are worrying me. Maybe each entry in the array can have an ops member, and disabling would work by calling -ops-disable(). We can do that later. 
assigned_dev-irq_requested_type = 0; } @@ -415,6 +435,69 @@ static int assigned_device_update_msi(struct kvm *kvm, adev-irq_requested_type |= KVM_ASSIGNED_DEV_HOST_MSI; return 0; } + +static int assigned_device_update_msix(struct kvm *kvm, + struct kvm_assigned_dev_kernel *adev, + struct kvm_assigned_irq *airq) +{ + /* TODO Deal with KVM_DEV_IRQ_ASSIGNED_MASK_MSIX */ + int i, r; + + adev-ack_notifier.gsi = -1; + + if (irqchip_in_kernel(kvm)) { + if (airq-flags KVM_DEV_IRQ_ASSIGN_MASK_MSIX) { + printk(KERN_WARNING + kvm: unsupported mask MSI-X, flags 0x%x!\n, + airq-flags); error, not
Re: [PATCH 2/3] KVM: Add gsi_msg_pending_bitmap for MSI-X
Sheng Yang wrote:
> We have to handle more than one interrupt with one handler for MSI-X. So we need a bitmap to track the triggered interrupts.

Can you explain why?
Re: [PATCH 2/3] KVM: Add gsi_msg_pending_bitmap for MSI-X
On Wednesday 18 February 2009 19:00:53 Avi Kivity wrote:
> Sheng Yang wrote:
> > We have to handle more than one interrupt with one handler for MSI-X. So we need a bitmap to track the triggered interrupts.
>
> Can you explain why?

Otherwise, how can we know which interrupt happened? Currently we schedule the work for later, and the irq information is no longer available at that time.

--
regards
Yang, Sheng
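A minimal userspace sketch of the pending-bitmap idea discussed here, assuming one bit per GSI: the interrupt side records which GSI fired, and the deferred work drains the set bits later. The names are hypothetical, and the plain bit operations are an assumption for illustration; kernel code would use the atomic bitmap helpers.

```c
#include <limits.h>
#include <stddef.h>

#define MAX_ROUTES 1024   /* mirrors KVM_MAX_IRQ_ROUTES from the patch */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((MAX_ROUTES + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct pending_bitmap {
    unsigned long words[BITMAP_WORDS];
};

/* Interrupt side: remember that this GSI fired. */
static void mark_pending(struct pending_bitmap *b, unsigned gsi)
{
    b->words[gsi / BITS_PER_WORD] |= 1UL << (gsi % BITS_PER_WORD);
}

static void clear_pending(struct pending_bitmap *b, unsigned gsi)
{
    b->words[gsi / BITS_PER_WORD] &= ~(1UL << (gsi % BITS_PER_WORD));
}

/* Lowest pending GSI, or MAX_ROUTES when nothing is pending. */
static unsigned first_pending(const struct pending_bitmap *b)
{
    for (unsigned gsi = 0; gsi < MAX_ROUTES; gsi++)
        if (b->words[gsi / BITS_PER_WORD] & (1UL << (gsi % BITS_PER_WORD)))
            return gsi;
    return MAX_ROUTES;
}

/* Worker side: clear each pending bit and record its GSI for injection. */
static unsigned drain_pending(struct pending_bitmap *b, unsigned *out, unsigned max_out)
{
    unsigned n = 0, gsi;

    while (n < max_out && (gsi = first_pending(b)) < MAX_ROUTES) {
        clear_pending(b, gsi);
        out[n++] = gsi;
    }
    return n;
}
```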
Re: Current KVM head crashes on startup
On (Wed) Feb 18 2009 [10:19:44], Avi Kivity wrote:
> Amit Shah wrote:
> > > The first version generates too much register pressure for some compilers on i386, leading to compilation failures. The second version
> >
> > Is it still valid? I tried with gcc-4.1.2 and that worked fine with the first version. Should we just use that version instead?
>
> I don't see why it would change, unless you can destroy all copies of the compilers that fail with it.

I'd like to know which compilers fail to compile it -- maintaining specific code can introduce such regressions. qemu too doesn't have a dependency on gcc-3 anymore. Also, softwares do periodically bump up the minimum required versions of their dependencies.

Amit
Re: Current KVM head crashes on startup
Amit Shah wrote:
> > I don't see why it would change, unless you can destroy all copies of the compilers that fail with it.
>
> I'd like to know which compilers fail to compile it

I don't recall, it probably depends on whether frame pointers are used or not as well.

> -- maintaining specific code can introduce such regressions.

That's a problem with assembly. x86 and x86_64 are different instruction sets.

> qemu too doesn't have a dependency on gcc-3 anymore.

We aren't forcing users to use gcc 4.

> Also, softwares do periodically bump up the minimum required versions of their dependencies.

Not for this kind of bug.
Re: [PATCH 2/3] KVM: Add gsi_msg_pending_bitmap for MSI-X
Sheng Yang wrote:
> On Wednesday 18 February 2009 19:00:53 Avi Kivity wrote:
> > Sheng Yang wrote:
> > > We have to handle more than one interrupt with one handler for MSI-X. So we need a bitmap to track the triggered interrupts.
> >
> > Can you explain why?
>
> Or how can we know which interrupt happened? Current we scheduled the work later, and no more irq information available at that time.

We can have a work_struct per interrupt, or we can set a flag in the msix array that the interrupt is pending.
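A minimal userspace sketch of the second suggestion, a pending flag stored in each entry of the msix array. Everything here is hypothetical: the patch's struct msix_entry has no pending field, so the sketch assumes a separate per-entry record for the guest side.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-entry record; not the kernel's struct msix_entry. */
struct guest_msix_sketch {
    unsigned short entry;   /* index in the MSI-X table */
    unsigned gsi;           /* guest GSI routed to this entry */
    bool pending;           /* set when the interrupt fires, cleared by the worker */
};

/* Interrupt side: flag the entry that fired. Returns false for an unknown entry. */
static bool mark_entry_pending(struct guest_msix_sketch *tbl, size_t n,
                               unsigned short entry)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].entry == entry) {
            tbl[i].pending = true;
            return true;
        }
    }
    return false;
}

/* Worker side: clear every set flag and collect the GSIs to inject.
 * gsis must have room for n values. */
static size_t collect_pending(struct guest_msix_sketch *tbl, size_t n,
                              unsigned *gsis)
{
    size_t found = 0;

    for (size_t i = 0; i < n; i++) {
        if (tbl[i].pending) {
            tbl[i].pending = false;
            gsis[found++] = tbl[i].gsi;
        }
    }
    return found;
}
```

Compared with a VM-wide bitmap, this keeps the pending state next to the entry it belongs to, at the cost of a linear scan in the worker.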
Re: [PATCH 2/3] KVM: Add gsi_msg_pending_bitmap for MSI-X
On Wednesday 18 February 2009 19:29:28 Avi Kivity wrote:
> Sheng Yang wrote:
> > On Wednesday 18 February 2009 19:00:53 Avi Kivity wrote:
> > > Sheng Yang wrote:
> > > > We have to handle more than one interrupt with one handler for MSI-X. So we need a bitmap to track the triggered interrupts.
> > >
> > > Can you explain why?
> >
> > Or how can we know which interrupt happened? Current we scheduled the work later, and no more irq information available at that time.
>
> We can have a work_struct per interrupt, or we can set a flag in the msix array that the interrupt is pending.

As far as I know, work_struct itself doesn't carry any data. And the host MSI-X array is of type msix_entry *, which is what pci_enable_msix uses. But modifying the type of the guest msix entries should be OK. I will try.

--
regards
Yang, Sheng
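On the point that a work item carries no data of its own: the usual pattern, which the patch's own work handler already applies to the whole device through container_of, is to embed the work item in a larger structure and recover that structure from the work pointer. A minimal userspace sketch of a per-interrupt variant follows; all type and macro names are hypothetical stand-ins, not kernel definitions.

```c
#include <stddef.h>

/* Stand-in for struct work_struct; it holds no interrupt information itself. */
struct work_sketch {
    int queued;
};

/* Hypothetical per-interrupt record embedding its own work item. */
struct msix_irq_sketch {
    unsigned gsi;
    struct work_sketch work;
};

/* Same idea as the kernel's container_of: step back from member to container. */
#define container_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The work handler receives only the work pointer and recovers the GSI from it. */
static unsigned gsi_from_work(struct work_sketch *work)
{
    struct msix_irq_sketch *irq =
        container_of_sketch(work, struct msix_irq_sketch, work);

    return irq->gsi;
}
```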
Re: copyless virtio net thoughts?
On Wed, Feb 18, 2009 at 10:08:00PM +1030, Rusty Russell wrote: 4) Multiple queues This is Herbert's. Should be fairly simple to add; it was in the back of my mind when we started. Not sure whether the queues should be static or dynamic (imagine direct interguest networking, one queue pair for each other guest), and how xmit queues would be selected by the guest (anything anywhere, or dst mac?). The primary purpose of multiple queues is to maximise CPU utilisation, so the number of queues is simply dependent on the number of CPUs allotted to the guest. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmVHI~} herb...@gondor.apana.org.au Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Current KVM head crashes on startup
On (Wed) Feb 18 2009 [11:26:42], Avi Kivity wrote: Amit Shah wrote: I don't see why it would change, unless you can destroy all copies of the compilers that fail with it. I'd like to know which compilers fail to compile it I don't recall, it probably depends on whether frame pointers are used or not as well. As far as I know, kvm-userspace build arguments have remained the same for quite some time. Also, we still pass the -g flag for userspace compilations. -- maintaining specific code can introduce such regressions. That's a problem with assembly. x86 and x86_64 are different instruction sets. But the code in question isn't different on the two architectures. Just a cpuid call that hasn't changed. qemu too doesn't have a dependency on gcc-3 anymore. We aren't forcing users to use gcc 4. Also, softwares do periodically bump up the minimum required versions of their dependencies. Not for this kind of bug. Just enumerating why just destroying all the copies isn't the only option :-) OK, given a patch to have just one version of the cpuid call, would you be willing to take the risk of finding out which users it breaks for? I'll send patches to revert and restore correct behaviour for 32-bit if that does happen. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] KVM: Enable MSI-X for KVM assigned device
On Wed, Feb 18, 2009 at 10:45:19AM +, Avi Kivity wrote: Sheng Yang wrote: index a2dfbe0..78480d0 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -440,6 +440,9 @@ struct kvm_irq_routing { }; #endif +#if defined(CONFIG_X86) +#define KVM_CAP_DEVICE_MSIX 26 +#endif We switched to a different way of depending on CONFIG_X86, see the other KVM_CAP defines. Thanks to point it out. :) struct kvm_assigned_msix_nr { __u32 assigned_dev_id; __u16 entry_nr; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 4010802..d3acb37 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -280,13 +280,33 @@ static void kvm_free_assigned_irq(struct kvm *kvm, * now, the kvm state is still legal for probably we also have to wait * interrupt_work done. */ -disable_irq_nosync(assigned_dev-host_irq); -cancel_work_sync(assigned_dev-interrupt_work); +if (assigned_dev-irq_requested_type KVM_ASSIGNED_DEV_MSIX) { +int i; +for (i = 0; i assigned_dev-entries_nr; i++) +disable_irq_nosync(assigned_dev- + host_msix_entries[i].vector); + +cancel_work_sync(assigned_dev-interrupt_work); + +for (i = 0; i assigned_dev-entries_nr; i++) +free_irq(assigned_dev-host_msix_entries[i].vector, + (void *)assigned_dev); + +assigned_dev-entries_nr = 0; +kfree(assigned_dev-host_msix_entries); +kfree(assigned_dev-guest_msix_entries); +pci_disable_msix(assigned_dev-dev); +} else { +/* Deal with MSI and INTx */ +disable_irq_nosync(assigned_dev-host_irq); +cancel_work_sync(assigned_dev-interrupt_work); How about always have an array? That will also allow us to deal with INTx where x=B,C,D. Currently for MSI and INTx the array will hold just one active element. So array, or bitmap? I remember I changed it to bitmap accounding to your first comment... OK. I think array is reasonable, but the length is a problem, as I did before. How long would you like? 
-- regards Yang, Sheng
Re: Current KVM head crashes on startup
Amit Shah wrote: I don't recall, it probably depends on whether frame pointers are used or not as well. As far as I know, kvm-userspace build arguments have remained the same for quite some time. Also, we still pass the -g flag for userspace compilations. Some distros change CFLAGS. -- maintaining specific code can introduce such regressions. That's a problem with assembly. x86 and x86_64 are different instruction sets. But the code in question isn't different on the two architectures. Just a cpuid call that hasn't changed. The amount of available registers is different, as is the specification of which registers may be clobbered. OK, given a patch to have just one version of the cpuid call, would you be willing to take the risk of finding out which users it breaks for? I'll send patches to revert and restore correct behaviour for 32-bit if that does happen. This is in upstream qemu so it's not my call but I wouldn't recommend breaking the build when a trivial one liner is possible. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
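The regression behind this thread came from positional operand numbers in the 32-bit inline assembly: adding the count input shifted vec from %1 to %2. Below is a minimal sketch, assuming GCC or Clang, of how named asm operands sidestep that class of bug while still keeping separate i386 and x86_64 paths. The function name cpuid_sketch and its return convention are hypothetical and are not taken from qemu.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the qemu function): runs CPUID for the given
 * leaf/subleaf and stores eax, ebx, ecx, edx into vec[0..3].
 * Returns 1 when built for x86, 0 (with vec zeroed) elsewhere.
 */
static int cpuid_sketch(uint32_t function, uint32_t count, uint32_t vec[4])
{
#if defined(__x86_64__)
    /* Enough registers here: bind the four outputs directly. */
    __asm__ volatile("cpuid"
                     : "=a"(vec[0]), "=b"(vec[1]), "=c"(vec[2]), "=d"(vec[3])
                     : "a"(function), "c"(count)
                     : "cc");
    return 1;
#elif defined(__i386__)
    /*
     * ebx may be reserved on i386 (for example with PIC), so save all
     * registers with pusha/popa and store the results through a pointer.
     * The pointer is referenced by name, so adding or reordering inputs
     * cannot silently change which operand the stores use.
     */
    __asm__ volatile("pusha\n\t"
                     "cpuid\n\t"
                     "mov %%eax, 0(%[out])\n\t"
                     "mov %%ebx, 4(%[out])\n\t"
                     "mov %%ecx, 8(%[out])\n\t"
                     "mov %%edx, 12(%[out])\n\t"
                     "popa"
                     :
                     : "a"(function), "c"(count), [out] "S"(vec)
                     : "memory", "cc");
    return 1;
#else
    /* Not x86: nothing to query, report that via the return value. */
    (void)function;
    (void)count;
    vec[0] = vec[1] = vec[2] = vec[3] = 0;
    return 0;
#endif
}
```

Calling cpuid_sketch(0, 0, vec) on an x86 host should leave the highest supported basic leaf in vec[0]. Whether a single variant would build with every compiler and CFLAGS combination is exactly the open question in the thread; the sketch does not settle it.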
Re: [PATCH 3/3] KVM: Enable MSI-X for KVM assigned device
Sheng Yang wrote: struct kvm_assigned_msix_nr { __u32 assigned_dev_id; __u16 entry_nr; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 4010802..d3acb37 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -280,13 +280,33 @@ static void kvm_free_assigned_irq(struct kvm *kvm, * now, the kvm state is still legal for probably we also have to wait * interrupt_work done. */ - disable_irq_nosync(assigned_dev-host_irq); - cancel_work_sync(assigned_dev-interrupt_work); + if (assigned_dev-irq_requested_type KVM_ASSIGNED_DEV_MSIX) { + int i; + for (i = 0; i assigned_dev-entries_nr; i++) + disable_irq_nosync(assigned_dev- + host_msix_entries[i].vector); + + cancel_work_sync(assigned_dev-interrupt_work); + + for (i = 0; i assigned_dev-entries_nr; i++) + free_irq(assigned_dev-host_msix_entries[i].vector, +(void *)assigned_dev); + + assigned_dev-entries_nr = 0; + kfree(assigned_dev-host_msix_entries); + kfree(assigned_dev-guest_msix_entries); + pci_disable_msix(assigned_dev-dev); + } else { + /* Deal with MSI and INTx */ + disable_irq_nosync(assigned_dev-host_irq); + cancel_work_sync(assigned_dev-interrupt_work); How about always have an array? That will also allow us to deal with INTx where x=B,C,D. Currently for MSI and INTx the array will hold just one active element. So array, or bitmap? I remember I changed it to bitmap accounding to your first comment... Which bitmap? I'm confused. I'm talking about unifying the existing array (assigned_dev-host_msix_entries[]) with -host_irq. Also since we need an array for INTx when a function uses INT[BCD]. So we'll have assigned_dev-host_irqs[], each entry can be INTx or MSI or MSIx. OK. I think array is reasonable, but the length is a problem, as I did before. How long would you like? MAX(4, KVM_MAX_MSIX_ENTRIES), no? -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. 
Re: [PATCH 7/7] [V3] kvm: qemu: fix hot remove assigned device with iommu
Han, Weidong wrote: device assignment hotplug doesn't work on current tree. I had a quick glance, found device assignment didn't be considered, qemu_system_hot_assign_device is not used at all. I will fix it. I noticed that during the merge, but wasn't familiar in the code to fix it myself. Thanks for taking care of it. We should work to merge the code in upstream qemu so that this doesn't happen again. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Recent kvm and vmware server comparisons?
On Tuesday 10 February 2009, Thomas Fjellstrom wrote: I've temporarily got vmware server running on my new server, and intend to migrate over to kvm as soon as possible, if it provides enough incentive (extra performance, features). Currently I'm waiting for full iommu support in the kernel, modules and userspace, and didn't plan to migrate till I had hardware that could do iommu, kvm fully supported iommu + DMA for devices passed through, could also pass through more than one device per guest (I saw hints that the intel iommu implementation can only do one device per guest? please tell me I'm wrong, it seems like an odd design choice to make), and full migration. But if I can get enough performance over vmware server 2 with plain old kvm + virtio, I'd happily migrate. I saw a message late last year comparing the two, but I know how quickly things change in the OSS world, and I also intend to use raw devices (possibly AoE) for guest disks (not qcow or anything like it), and virtio for networking. So has anyone tested the two lately? Got any experiences you'd like to share? I suppose no-one has any? -- Thomas Fjellstrom tfjellst...@shaw.ca -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] KVM: Enable MSI-X for KVM assigned device
On Wednesday 18 February 2009 20:36:10 Avi Kivity wrote: Sheng Yang wrote: struct kvm_assigned_msix_nr { __u32 assigned_dev_id; __u16 entry_nr; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 4010802..d3acb37 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -280,13 +280,33 @@ static void kvm_free_assigned_irq(struct kvm *kvm, * now, the kvm state is still legal for probably we also have to wait * interrupt_work done. */ - disable_irq_nosync(assigned_dev-host_irq); - cancel_work_sync(assigned_dev-interrupt_work); + if (assigned_dev-irq_requested_type KVM_ASSIGNED_DEV_MSIX) { + int i; + for (i = 0; i assigned_dev-entries_nr; i++) + disable_irq_nosync(assigned_dev- +host_msix_entries[i].vector); + + cancel_work_sync(assigned_dev-interrupt_work); + + for (i = 0; i assigned_dev-entries_nr; i++) + free_irq(assigned_dev-host_msix_entries[i].vector, + (void *)assigned_dev); + + assigned_dev-entries_nr = 0; + kfree(assigned_dev-host_msix_entries); + kfree(assigned_dev-guest_msix_entries); + pci_disable_msix(assigned_dev-dev); + } else { + /* Deal with MSI and INTx */ + disable_irq_nosync(assigned_dev-host_irq); + cancel_work_sync(assigned_dev-interrupt_work); How about always have an array? That will also allow us to deal with INTx where x=B,C,D. Currently for MSI and INTx the array will hold just one active element. So array, or bitmap? I remember I changed it to bitmap accounding to your first comment... Which bitmap? I'm confused. I'm talking about unifying the existing array (assigned_dev-host_msix_entries[]) with -host_irq. Also since we need an array for INTx when a function uses INT[BCD]. So we'll have assigned_dev-host_irqs[], each entry can be INTx or MSI or MSIx. OK. I think array is reasonable, but the length is a problem, as I did before. How long would you like? MAX(4, KVM_MAX_MSIX_ENTRIES), no? Oh, yeah, I misunderstood it(wrong context)... Need more adjustment on the type, for host_msix_entries is used with pci_enable_msix. 
So I'd like to put it a bit later. -- regards Yang, Sheng
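The unification proposed in this thread (a single host_irqs[] array whose entries can be INTx, MSI or MSI-X, sized MAX(4, KVM_MAX_MSIX_ENTRIES)) can be sketched in plain C as follows. This is a userspace illustration under stated assumptions: every sketch_ name is hypothetical, the limit of 512 is borrowed from KVM_MAX_MSIX_PER_DEV in the posted series, and the "disable" step only flips a flag instead of calling disable_irq_nosync().

```c
#include <stddef.h>

/* Assumed limit; the posted series defines KVM_MAX_MSIX_PER_DEV as 512. */
#define SKETCH_MAX_MSIX_ENTRIES 512
#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))
/* Room for INTA..INTD or for the largest MSI-X table, whichever is bigger. */
#define SKETCH_MAX_HOST_IRQS SKETCH_MAX(4, SKETCH_MAX_MSIX_ENTRIES)

enum sketch_irq_kind { SKETCH_IRQ_INTX, SKETCH_IRQ_MSI, SKETCH_IRQ_MSIX };

struct sketch_host_irq {
    enum sketch_irq_kind kind;
    int vector;   /* host IRQ number */
    int enabled;  /* stands in for the real IRQ enable state */
};

struct sketch_assigned_dev {
    unsigned int irqs_nr;
    struct sketch_host_irq host_irqs[SKETCH_MAX_HOST_IRQS];
};

/* Append one host interrupt of any kind; returns -1 when the array is full. */
static int sketch_add_irq(struct sketch_assigned_dev *dev,
                          enum sketch_irq_kind kind, int vector)
{
    struct sketch_host_irq *e;

    if (dev->irqs_nr >= SKETCH_MAX_HOST_IRQS)
        return -1;
    e = &dev->host_irqs[dev->irqs_nr++];
    e->kind = kind;
    e->vector = vector;
    e->enabled = 1;
    return 0;
}

/*
 * Teardown needs no MSI-X special case: one loop covers every kind.
 * Returns how many interrupts were disabled.
 */
static unsigned int sketch_disable_all(struct sketch_assigned_dev *dev)
{
    unsigned int i, disabled = 0;

    for (i = 0; i < dev->irqs_nr; i++) {
        if (dev->host_irqs[i].enabled) {
            dev->host_irqs[i].enabled = 0;
            disabled++;
        }
    }
    dev->irqs_nr = 0;
    return disabled;
}
```

With this layout, plain MSI or single-line INTx is just the case irqs_nr == 1. The type adjustment mentioned in the reply remains open: pci_enable_msix() expects an array of struct msix_entry, which this sketch does not model.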
[PATCH 0/3] KVM SoftMMU fixes
Hi Avi, Marcelo,

this small patch series fixes two issues and includes one cleanup I ran
into while hacking on the KVM SoftMMU code. Please consider applying.

Joerg

diffstat:

 arch/x86/kvm/mmu.c  |   10 +++-------
 virt/kvm/kvm_main.c |    6 ++++--
 2 files changed, 7 insertions(+), 9 deletions(-)
[PATCH 2/3] kvm mmu: remove redundant check in mmu_set_spte
The following code flow is unnecessary:

	if (largepage)
		was_rmapped = is_large_pte(*shadow_pte);
	else
		was_rmapped = 1;

The is_large_pte() function will always evaluate to one here because the
(largepage && !is_large_pte) case is already handled in the first
if-clause. So we can remove this check and set was_rmapped to one always
here.

Signed-off-by: Joerg Roedel <joerg.roe...@amd.com>
---
 arch/x86/kvm/mmu.c |    8 ++------
 1 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index ef060ec..c90b4b2 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1791,12 +1791,8 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 			pgprintk("hfn old %lx new %lx\n",
 				 spte_to_pfn(*shadow_pte), pfn);
 			rmap_remove(vcpu->kvm, shadow_pte);
-		} else {
-			if (largepage)
-				was_rmapped = is_large_pte(*shadow_pte);
-			else
-				was_rmapped = 1;
-		}
+		} else
+			was_rmapped = 1;
 	}
 	if (set_spte(vcpu, shadow_pte, pte_access, user_fault, write_fault,
 		      dirty, largepage, global, gfn, pfn, speculative, true)) {
-- 
1.5.6.4
[PATCH 1/3] kvm mmu: handle compound pages in kvm_is_mmio_pfn
The function kvm_is_mmio_pfn is called before put_page is called on a
page by KVM. This is a problem when this function is called on some
struct page which is part of a compound page. It does not test the
reserved flag of the compound page but of the struct page within the
compound page. This is a problem when KVM works with hugepages allocated
at boot time. These pages have the reserved bit set in all tail pages.
Only the flag in the compound head is cleared. KVM would not put such a
page, which results in a memory leak.

Signed-off-by: Joerg Roedel <joerg.roe...@amd.com>
---
 virt/kvm/kvm_main.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 266bdaf..0ed662d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -535,8 +535,10 @@ static inline int valid_vcpu(int n)
 
 inline int kvm_is_mmio_pfn(pfn_t pfn)
 {
-	if (pfn_valid(pfn))
-		return PageReserved(pfn_to_page(pfn));
+	if (pfn_valid(pfn)) {
+		struct page *page = compound_head(pfn_to_page(pfn));
+		return PageReserved(page);
+	}
 
 	return true;
 }
-- 
1.5.6.4
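The effect of testing the compound head instead of the tail page can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct sketch_page and the sketch_ helpers are hypothetical stand-ins for struct page, compound_head() and PageReserved(), and the pfn_valid() check is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct page. */
struct sketch_page {
    bool reserved;             /* models the PG_reserved flag */
    struct sketch_page *head;  /* NULL for a head or non-compound page */
};

/* Models compound_head(): tail pages resolve to their head page. */
static struct sketch_page *sketch_compound_head(struct sketch_page *page)
{
    return page->head ? page->head : page;
}

/* Behaviour before the patch: look at the page itself. */
static bool sketch_is_mmio_old(struct sketch_page *page)
{
    return page->reserved;
}

/* Behaviour after the patch: look at the head of the compound page. */
static bool sketch_is_mmio_new(struct sketch_page *page)
{
    return sketch_compound_head(page)->reserved;
}
```

For a boot-time hugepage, modelled as a head with reserved cleared and a tail with reserved set, the old check reports the tail as MMIO (so the page would never be put), while the new check reports it as ordinary memory.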
[PATCH 3/3] kvm mmu: alloc shadow pages with __GFP_ZERO
Not using __GFP_ZERO when allocating shadow pages triggers the
assertion in the kvm_mmu_alloc_page() when MMU debugging is enabled.

Signed-off-by: Joerg Roedel <joerg.roe...@amd.com>
---
 arch/x86/kvm/mmu.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c90b4b2..d93ecec 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -301,7 +301,7 @@ static int mmu_topup_memory_cache_page(struct kvm_mmu_memory_cache *cache,
 	if (cache->nobjs >= min)
 		return 0;
 	while (cache->nobjs < ARRAY_SIZE(cache->objects)) {
-		page = alloc_page(GFP_KERNEL);
+		page = alloc_page(GFP_KERNEL | __GFP_ZERO);
 		if (!page)
 			return -ENOMEM;
 		set_page_private(page, 0);
-- 
1.5.6.4
Re: [PATCH 3/3] kvm mmu: alloc shadow pages with __GFP_ZERO
Joerg Roedel wrote: Not using __GFP_ZERO when allocating shadow pages triggers the assertion in the kvm_mmu_alloc_page() when MMU debugging is enabled. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/mmu.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index c90b4b2..d93ecec 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -301,7 +301,7 @@ static int mmu_topup_memory_cache_page(struct kvm_mmu_memory_cache *cache, if (cache-nobjs = min) return 0; while (cache-nobjs ARRAY_SIZE(cache-objects)) { - page = alloc_page(GFP_KERNEL); + page = alloc_page(GFP_KERNEL | __GFP_ZERO); if (!page) return -ENOMEM; set_page_private(page, 0); What is the warning? Adding __GFP_ZERO here will cause us to clear the page twice, which is wasteful. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] kvm mmu: alloc shadow pages with __GFP_ZERO
Joerg Roedel wrote: The assertion which the attached patch removes fails sometimes. Removing this assertion is the alternative solution to this problem ;-) From ca45f3a2e45cd7e76ca624bb1098329db8ff83ab Mon Sep 17 00:00:00 2001 From: Joerg Roedel joerg.roe...@amd.com Date: Wed, 18 Feb 2009 14:51:13 +0100 Subject: [PATCH] kvm mmu: remove assertion in kvm_mmu_alloc_page Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/mmu.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index d93ecec..b226973 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -802,7 +802,6 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, set_page_private(virt_to_page(sp-spt), (unsigned long)sp); list_add(sp-link, vcpu-kvm-arch.active_mmu_pages); INIT_LIST_HEAD(sp-oos_link); - ASSERT(is_empty_shadow_page(sp-spt)); bitmap_zero(sp-slot_bitmap, KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS); sp-multimapped = 0; sp-parent_pte = parent_pte; sp-spt is allocated using mmu_memory_cache_alloc(), which zeros the page. How can the assertion fail? -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] kvm mmu: alloc shadow pages with __GFP_ZERO
On Wed, Feb 18, 2009 at 01:47:04PM +, Avi Kivity wrote: Joerg Roedel wrote: Not using __GFP_ZERO when allocating shadow pages triggers the assertion in the kvm_mmu_alloc_page() when MMU debugging is enabled. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/mmu.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index c90b4b2..d93ecec 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -301,7 +301,7 @@ static int mmu_topup_memory_cache_page(struct kvm_mmu_memory_cache *cache, if (cache-nobjs = min) return 0; while (cache-nobjs ARRAY_SIZE(cache-objects)) { -page = alloc_page(GFP_KERNEL); +page = alloc_page(GFP_KERNEL | __GFP_ZERO); if (!page) return -ENOMEM; set_page_private(page, 0); What is the warning? Adding __GFP_ZERO here will cause us to clear the page twice, which is wasteful. The assertion which the attached patch removes fails sometimes. Removing this assertion is the alternative solution to this problem ;-) From ca45f3a2e45cd7e76ca624bb1098329db8ff83ab Mon Sep 17 00:00:00 2001 From: Joerg Roedel joerg.roe...@amd.com Date: Wed, 18 Feb 2009 14:51:13 +0100 Subject: [PATCH] kvm mmu: remove assertion in kvm_mmu_alloc_page Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/mmu.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index d93ecec..b226973 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -802,7 +802,6 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, set_page_private(virt_to_page(sp-spt), (unsigned long)sp); list_add(sp-link, vcpu-kvm-arch.active_mmu_pages); INIT_LIST_HEAD(sp-oos_link); - ASSERT(is_empty_shadow_page(sp-spt)); bitmap_zero(sp-slot_bitmap, KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS); sp-multimapped = 0; sp-parent_pte = parent_pte; -- 1.5.6.4 -- | Advanced Micro Devices GmbH Operating | Karl-Hammerschmidt-Str. 
34, 85609 Dornach bei München System| Research | Geschäftsführer: Jochen Polster, Thomas M. McCoy, Giuliano Meroni Center| Sitz: Dornach, Gemeinde Aschheim, Landkreis München | Registergericht München, HRB Nr. 43632
Re: [PATCH 3/3] kvm mmu: alloc shadow pages with __GFP_ZERO
On Wed, Feb 18, 2009 at 02:03:34PM +, Avi Kivity wrote:
> Joerg Roedel wrote:
>> The assertion which the attached patch removes fails sometimes. Removing
>> this assertion is the alternative solution to this problem ;-)
>>
>> From ca45f3a2e45cd7e76ca624bb1098329db8ff83ab Mon Sep 17 00:00:00 2001
>> From: Joerg Roedel <joerg.roe...@amd.com>
>> Date: Wed, 18 Feb 2009 14:51:13 +0100
>> Subject: [PATCH] kvm mmu: remove assertion in kvm_mmu_alloc_page
>>
>> Signed-off-by: Joerg Roedel <joerg.roe...@amd.com>
>> ---
>>  arch/x86/kvm/mmu.c |    1 -
>>  1 files changed, 0 insertions(+), 1 deletions(-)
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index d93ecec..b226973 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -802,7 +802,6 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
>>  	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
>>  	list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
>>  	INIT_LIST_HEAD(&sp->oos_link);
>> -	ASSERT(is_empty_shadow_page(sp->spt));
>>  	bitmap_zero(sp->slot_bitmap, KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS);
>>  	sp->multimapped = 0;
>>  	sp->parent_pte = parent_pte;
>
> sp->spt is allocated using mmu_memory_cache_alloc(), which zeros the
> page. How can the assertion fail?

In the code I see (current kvm-git) mmu_memory_cache_alloc() does zero
nothing. It takes the page from the preallocated pool and returns it.
The pool itself is filled with mmu_topup_memory_caches() which calls
mmu_topup_memory_cache_page() to fill the mmu_page_cache (from which the
sp->spt page is allocated later). And the mmu_topup_memory_cache_page()
function calls alloc_page() and does not zero the result. This let the
assertion trigger.

Joerg

-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    |
 Research  | Geschäftsführer: Jochen Polster, Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632
Re: [PATCH 3/3] kvm mmu: alloc shadow pages with __GFP_ZERO
Joerg Roedel wrote:
>> sp->spt is allocated using mmu_memory_cache_alloc(), which zeros the
>> page. How can the assertion fail?
>
> In the code I see (current kvm-git) mmu_memory_cache_alloc() does zero
> nothing. It takes the page from the preallocated pool and returns it.
> The pool itself is filled with mmu_topup_memory_caches() which calls
> mmu_topup_memory_cache_page() to fill the mmu_page_cache (from which the
> sp->spt page is allocated later). And the mmu_topup_memory_cache_page()
> function calls alloc_page() and does not zero the result. This let the
> assertion trigger.

Right, I was looking at the 2.6.29 tree. The patch is correct (and the
others look good as well). As usual, I'd like Marcelo to take a look as
well.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
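The point settled in this exchange is that the pool hands pages out untouched, so any zeroing has to happen when the pool is topped up. A minimal userspace sketch of that pattern follows; it assumes malloc() plus an explicit memset() as a stand-in for alloc_page() with or without __GFP_ZERO, uses a 0xAA fill to model stale page contents deterministically, and all sketch_ names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE  4096
#define SKETCH_CACHE_OBJS 4

struct sketch_cache {
    int nobjs;
    void *objects[SKETCH_CACHE_OBJS];
};

/*
 * Fill the pool. zero != 0 models alloc_page(GFP_KERNEL | __GFP_ZERO);
 * zero == 0 models a page that still holds old data (poisoned with 0xAA).
 */
static int sketch_topup(struct sketch_cache *cache, int zero)
{
    while (cache->nobjs < SKETCH_CACHE_OBJS) {
        void *page = malloc(SKETCH_PAGE_SIZE);

        if (!page)
            return -1;
        memset(page, zero ? 0 : 0xAA, SKETCH_PAGE_SIZE);
        cache->objects[cache->nobjs++] = page;
    }
    return 0;
}

/* Pop one page from the pool. Note that nothing is cleared here. */
static void *sketch_cache_alloc(struct sketch_cache *cache)
{
    return cache->nobjs > 0 ? cache->objects[--cache->nobjs] : NULL;
}

/* Models is_empty_shadow_page(): true only if every byte is zero. */
static int sketch_is_empty_page(const void *page)
{
    const unsigned char *p = page;
    size_t i;

    for (i = 0; i < SKETCH_PAGE_SIZE; i++)
        if (p[i])
            return 0;
    return 1;
}

/* Release whatever is still sitting in the pool. */
static void sketch_cache_free(struct sketch_cache *cache)
{
    while (cache->nobjs > 0)
        free(cache->objects[--cache->nobjs]);
}
```

A page taken from a pool filled with zero == 0 fails the emptiness check, which corresponds to the assertion firing; filling with zero != 0 makes the check pass without touching the allocation path.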
[PATCH 0/3 v4] MSI-X enabling
[PATCH 2/4] KVM: Ioctls for init MSI-X entry
Introduce two ioctls, KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY. These two ioctls are used by userspace to specify the guest device's MSI-X entry number and to correlate MSI-X entries with GSIs during the initialization stage. MSI-X should be well initialized before enabling. Changing the MSI-X entry number is not supported for now.

Signed-off-by: Sheng Yang sh...@linux.intel.com --- include/linux/kvm.h | 18 include/linux/kvm_host.h | 10 virt/kvm/kvm_main.c | 104 ++ 3 files changed, 132 insertions(+), 0 deletions(-) diff --git a/include/linux/kvm.h b/include/linux/kvm.h index d742cbf..8e14629 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -475,6 +475,10 @@ struct kvm_irq_routing { #define KVM_ASSIGN_IRQ _IOW(KVMIO, 0x70, \ struct kvm_assigned_irq) #define KVM_REINJECT_CONTROL _IO(KVMIO, 0x71) +#define KVM_ASSIGN_SET_MSIX_NR \ + _IOW(KVMIO, 0x72, struct kvm_assigned_msix_nr) +#define KVM_ASSIGN_SET_MSIX_ENTRY \ + _IOW(KVMIO, 0x73, struct kvm_assigned_msix_entry) /* * ioctls for vcpu fds @@ -595,4 +599,18 @@ struct kvm_assigned_irq { #define KVM_DEV_IRQ_ASSIGN_MSI_ACTION KVM_DEV_IRQ_ASSIGN_ENABLE_MSI #define KVM_DEV_IRQ_ASSIGN_ENABLE_MSI (1 0) +struct kvm_assigned_msix_nr { + __u32 assigned_dev_id; + __u16 entry_nr; + __u16 padding; +}; + +#define KVM_MAX_MSIX_PER_DEV 512 +struct kvm_assigned_msix_entry { + __u32 assigned_dev_id; + __u32 gsi; + __u16 entry; /* The index of entry in the MSI-X table */ + __u16 padding[3]; +}; + #endif diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 7c7096d..33ed9f8 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -319,6 +319,12 @@ struct kvm_irq_ack_notifier { void (*irq_acked)(struct kvm_irq_ack_notifier *kian); }; +struct kvm_guest_msix_entry { + u32 vector; + u16 entry; + u16 flags; +}; + struct kvm_assigned_dev_kernel { struct kvm_irq_ack_notifier ack_notifier; struct work_struct interrupt_work; @@ -326,13 +332,17 @@ struct kvm_assigned_dev_kernel { int assigned_dev_id; int host_busnr; int 
host_devfn; + unsigned int entries_nr; int host_irq; bool host_irq_disabled; + struct msix_entry *host_msix_entries; int guest_irq; + struct kvm_guest_msix_entry *guest_msix_entries; #define KVM_ASSIGNED_DEV_GUEST_INTX(1 0) #define KVM_ASSIGNED_DEV_GUEST_MSI (1 1) #define KVM_ASSIGNED_DEV_HOST_INTX (1 8) #define KVM_ASSIGNED_DEV_HOST_MSI (1 9) +#define KVM_ASSIGNED_DEV_MSIX ((1 2) | (1 10)) unsigned long irq_requested_type; int irq_source_id; int flags; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 266bdaf..b373466 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1593,6 +1593,88 @@ static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset) return 0; } +#ifdef __KVM_HAVE_MSIX +static int kvm_vm_ioctl_set_msix_nr(struct kvm *kvm, + struct kvm_assigned_msix_nr *entry_nr) +{ + int r = 0; + struct kvm_assigned_dev_kernel *adev; + + mutex_lock(kvm-lock); + + adev = kvm_find_assigned_dev(kvm-arch.assigned_dev_head, + entry_nr-assigned_dev_id); + if (!adev) { + r = -EINVAL; + goto msix_nr_out; + } + + if (adev-entries_nr == 0) { + adev-entries_nr = entry_nr-entry_nr; + if (adev-entries_nr == 0 || + adev-entries_nr = KVM_MAX_MSIX_PER_DEV) { + r = -EINVAL; + goto msix_nr_out; + } + + adev-host_msix_entries = kzalloc(sizeof(struct msix_entry) * + entry_nr-entry_nr, + GFP_KERNEL); + if (!adev-host_msix_entries) { + r = -ENOMEM; + goto msix_nr_out; + } + adev-guest_msix_entries = kzalloc( + sizeof(struct kvm_guest_msix_entry) * + entry_nr-entry_nr, GFP_KERNEL); + if (!adev-guest_msix_entries) { + kfree(adev-host_msix_entries); + r = -ENOMEM; + goto msix_nr_out; + } + } else /* Not allowed set MSI-X number twice */ + r = -EINVAL; +msix_nr_out: + mutex_unlock(kvm-lock); + return r; +} + +static int kvm_vm_ioctl_set_msix_entry(struct kvm *kvm, + struct kvm_assigned_msix_entry *entry) +{ + int r = 0, i; + struct kvm_assigned_dev_kernel *adev; + +
[PATCH 3/4] KVM: Add MSI-X interrupt injection logic
We have to handle more than one interrupt with one handler for MSI-X. Avi suggested to use a flag to indicate the pending. So here is it. Signed-off-by: Sheng Yang sh...@linux.intel.com --- include/linux/kvm_host.h |1 + virt/kvm/kvm_main.c | 66 +- 2 files changed, 60 insertions(+), 7 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 33ed9f8..5aad46a 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -319,6 +319,7 @@ struct kvm_irq_ack_notifier { void (*irq_acked)(struct kvm_irq_ack_notifier *kian); }; +#define KVM_ASSIGNED_MSIX_PENDING 0x1 struct kvm_guest_msix_entry { u32 vector; u16 entry; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b373466..1e80b6e 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -95,25 +95,69 @@ static struct kvm_assigned_dev_kernel *kvm_find_assigned_dev(struct list_head *h return NULL; } +static int find_index_from_host_irq(struct kvm_assigned_dev_kernel + *assigned_dev, int irq) +{ + int i, index; + struct msix_entry *host_msix_entries; + + host_msix_entries = assigned_dev-host_msix_entries; + + index = -1; + for (i = 0; i assigned_dev-entries_nr; i++) + if (irq == host_msix_entries[i].vector) { + index = i; + break; + } + if (index 0) { + printk(KERN_WARNING Fail to find correlated MSI-X entry!\n); + return 0; + } + + return index; +} + static void kvm_assigned_dev_interrupt_work_handler(struct work_struct *work) { struct kvm_assigned_dev_kernel *assigned_dev; + struct kvm *kvm; + int irq, i; assigned_dev = container_of(work, struct kvm_assigned_dev_kernel, interrupt_work); + kvm = assigned_dev-kvm; /* This is taken to safely inject irq inside the guest. 
When * the interrupt injection (or the ioapic code) uses a * finer-grained lock, update this */ - mutex_lock(assigned_dev-kvm-lock); - kvm_set_irq(assigned_dev-kvm, assigned_dev-irq_source_id, - assigned_dev-guest_irq, 1); - - if (assigned_dev-irq_requested_type KVM_ASSIGNED_DEV_GUEST_MSI) { - enable_irq(assigned_dev-host_irq); - assigned_dev-host_irq_disabled = false; + mutex_lock(kvm-lock); + if (assigned_dev-irq_requested_type KVM_ASSIGNED_DEV_MSIX) { + struct kvm_guest_msix_entry *guest_entries = + assigned_dev-guest_msix_entries; + for (i = 0; i assigned_dev-entries_nr; i++) { + if (!(guest_entries[i].flags + KVM_ASSIGNED_MSIX_PENDING)) + continue; + guest_entries[i].flags = ~KVM_ASSIGNED_MSIX_PENDING; + kvm_set_irq(assigned_dev-kvm, + assigned_dev-irq_source_id, + guest_entries[i].vector, 1); + irq = assigned_dev-host_msix_entries[i].vector; + if (irq != 0) + enable_irq(irq); + assigned_dev-host_irq_disabled = false; + } + } else { + kvm_set_irq(assigned_dev-kvm, assigned_dev-irq_source_id, + assigned_dev-guest_irq, 1); + if (assigned_dev-irq_requested_type + KVM_ASSIGNED_DEV_GUEST_MSI) { + enable_irq(assigned_dev-host_irq); + assigned_dev-host_irq_disabled = false; + } } + mutex_unlock(assigned_dev-kvm-lock); } @@ -122,6 +166,14 @@ static irqreturn_t kvm_assigned_dev_intr(int irq, void *dev_id) struct kvm_assigned_dev_kernel *assigned_dev = (struct kvm_assigned_dev_kernel *) dev_id; + if (assigned_dev-irq_requested_type == KVM_ASSIGNED_DEV_MSIX) { + int index = find_index_from_host_irq(assigned_dev, irq); + if (index 0) + return IRQ_HANDLED; + assigned_dev-guest_msix_entries[index].flags |= + KVM_ASSIGNED_MSIX_PENDING; + } + schedule_work(assigned_dev-interrupt_work); disable_irq_nosync(irq); -- 1.5.4.5 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
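The flag-based scheme in this patch (the interrupt handler marks the matching entry pending, and the deferred work handler later injects every pending entry and clears its flag) can be sketched in userspace as follows. This is an illustration under stated assumptions: all sketch_ names are hypothetical, injection is modelled by recording guest vectors into an output array, and locking, enable_irq() and schedule_work() are omitted. Unlike the posted helper, the lookup here returns -1 on a miss so that the caller's negative-index check can take effect.

```c
#include <stddef.h>

#define SKETCH_MSIX_PENDING 0x1
#define SKETCH_MAX_ENTRIES  4

struct sketch_msix_dev {
    unsigned int entries_nr;
    int host_vector[SKETCH_MAX_ENTRIES];   /* host IRQ per entry */
    int guest_vector[SKETCH_MAX_ENTRIES];  /* guest GSI per entry */
    unsigned short flags[SKETCH_MAX_ENTRIES];
};

/* Map a host IRQ to its entry index, or -1 if it is not ours. */
static int sketch_find_index(const struct sketch_msix_dev *dev, int irq)
{
    unsigned int i;

    for (i = 0; i < dev->entries_nr; i++)
        if (dev->host_vector[i] == irq)
            return (int)i;
    return -1;
}

/* Interrupt side: only record which entry fired. */
static int sketch_irq_handler(struct sketch_msix_dev *dev, int irq)
{
    int index = sketch_find_index(dev, irq);

    if (index < 0)
        return -1;
    dev->flags[index] |= SKETCH_MSIX_PENDING;
    return 0;
}

/*
 * Deferred side: one handler serves all entries. Every pending entry is
 * cleared and its guest vector written to injected[]; returns the count.
 */
static size_t sketch_work_handler(struct sketch_msix_dev *dev,
                                  int *injected, size_t max)
{
    size_t n = 0;
    unsigned int i;

    for (i = 0; i < dev->entries_nr && n < max; i++) {
        if (!(dev->flags[i] & SKETCH_MSIX_PENDING))
            continue;
        dev->flags[i] &= (unsigned short)~SKETCH_MSIX_PENDING;
        injected[n++] = dev->guest_vector[i];
    }
    return n;
}
```

Because the host and guest tables share the same index, the work handler needs no extra lookup to find the guest vector for a pending host interrupt.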
[PATCH 4/4] KVM: Enable MSI-X for KVM assigned device
This patch finally enables MSI-X.

What we need for MSI-X:

1. Intercept one page in the MMIO region of the device, so that we can get
the guest's desired MSI-X table and set up the real one. This is now done by
the guest and transferred to the kernel using the ioctls KVM_SET_MSIX_NR and
KVM_SET_MSIX_ENTRY.

2. Information about the incoming interrupt. One device can now have more
than one interrupt, and they are all handled by one workqueue structure, so
we need to identify them. The previous patch enables gsi_msg_pending_bitmap
to get this done.

3. A mapping from host IRQ to guest gsi, as well as from guest gsi to the real
MSI/MSI-X message address/data. We use the same entry number for the host and
the guest here, so that it's easy to find the correlated guest gsi.

What we lack for now:

1. The PCI spec says nothing can exist in the same page of the MMIO region as
the MSI-X table, except the pending bits. The patch ignores the pending bits
as a first step (so they are always 0 - nothing pending).

2. The PCI spec allows the MSI-X table to be changed dynamically. That means
the OS can enable MSI-X, then mask one MSI-X entry, modify it, and unmask it.
The patch doesn't support this, and Linux doesn't work this way either.

3. The patch doesn't implement MSI-X mask-all or masking of a single entry. I
will implement the former in drivers/pci/msi.c later. For a single entry,
userspace should be responsible for handling it.

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 arch/x86/include/asm/kvm.h |    1 +
 include/linux/kvm.h        |    8
 virt/kvm/kvm_main.c        |   98 +---
 3 files changed, 101 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
index dc3f6cf..125be8b 100644
--- a/arch/x86/include/asm/kvm.h
+++ b/arch/x86/include/asm/kvm.h
@@ -16,6 +16,7 @@
 #define __KVM_HAVE_MSI
 #define __KVM_HAVE_USER_NMI
 #define __KVM_HAVE_GUEST_DEBUG
+#define __KVM_HAVE_MSIX
 
 /* Architectural interrupt line count. */
 #define KVM_NR_INTERRUPTS 256
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 8e14629..470a43c 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -402,6 +402,9 @@ struct kvm_trace_rec {
 #ifdef __KVM_HAVE_IOAPIC
 #define KVM_CAP_IRQ_ROUTING 25
 #endif
+#ifdef __KVM_HAVE_MSIX
+#define KVM_CAP_DEVICE_MSIX 26
+#endif
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -599,6 +602,11 @@ struct kvm_assigned_irq {
 #define KVM_DEV_IRQ_ASSIGN_MSI_ACTION	KVM_DEV_IRQ_ASSIGN_ENABLE_MSI
 #define KVM_DEV_IRQ_ASSIGN_ENABLE_MSI	(1 << 0)
 
+#define KVM_DEV_IRQ_ASSIGN_MSIX_ACTION	(KVM_DEV_IRQ_ASSIGN_ENABLE_MSIX |\
+					KVM_DEV_IRQ_ASSIGN_MASK_MSIX)
+#define KVM_DEV_IRQ_ASSIGN_ENABLE_MSIX	(1 << 1)
+#define KVM_DEV_IRQ_ASSIGN_MASK_MSIX	(1 << 2)
+
 struct kvm_assigned_msix_nr {
 	__u32 assigned_dev_id;
 	__u16 entry_nr;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1e80b6e..b1f2399 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -236,13 +236,33 @@ static void kvm_free_assigned_irq(struct kvm *kvm,
 	 * now, the kvm state is still legal for probably we also have to wait
 	 * interrupt_work done.
 	 */
-	disable_irq_nosync(assigned_dev->host_irq);
-	cancel_work_sync(&assigned_dev->interrupt_work);
+	if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_MSIX) {
+		int i;
+		for (i = 0; i < assigned_dev->entries_nr; i++)
+			disable_irq_nosync(assigned_dev->
+					   host_msix_entries[i].vector);
+
+		cancel_work_sync(&assigned_dev->interrupt_work);
 
-	free_irq(assigned_dev->host_irq, (void *)assigned_dev);
+		for (i = 0; i < assigned_dev->entries_nr; i++)
+			free_irq(assigned_dev->host_msix_entries[i].vector,
+				 (void *)assigned_dev);
 
-	if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_HOST_MSI)
-		pci_disable_msi(assigned_dev->dev);
+		assigned_dev->entries_nr = 0;
+		kfree(assigned_dev->host_msix_entries);
+		kfree(assigned_dev->guest_msix_entries);
+		pci_disable_msix(assigned_dev->dev);
+	} else {
+		/* Deal with MSI and INTx */
+		disable_irq_nosync(assigned_dev->host_irq);
+		cancel_work_sync(&assigned_dev->interrupt_work);
+
+		free_irq(assigned_dev->host_irq, (void *)assigned_dev);
+
+		if (assigned_dev->irq_requested_type &
+		    KVM_ASSIGNED_DEV_HOST_MSI)
+			pci_disable_msi(assigned_dev->dev);
+	}
 
 	assigned_dev->irq_requested_type = 0;
 }
@@ -373,6 +393,60 @@ static int assigned_device_update_msi(struct kvm *kvm,
 }
 #endif
 
+#ifdef __KVM_HAVE_MSIX
+static int assigned_device_update_msix(struct kvm *kvm,
+			struct kvm_assigned_dev_kernel *adev,
+			struct kvm_assigned_irq *airq)
+{
+
[PATCH 1/4] KVM: Fix wrong usage of _IOR in assigned device interface
_IOR for copy_to_user and _IOW for copy_from_user... Noticed by Avi.

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 include/linux/kvm.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 2163b3d..d742cbf 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -469,10 +469,10 @@ struct kvm_irq_routing {
 			_IOW(KVMIO, 0x67, struct kvm_coalesced_mmio_zone)
 #define KVM_UNREGISTER_COALESCED_MMIO \
 			_IOW(KVMIO, 0x68, struct kvm_coalesced_mmio_zone)
-#define KVM_ASSIGN_PCI_DEVICE _IOR(KVMIO, 0x69, \
+#define KVM_ASSIGN_PCI_DEVICE _IOW(KVMIO, 0x69, \
 				   struct kvm_assigned_pci_dev)
 #define KVM_SET_GSI_ROUTING _IOW(KVMIO, 0x6a, struct kvm_irq_routing)
-#define KVM_ASSIGN_IRQ _IOR(KVMIO, 0x70, \
+#define KVM_ASSIGN_IRQ _IOW(KVMIO, 0x70, \
 			    struct kvm_assigned_irq)
 #define KVM_REINJECT_CONTROL _IO(KVMIO, 0x71)
-- 
1.5.4.5
Re: [PATCH 1/4] KVM: Fix wrong usage of _IOR in assigned device interface
Sheng Yang wrote:
> _IOR for copy_to_user and _IOW for copy_from_user... Noticed by Avi.
>
> Signed-off-by: Sheng Yang sh...@linux.intel.com
> ---
>  include/linux/kvm.h |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> index 2163b3d..d742cbf 100644
> --- a/include/linux/kvm.h
> +++ b/include/linux/kvm.h
> @@ -469,10 +469,10 @@ struct kvm_irq_routing {
>  			_IOW(KVMIO, 0x67, struct kvm_coalesced_mmio_zone)
>  #define KVM_UNREGISTER_COALESCED_MMIO \
>  			_IOW(KVMIO, 0x68, struct kvm_coalesced_mmio_zone)
> -#define KVM_ASSIGN_PCI_DEVICE _IOR(KVMIO, 0x69, \
> +#define KVM_ASSIGN_PCI_DEVICE _IOW(KVMIO, 0x69, \
>  				   struct kvm_assigned_pci_dev)
>  #define KVM_SET_GSI_ROUTING _IOW(KVMIO, 0x6a, struct kvm_irq_routing)
> -#define KVM_ASSIGN_IRQ _IOR(KVMIO, 0x70, \
> +#define KVM_ASSIGN_IRQ _IOW(KVMIO, 0x70, \
>  			    struct kvm_assigned_irq)
>  #define KVM_REINJECT_CONTROL _IO(KVMIO, 0x71)

KVM_ASSIGN_PCI_DEVICE was introduced in 2.6.28. We can't fix it since it's
part of the ABI.

-- 
I have a truly marvellous patch that fixes the bug which this signature is
too narrow to contain.
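To make the direction convention and the ABI concern concrete, here is a small sketch. It assumes a Linux build environment with `<linux/ioctl.h>`; `struct demo_arg`, `DEMO_MAGIC`, `DEMO_SET` and `DEMO_GET` are made up for the illustration and are not KVM definitions. The direction is encoded in the request number itself, so switching a request from _IOR to _IOW yields a different number, which is why a request that has already shipped cannot simply be corrected.

```c
#include <assert.h>
#include <linux/ioctl.h>	/* _IOR, _IOW, _IOC_DIR, _IOC_SIZE, _IOC_READ, _IOC_WRITE */

/* Hypothetical payload and magic number, for illustration only. */
struct demo_arg {
	unsigned int id;
	unsigned int flags;
};

#define DEMO_MAGIC 'd'

/* Userspace writes the argument; the kernel side would use copy_from_user(). */
#define DEMO_SET _IOW(DEMO_MAGIC, 0x01, struct demo_arg)
/* Userspace reads the argument; the kernel side would use copy_to_user(). */
#define DEMO_GET _IOR(DEMO_MAGIC, 0x02, struct demo_arg)

/* Nonzero if the request number says userspace passes data to the kernel. */
static int user_writes(unsigned long cmd)
{
	return (_IOC_DIR(cmd) & _IOC_WRITE) != 0;
}

/* Nonzero if the request number says userspace receives data from the kernel. */
static int user_reads(unsigned long cmd)
{
	return (_IOC_DIR(cmd) & _IOC_READ) != 0;
}
```

With these definitions, `DEMO_SET` and `_IOR(DEMO_MAGIC, 0x01, struct demo_arg)` differ only in the direction bits, yet they are distinct request numbers as far as any already-compiled caller is concerned.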
Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
Is using cpufreq (i.e. with the ondemand governor) on a KVM host safe for
guests?

I enabled cpufreq on the host, and it scaled down the host CPU (Dual-Core AMD
Opteron(tm) Processor 2212) from 2 GHz to 1 GHz.

The guest (using 1 CPU) was still showing that it has a 2 GHz CPU in
/proc/cpuinfo (I guess this value is read only once, when booting).

After about 2 hours I ran date on the guest - it showed that it's year
*1953*, after which I couldn't start any other command (the guest was
technically alive - the SSH connection to it didn't die - but I couldn't do
anything).

# date
Wed Feb 18 13:07:17 CET 2009

[let's wait ~2 hours]

# date
Fri May 15 10:13:14 CET 1953
# date
^C^Z
[could not interrupt]

Is it expected behaviour? Is it correct behaviour?

-- 
Tomasz Chmielewski
http://wpkg.org
Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
Tomasz Chmielewski wrote:
> Is using cpufreq (i.e. with ondemand governor) on KVM host safe for guests?
>
> I enabled cpufreq on the host, it scaled down the host CPU (Dual-Core AMD
> Opteron(tm) Processor 2212) to 1 GHz from 2 GHz.

Not with your processor. Intel processors should be fine, and any AMD
processor that's Barcelona/Phenom or newer.

Regards,

Anthony Liguori

> Guest (using 1 CPU) was still showing that it has a 2 GHz CPU in
> /proc/cpuinfo (I guess this value is read only once, when booting).
>
> After about 2 hours I started date on the guest - it showed that it's year
> *1953*, after which I couldn't start any other command (the guest was
> technically alive - SSH connection to it didn't die - but I couldn't do
> anything).
>
> # date
> Wed Feb 18 13:07:17 CET 2009
>
> [let's wait ~2 hours]
>
> # date
> Fri May 15 10:13:14 CET 1953
> # date
> ^C^Z
> [could not interrupt]
>
> Is it expected behaviour? Is it correct behaviour?
Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
Anthony Liguori wrote:
> Tomasz Chmielewski wrote:
>> Is using cpufreq (i.e. with ondemand governor) on KVM host safe for guests?
>>
>> I enabled cpufreq on the host, it scaled down the host CPU (Dual-Core AMD
>> Opteron(tm) Processor 2212) to 1 GHz from 2 GHz.
>
> Not with your processor. Intel processors should be fine and any AMD
> processor that's Barcelona/Phenom or newer.

Looks like I'm a bad, bad, anti-environment CO2 contributor then.

From a technical perspective, what are the problems with my CPU, given that it
scales down on the host just fine but makes the guests return to the past?

-- 
Tomasz Chmielewski
http://wpkg.org
Re: copyless virtio net thoughts?
On Wednesday 18 February 2009, Rusty Russell wrote: 2) Direct NIC attachment This is particularly interesting with SR-IOV or other multiqueue nics, but for boutique cases or benchmarks, could be for normal NICs. So far I have some very sketched-out patches: for the attached nic dev_alloc_skb() gets an skb from the guest (which supplies them via some kind of AIO interface), and a branch in netif_receive_skb() which returned it to the guest. This bypasses all firewalling in the host though; we're basically having the guest process drive the NIC directly. If this is not passing the PCI device directly to the guest, but uses your concept, wouldn't it still be possible to use the firewalling in the host? You can always inspect the headers, drop the frame, etc without copying the whole frame at any point. When it gets to the point of actually giving the (real pf or sr-iov vf) to one guest, you really get to the point where you can't do local firewalling any more. 3) Direct interguest networking Anthony has been thinking here: vmsplice has already been mentioned. The idea of passing directly from one guest to another is an interesting one: using dma engines might be possible too. Again, host can't firewall this traffic. Simplest as a dedicated internal lan NIC, but we could theoretically do a fast-path for certain MAC addresses on a general guest NIC. Another option would be to use an SR-IOV adapter from multiple guests, with a virtual ethernet bridge in the adapter. This moves the overhead from the CPU to the bus and/or adapter, so it may or may not be a real benefit depending on the workload. Arnd -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
Tomasz Chmielewski wrote:
> Looks I'm a bad, bad, anti-environment CO2 contributor then.
>
> From a technical perspective, what are the problems with my CPU that it
> scales down on the host just fine, but makes the guests return to the past?

What kvm version are you using? kvm-84 should fix this.

-- 
I have a truly marvellous patch that fixes the bug which this signature is
too narrow to contain.
Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
Avi Kivity wrote:
> Tomasz Chmielewski wrote:
>> Looks I'm a bad, bad, anti-environment CO2 contributor then.
>>
>> From a technical perspective, what are the problems with my CPU that it
>> scales down on the host just fine, but makes the guests return to the past?
>
> What kvm version are you using? kvm-84 should fix this.

I still have kvm-83. Thanks for the hint, I'll see how it works.

-- 
Tomasz Chmielewski
http://wpkg.org
Re: Current KVM head crashes on startup
Brian Kress wrote:
> When I try to run KVM built off the current head, it crashes with a
> Segmentation fault. KVM-84 does not. Seems to be dealing with the CPUID
> changes:
>
> 0x081a5c70 in host_cpuid () at /home/kressb/kvm/src/qemu/target-i386/helper.c:1426
> 1426 asm volatile("pusha \n\t"

I've pushed a fix for this.

-- 
I have a truly marvellous patch that fixes the bug which this signature is
too narrow to contain.
Re: [PATCH 3/8] kvm: qemu: fix hot assign device
Acked-by: Marcelo Tosatti mtosa...@redhat.com

On Wed, Feb 18, 2009 at 03:12:31PM +0800, Han, Weidong wrote:
> Last qemu merge broke device assignment hotplug. Call
> qemu_pci_hot_assign_device in pci_device_hot_add to hot assign a device, and
> add the command for it.
>
> For example, to hot assign 01:00.0, use the following command:
> pci_add pci_addr=auto host host=01:00.0
>
> Signed-off-by: Weidong Han weidong@intel.com
> ---
>  qemu/hw/device-hotplug.c |   37 -
>  qemu/hw/pci-hotplug.c    |   35 +++
>  qemu/monitor.c           |    2 +-
>  3 files changed, 36 insertions(+), 38 deletions(-)
Re: [PATCH 7/8] kvm: qemu: deassign device from guest
Weidong,

Does this set fix
http://sourceforge.net/tracker2/?func=detail&aid=2432316&group_id=180599&atid=893831

On Wed, Feb 18, 2009 at 03:13:05PM +0800, Han, Weidong wrote:
> free_assigned_device just frees the device from qemu; it should also deassign
> the device from the guest when the guest exits or the assigned device is hot
> removed.
>
> Acked-by: Mark McLoughlin mar...@redhat.com
> Signed-off-by: Weidong Han weidong@intel.com
> ---
>  qemu/hw/device-assignment.c |   28 ++--
>  qemu/hw/device-assignment.h |    1 +
>  2 files changed, 27 insertions(+), 2 deletions(-)
Re: [PATCH 1/3] kvm mmu: handle compound pages in kvm_is_mmio_pfn
BTW some page bits are erroneously transferred to the struct pages within the
compound page. We've got away with that so far because these bits (such as
dirty and accessed) are not used by the limited hugetlb/hugetlbfs
implementation ATM.

Acked-by: Marcelo Tosatti mtosa...@redhat.com

On Wed, Feb 18, 2009 at 02:08:58PM +0100, Joerg Roedel wrote:
> The function kvm_is_mmio_pfn is called before put_page is called on a
> page by KVM. This is a problem when this function is called on some
> struct page which is part of a compound page. It does not test the
> reserved flag of the compound page but of the struct page within the
> compound page. This is a problem when KVM works with hugepages allocated
> at boot time. These pages have the reserved bit set in all tail pages.
> Only the flag in the compound head is cleared. KVM would not put such a
> page, which results in a memory leak.
>
> Signed-off-by: Joerg Roedel joerg.roe...@amd.com
> ---
>  virt/kvm/kvm_main.c |    6 --
>  1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 266bdaf..0ed662d 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -535,8 +535,10 @@ static inline int valid_vcpu(int n)
>  
>  inline int kvm_is_mmio_pfn(pfn_t pfn)
>  {
> -	if (pfn_valid(pfn))
> -		return PageReserved(pfn_to_page(pfn));
> +	if (pfn_valid(pfn)) {
> +		struct page *page = compound_head(pfn_to_page(pfn));
> +		return PageReserved(page);
> +	}
>  
>  	return true;
>  }
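The effect described in the commit message can be modelled with a toy userspace sketch. Everything here is an assumption made for illustration: `struct toy_page` and the `toy_*` / `is_mmio_*` helpers are hypothetical and only mimic the one property that matters, namely that the tail pages of a boot-time hugepage carry the reserved bit while the head does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page descriptor; NOT the kernel's struct page. */
struct toy_page {
	bool reserved;
	struct toy_page *head;	/* NULL for a head page or a non-compound page */
};

/* Return the head of the compound page p belongs to, or p itself. */
static struct toy_page *toy_compound_head(struct toy_page *p)
{
	return p->head ? p->head : p;
}

/* Behaviour before the patch: test the flag of the page itself. */
static bool is_mmio_naive(struct toy_page *p)
{
	return p->reserved;
}

/* Behaviour after the patch: test the flag of the compound head. */
static bool is_mmio_via_head(struct toy_page *p)
{
	return toy_compound_head(p)->reserved;
}

/* Model a boot-time hugepage: head not reserved, all tail pages reserved. */
static void toy_make_boot_hugepage(struct toy_page *pages, size_t n)
{
	pages[0].reserved = false;
	pages[0].head = NULL;
	for (size_t i = 1; i < n; i++) {
		pages[i].reserved = true;
		pages[i].head = &pages[0];
	}
}
```

For any tail page of such a compound page, the naive check reports "MMIO" (so the page would never be put), while the head-based check reports ordinary memory.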
Re: [PATCH 2/3] kvm mmu: remove redundant check in mmu_set_spte
The following code flow is unnecessary:

	if (largepage)
		was_rmapped = is_large_pte(*shadow_pte);
	else
		was_rmapped = 1;

The is_large_pte() function will always evaluate to one here because the
(largepage && !is_large_pte) case is already handled in the first
if-clause. So we can remove this check and always set was_rmapped to one
here.

Signed-off-by: Joerg Roedel joerg.roe...@amd.com
Acked-by: Marcelo Tosatti mtosa...@redhat.com
---
 arch/x86/kvm/mmu.c |    8 ++--
 1 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index ef060ec..c90b4b2 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1791,12 +1791,8 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 			pgprintk("hfn old %lx new %lx\n",
 				 spte_to_pfn(*shadow_pte), pfn);
 			rmap_remove(vcpu->kvm, shadow_pte);
-		} else {
-			if (largepage)
-				was_rmapped = is_large_pte(*shadow_pte);
-			else
-				was_rmapped = 1;
-		}
+		} else
+			was_rmapped = 1;
 	}
 	if (set_spte(vcpu, shadow_pte, pte_access, user_fault, write_fault,
 		     dirty, largepage, global, gfn, pfn, speculative, true)) {
Re: [PATCH 3/3] kvm mmu: alloc shadow pages with __GFP_ZERO
On Wed, Feb 18, 2009 at 02:54:37PM +0100, Joerg Roedel wrote:

Adding __GFP_ZERO here will cause us to clear the page twice, which is
wasteful.

The assertion which the attached patch removes fails sometimes. Removing
this assertion is the alternative solution to this problem ;-)

From: Joerg Roedel joerg.roe...@amd.com
Date: Wed, 18 Feb 2009 14:51:13 +0100
Subject: [PATCH] kvm mmu: remove assertion in kvm_mmu_alloc_page

Signed-off-by: Joerg Roedel joerg.roe...@amd.com
Acked-by: Marcelo Tosatti mtosa...@redhat.com
---
 arch/x86/kvm/mmu.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index d93ecec..b226973 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -802,7 +802,6 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
 	list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
 	INIT_LIST_HEAD(&sp->oos_link);
-	ASSERT(is_empty_shadow_page(sp->spt));
 	bitmap_zero(sp->slot_bitmap, KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS);
 	sp->multimapped = 0;
 	sp->parent_pte = parent_pte;
Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
On Wed, Feb 18, 2009 at 03:51:22PM +0100, Tomasz Chmielewski wrote: Is using cpufreq (i.e. with ondemand governor) on KVM host safe for guests? I enabled cpufreq on the host, it scaled down the host CPU (Dual-Core AMD Opteron(tm) Processor 2212) to 1 GHz from 2 GHz. Guest (using 1 CPU) was still showing that it has a 2 GHz CPU in /proc/cpuinfo (I guess this value is read only once, when booting). After about 2 hours I started date on the guest - it showed that it's year *1953*, after which I couldn't start any other command (the guest was technically alive - SSH connection to it didn't die - but I couldn't do anything). # date Wed Feb 18 13:07:17 CET 2009 [let's wait ~2 hours] # date Fri May 15 10:13:14 CET 1953 # date ^C^Z [could not interrupt] Is it expected behaviour? Is it correct behaviour? Whats the output of /proc/cpuinfo on the host? Does it contain the constant_tsc flag? Whats the output of /sys/devices/system/clocksource/clocksource0/current_clocksource on the guest? -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
Marcelo Tosatti schrieb: On Wed, Feb 18, 2009 at 03:51:22PM +0100, Tomasz Chmielewski wrote: Is using cpufreq (i.e. with ondemand governor) on KVM host safe for guests? I enabled cpufreq on the host, it scaled down the host CPU (Dual-Core AMD Opteron(tm) Processor 2212) to 1 GHz from 2 GHz. Guest (using 1 CPU) was still showing that it has a 2 GHz CPU in /proc/cpuinfo (I guess this value is read only once, when booting). After about 2 hours I started date on the guest - it showed that it's year *1953*, after which I couldn't start any other command (the guest was technically alive - SSH connection to it didn't die - but I couldn't do anything). # date Wed Feb 18 13:07:17 CET 2009 [let's wait ~2 hours] # date Fri May 15 10:13:14 CET 1953 # date ^C^Z [could not interrupt] Is it expected behaviour? Is it correct behaviour? Whats the output of /proc/cpuinfo on the host? Does it contain the constant_tsc flag? It doesn't contain this flag. /proc/cpuinfo output - below. Whats the output of /sys/devices/system/clocksource/clocksource0/current_clocksource on the guest? 
# cat /sys/devices/system/clocksource/clocksource0/* hpet acpi_pm jiffies tsc - available hpet - current # cat /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 65 model name : Dual-Core AMD Opteron(tm) Processor 2212 stepping: 2 cpu MHz : 2000.000 cache size : 1024 KB physical id : 0 siblings: 2 core id : 0 cpu cores : 2 fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy bogomips: 3993.20 TLB size: 1024 4K pages clflush size: 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp tm stc processor : 1 vendor_id : AuthenticAMD cpu family : 15 model : 65 model name : Dual-Core AMD Opteron(tm) Processor 2212 stepping: 2 cpu MHz : 2000.000 cache size : 1024 KB physical id : 0 siblings: 2 core id : 1 cpu cores : 2 fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy bogomips: 3993.20 TLB size: 1024 4K pages clflush size: 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual
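The question about the constant_tsc flag comes down to a whole-word lookup in the flags line. A minimal sketch of that check follows; `has_cpu_flag()` is a hypothetical helper written for this illustration and is not part of any kernel or libc API. The whole-word match matters because a plain substring search for "tsc" would also hit "rdtscp" and "constant_tsc".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if `flag` appears as a whole whitespace-separated word in the
 * `flags` string (the value part of a /proc/cpuinfo "flags" line). */
static bool has_cpu_flag(const char *flags, const char *flag)
{
	size_t len = strlen(flag);
	const char *p = flags;

	if (len == 0)
		return false;

	while ((p = strstr(p, flag)) != NULL) {
		bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
		bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

		if (starts && ends)
			return true;
		p += len;	/* skip this partial match and keep searching */
	}
	return false;
}
```

Applied to the flags line posted above, a lookup for "tsc" succeeds and a lookup for "constant_tsc" fails, which matches the answer given in the message.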
Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
On Wed, Feb 18, 2009 at 07:53:11PM +0100, Tomasz Chmielewski wrote: processor : 2 vendor_id : AuthenticAMD cpu family : 15 model : 65 model name : Dual-Core AMD Opteron(tm) Processor 2212 stepping: 2 cpu MHz : 2000.000cache size : 1024 KB physical id : 1 siblings: 2 core id : 0 cpu cores : 2 fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy bogomips: 3993.20 TLB size: 1024 4K pages clflush size: 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp tm stc kvm-84 as mentioned. Sorry. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
Marcelo Tosatti schrieb: flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy bogomips: 3993.20 TLB size: 1024 4K pages clflush size: 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp tm stc kvm-84 as mentioned. Sorry. It's OK as long as it will work. - will Windows guests work? - what CPU frequency will the guests show? Current host frequency? Host frequency from the moment the guest booted (i.e. right now the guest will show 1GHz even if the host is running at 2GHz, or the way around)? -- Tomasz Chmielewski http://wpkg.org -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
On Wed, Feb 18, 2009 at 08:07:48PM +0100, Tomasz Chmielewski wrote: Marcelo Tosatti schrieb: flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy bogomips: 3993.20 TLB size: 1024 4K pages clflush size: 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp tm stc kvm-84 as mentioned. Sorry. It's OK as long as it will work. - will Windows guests work? They should. If they don't, please report. - what CPU frequency will the guests show? Current host frequency? Host frequency from the moment the guest booted (i.e. right now the guest will show 1GHz even if the host is running at 2GHz, or the way around)? Host frequency from the moment the guest booted, since the guest does not receive frequency change notifications. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
Marcelo Tosatti schrieb: - what CPU frequency will the guests show? Current host frequency? Host frequency from the moment the guest booted (i.e. right now the guest will show 1GHz even if the host is running at 2GHz, or the way around)? Host frequency from the moment the guest booted, since the guest does not receive frequency change notifications. Is it possible (or is it planned) to pass frequency to the guest (the one which is displayed in /proc/cpuinfo)? Someone may feel disappointed to see his/her brand new virtual guest has a CPU with so few MHz advertised in /proc/cpuinfo. -- Tomasz Chmielewski http://wpkg.org -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
On Wed, Feb 18, 2009 at 08:18:50PM +0100, Tomasz Chmielewski wrote: Marcelo Tosatti schrieb: - what CPU frequency will the guests show? Current host frequency? Host frequency from the moment the guest booted (i.e. right now the guest will show 1GHz even if the host is running at 2GHz, or the way around)? Host frequency from the moment the guest booted, since the guest does not receive frequency change notifications. Is it possible (or is it planned) to pass frequency to the guest (the one which is displayed in /proc/cpuinfo)? Possible, not planned AFAIK. Someone may feel disappointed to see his/her brand new virtual guest has a CPU with so few MHz advertised in /proc/cpuinfo. Thats a point. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
Marcelo Tosatti schrieb: On Wed, Feb 18, 2009 at 08:18:50PM +0100, Tomasz Chmielewski wrote: Marcelo Tosatti schrieb: - what CPU frequency will the guests show? Current host frequency? Host frequency from the moment the guest booted (i.e. right now the guest will show 1GHz even if the host is running at 2GHz, or the way around)? Host frequency from the moment the guest booted, since the guest does not receive frequency change notifications. Is it possible (or is it planned) to pass frequency to the guest (the one which is displayed in /proc/cpuinfo)? Possible, not planned AFAIK. Possible, right now? How? -- Tomasz Chmielewski http://wpkg.org -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
On Wed, Feb 18, 2009 at 09:02:31PM +0100, Tomasz Chmielewski wrote: Marcelo Tosatti schrieb: On Wed, Feb 18, 2009 at 08:18:50PM +0100, Tomasz Chmielewski wrote: Marcelo Tosatti schrieb: - what CPU frequency will the guests show? Current host frequency? Host frequency from the moment the guest booted (i.e. right now the guest will show 1GHz even if the host is running at 2GHz, or the way around)? Host frequency from the moment the guest booted, since the guest does not receive frequency change notifications. Is it possible (or is it planned) to pass frequency to the guest (the one which is displayed in /proc/cpuinfo)? Possible, not planned AFAIK. Possible, right now? How? Write a paravirt notification scheme. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)
Avi Kivity wrote:
> Tomasz Chmielewski wrote:
>> Looks I'm a bad, bad, anti-environment CO2 contributor then.
>>
>> From a technical perspective, what are the problems with my CPU that it
>> scales down on the host just fine, but makes the guests return to the past?
>
> What kvm version are you using? kvm-84 should fix this.

Are you suggesting that one should use cpufreq on a CPU without a constant
tsc? Isn't this just asking for trouble?

Regards,

Anthony Liguori
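As a rough illustration of why a non-constant TSC is a concern, the arithmetic below assumes a guest that converts cycle counts to time using a frequency calibrated once at boot, on a CPU whose counter rate follows the core clock. The helper names are hypothetical, and this sketch shows only the drift; it does not attempt to explain the jump to 1953 reported in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a cycle count to milliseconds, assuming the counter ticks at
 * `khz` kHz (that is, `khz` cycles per millisecond). */
static uint64_t cycles_to_ms(uint64_t cycles, uint64_t khz)
{
	return cycles / khz;
}

/* Time a guest believes has passed if the counter really ran at actual_khz
 * for real_ms milliseconds while the guest still divides by calib_khz. */
static uint64_t guest_view_ms(uint64_t real_ms, uint64_t actual_khz,
			      uint64_t calib_khz)
{
	uint64_t cycles = real_ms * actual_khz;

	return cycles_to_ms(cycles, calib_khz);
}
```

Under these assumptions, a guest calibrated at 2 GHz on a host now running at 1 GHz would account only one hour for every two hours of real time.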
[ kvm-Bugs-2556746 ] FreeBSD/PC-BSD text screen corruption
Bugs item #2556746, was opened at 2009-02-02 13:19 Message generated for change (Comment added) made by aurel32 You can respond by visiting: https://sourceforge.net/tracker/?func=detailatid=893831aid=2556746group_id=180599 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: intel Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Tim Knowles (knowlet) Assigned to: Nobody/Anonymous (nobody) Summary: FreeBSD/PC-BSD text screen corruption Initial Comment: Using either kvm-83, kvm-82 or kvm-81 I am unable to install FreeBSD or PC BSD due to screen corruption (screenshot attached). The initial boot menu is shown and is legible. Once you have selected the boot option the boot process continues the screen becomes corrupted. I initially discovered the problem when setting up an LVM backed guest in virt-manager but I have attached a minimal cmd line below that allows you to trigger it. 1) It would appear that this problem was introduced in kvm-81 (kvm-80 does not exhibit the problem with FBSD or PCBSD but I have not tested any other versions of kvm) 2) If I use the -no-kvm switch with KVM-83 this problem does not occur. Details: Host: 1 x Intel Core i7 920, Fedora 10 64bit. 6GB memory (Dell Studio XPS 435) kvm-83: self compiled - gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) cmd line: /usr/local/bin/qemu-system-x86_64 -m 512 -cdrom 7.1-RELEASE-amd64-dvd1.iso Guests: FreeBSD 7,1 PC-BSD 7.0.2 PS: I'd also like to add my thanks for creating KVM, it's fabulous tool. Many thanks -- Comment By: Aurelien Jarno (aurel32) Date: 2009-02-18 22:38 Message: This is fixed in revision 6628 of QEMU, so probably soon in KVM. Any workaround to this bug as suggested ahead is a bad idea, as the screen is probably not the only affected by this bug. This means that some data can be corrupted. 
--

Comment By: Radek Hladik (kedarius)
Date: 2009-02-04 20:06

Message:
Confirming the problem too.
kvm-83-2.fc11.x86_64
libvirt-0.6.0-1.fc11.x86_64
virt-manager-0.6.1-1.fc11.x86_64
qemu-0.9.1-12.fc11.x86_64

For libvirt and virt-manager users, here is how to use the workaround mentioned by toxxic: press 6 in the boot menu, type set console=comconsole, use view-serial consoles and type boot (choose xterm as term type).

--

Comment By: Jeff (toxxic)
Date: 2009-02-04 08:57

Message:
I can confirm this happens when using VNC for the console.

Here's a workaround:

Start kvm with a -serial flag. You're going to use it as a serial console.

qemu-system-x86_64 -serial telnet::2226,server,nowait -cdrom 7.1-RELEASE-amd64-disc1.iso [...]

Then connect to port 2226:

telnet localhost 2226

Then, when you boot the FreeBSD CD and the (legible) boot loader comes up, choose 6. Escape to loader prompt.

At the OK prompt, type:

set console=comconsole

The OK prompt will now appear in your telnet session. Type boot and hit return. Continue with a legible FreeBSD install via your telnet session.

You may want to set up a serial console on the FreeBSD system that you installed, as well.

--

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2556746&group_id=180599
Re: [PATCH 01/02] ia64: Move the macro definitions related to MSI to one header file.
Looks good, should go through Tony's tree I believe.

On Wed, Feb 18, 2009 at 10:17:56AM +0800, Zhang, Xiantao wrote:
Thanks, Tony! It should not break anything since there are no changes to the code logic. :) Avi, could you help to commit the patches with Tony's Ack? Thanks! Xiantao

Luck, Tony wrote:
For supporting kvm's MSI, we have to move some macros out of ia64_msi.c to avoid duplicating them. In addition, to keep them consistent with x86's, I also changed some macros' names. What do you think of the patch? If you agree to the changes, could you add your Signed-off-by to the patch, and Avi may check it in to kvm.git first to fix an emergent build issue for kvm/ia64. Thanks!

Looks OK to me (I didn't test it, or even build it ... so I hope you did!).

Acked-by: Tony Luck tony.l...@intel.com
Re: Recent kvm and vmware server comparisons?
On Wednesday 18 February 2009, Martin Maurer wrote:
I suppose no-one has any?

VMware includes in its EULA (End User License Agreement) a prohibition for any licensee to publish benchmark results without VMware's approval. (see https://www.vmware.com/tryvmware/eula.php) Maybe this is a reason why all published VMware benchmarks look quite similar :-) I would love to see a comparison, but due to these restrictions it's hard to get independent results. Br, Martin

I hardly think it stops people from casually talking about their day-to-day experiences with vmware and how kvm matches up to it. And even if it did, it doesn't sound like something that's actually legally binding. Otherwise I can start putting things like YOU MUST NEVER TALK AGAIN in my EULAs.

-- Thomas Fjellstrom tfjellst...@shaw.ca
Re: copyless virtio net thoughts?
On Wed, Feb 18, 2009 at 10:08:00PM +1030, Rusty Russell wrote:
2) Direct NIC attachment
This is particularly interesting with SR-IOV or other multiqueue nics, but for boutique cases or benchmarks, could be for normal NICs. So far I have some very sketched-out patches: for the attached nic dev_alloc_skb() gets an skb from the guest (which supplies them via some kind of AIO interface), and a branch in netif_receive_skb() which returned it to the guest. This bypasses all firewalling in the host though; we're basically having the guest process drive the NIC directly.

Hi Rusty,

Can I clarify that the idea with utilising SR-IOV would be to assign virtual functions to guests? That is, something conceptually similar to PCI pass-through in Xen (although I'm not sure that anyone has virtual function pass-through working yet). If so, wouldn't this also be useful on machines that have multiple NICs?

-- Simon Horman VA Linux Systems Japan K.K., Sydney, Australia Satellite Office H: www.vergenet.net/~horms/ W: www.valinux.co.jp/en
RE: copyless virtio net thoughts?
Simon Horman wrote:
On Wed, Feb 18, 2009 at 10:08:00PM +1030, Rusty Russell wrote:
2) Direct NIC attachment
This is particularly interesting with SR-IOV or other multiqueue nics, but for boutique cases or benchmarks, could be for normal NICs. So far I have some very sketched-out patches: for the attached nic dev_alloc_skb() gets an skb from the guest (which supplies them via some kind of AIO interface), and a branch in netif_receive_skb() which returned it to the guest. This bypasses all firewalling in the host though; we're basically having the guest process drive the NIC directly.

Hi Rusty,

Can I clarify that the idea with utilising SR-IOV would be to assign virtual functions to guests? That is, something conceptually similar to PCI pass-through in Xen (although I'm not sure that anyone has virtual function pass-through working yet). If so, wouldn't this also be useful on machines that have multiple NICs?

Yes, and we have successfully got it running with a VF assigned to the guest in both Xen and KVM, but we are still working on pushing those patches out, since this needs Linux PCI subsystem support and driver support.

Thx, eddie
Re: Running KVM on a Laptop
On Wed, 2009-02-18 at 08:45 +0100, Louis-David Mitterrand wrote:
Is it not as simple as checking for the svm or vt flags?

No, one must also check that the bios allows enabling virtualization support. My sony laptop has the right processor but no bios option. Check that first!

Louis-David, I just noticed your comment in passing and thought I'd let you (and others with a Sony Vaio) know that it is possible to enable the VT option in NVRAM even though the BIOS set-up menu doesn't support it.

I did it with this Vaio VGN-FE41Z with T7200 CPU back in mid 2007 and have not had to redo it since. I sometimes run several instances of KVM on it.

The Phoenix BIOS does support storing the setting in NVRAM and setting the VT-enable bits using MSR 0x3A at boot time.

I was hoping to create a Linux tool to make the NVRAM change, but because:

* each BIOS version uses a different token number to store the VT-enable BIOS setting
* to identify the token number you have to examine the BIOS executable code
* currently the only 'safe' way to set the token in NVRAM is to use the DOS symcmos.exe utility (from Phoenix)

a one-shot solution proved impractical, so it is a case of doing it on a per-BIOS-version basis.

If you want to email me off-list with the precise Sony model number and BIOS revision, I should be able to help you enable the VT bit.

For some highly technical background see:
http://tjworld.net/wiki/Sony/Vaio/FE41Z/HackingBiosNvram
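
For reference, MSR 0x3A is IA32_FEATURE_CONTROL on Intel CPUs, and the VT-enable bits referred to above are its low bits. The following is only a minimal C sketch of how a raw value of that register could be interpreted. The enum, macro and function names are made up for illustration, and obtaining the raw value (for example through the Linux msr driver at /dev/cpu/0/msr, as root) is left out and is an assumption about the host setup.

```c
#include <stdint.h>

/* Bit layout of IA32_FEATURE_CONTROL (MSR 0x3A). */
#define FEATURE_CONTROL_LOCKED          (1ULL << 0) /* register is write-locked until reset */
#define FEATURE_CONTROL_VMX_INSIDE_SMX  (1ULL << 1) /* VMX allowed inside SMX operation */
#define FEATURE_CONTROL_VMX_OUTSIDE_SMX (1ULL << 2) /* VMX allowed outside SMX (the usual KVM case) */

/* Hypothetical classification used only in this sketch. */
enum vt_state {
    VT_ENABLED,          /* locked with VMX enabled: KVM can use VT */
    VT_DISABLED_LOCKED,  /* locked with VMX disabled: firmware has turned VT off */
    VT_UNLOCKED          /* not locked yet: the OS may still set the enable bit itself */
};

/* Classify a raw IA32_FEATURE_CONTROL value. */
static enum vt_state vt_state_from_msr(uint64_t msr)
{
    if (!(msr & FEATURE_CONTROL_LOCKED))
        return VT_UNLOCKED;
    if (msr & FEATURE_CONTROL_VMX_OUTSIDE_SMX)
        return VT_ENABLED;
    return VT_DISABLED_LOCKED;
}
```

With this reading, a value of 0x5 (lock bit plus VMX outside SMX) classifies as VT_ENABLED, while 0x1 classifies as VT_DISABLED_LOCKED, which is presumably the state a BIOS without a VT option leaves behind.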
RE: [PATCH 7/8] kvm: qemu: deassign device from guest
Marcelo Tosatti wrote:
Weidong,

Does this set fix
http://sourceforge.net/tracker2/?func=detail&aid=2432316&group_id=180599&atid=893831

I found the above bug was already gone even without my patch. I guess it's fixed by Mark:

commit: 02874f4272b6787ff94ee7256ef083257b9d1eb1
Author: Mark McLoughlin mar...@redhat.com
Date: Fri Nov 28 17:10:47 2008 +

kvm: qemu: device-assignment: free device if hotplug fails

Signed-off-by: Mark McLoughlin mar...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

Actually, my patch just moves free_assigned_device into init_assigned_device, no functional change. But I updated the patch to also call free_assigned_device when pci_register_device fails in init_assigned_device, because adev is allocated by qemu_mallocz in add_assigned_device.

From ce48b0d6c636d8f49bc5977d1d144fa047273846 Mon Sep 17 00:00:00 2001
From: Weidong Han weidong@intel.com
Date: Thu, 19 Feb 2009 10:49:30 +0800
Subject: [PATCH] kvm: qemu: free device on error in init_assigned_device

Make init_assigned_device call free_assigned_device on error, and then make free_assigned_device static because it's only invoked in device-assignment.c.

Acked-by: Mark McLoughlin mar...@redhat.com
Signed-off-by: Weidong Han weidong@intel.com
---
 qemu/hw/device-assignment.c | 14 +-
 qemu/hw/device-assignment.h |  1 -
 qemu/hw/pci-hotplug.c       |  1 -
 3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index e6d2352..0b96ee4 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -443,7 +443,7 @@ again:
 
 static LIST_HEAD(, AssignedDevInfo) adev_head;
 
-void free_assigned_device(AssignedDevInfo *adev)
+static void free_assigned_device(AssignedDevInfo *adev)
 {
     AssignedDevice *dev = adev->assigned_dev;
 
@@ -550,7 +550,7 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus)
     if (NULL == dev) {
         fprintf(stderr, "%s: Error: Couldn't register real device %s\n",
                 __func__, adev->name);
-        return NULL;
+        goto out;
     }
 
     adev->assigned_dev = dev;
@@ -558,14 +558,14 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus)
     if (get_real_device(dev, adev->bus, adev->dev, adev->func)) {
         fprintf(stderr, "%s: Error: Couldn't get real device (%s)!\n",
                 __func__, adev->name);
-        return NULL;
+        goto out;
     }
 
     /* handle real device's MMIO/PIO BARs */
     if (assigned_dev_register_regions(dev->real_device.regions,
                                       dev->real_device.region_number,
                                       dev))
-        return NULL;
+        goto out;
 
     /* handle interrupt routing */
     e_device = (dev->dev.devfn >> 3) & 0x1f;
@@ -595,10 +595,14 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus)
     if (r < 0) {
         fprintf(stderr, "Failed to assign device \"%s\" : %s\n",
                 adev->name, strerror(-r));
-        return NULL;
+        goto out;
     }
 
     return &dev->dev;
+
+out:
+    free_assigned_device(adev);
+    return NULL;
 }
 
 /*
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index f216bb0..6a9b9fa 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -94,7 +94,6 @@ struct AssignedDevInfo {
     int disable_iommu;
 };
 
-void free_assigned_device(AssignedDevInfo *adev);
 PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus);
 AssignedDevInfo *add_assigned_device(const char *arg);
 void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices);
diff --git a/qemu/hw/pci-hotplug.c b/qemu/hw/pci-hotplug.c
index 8c76453..65fafd1 100644
--- a/qemu/hw/pci-hotplug.c
+++ b/qemu/hw/pci-hotplug.c
@@ -143,7 +143,6 @@ static PCIDevice *qemu_pci_hot_assign_device(PCIBus *pci_bus, const char *opts)
     ret = init_assigned_device(adev, pci_bus);
     if (ret == NULL) {
         term_printf("Failed to assign device\n");
-        free_assigned_device(adev);
         return NULL;
     }
 
--
1.6.0.4

On Wed, Feb 18, 2009 at 03:13:05PM +0800, Han, Weidong wrote:
free_assigned_device just frees the device from qemu; it should also deassign the device from the guest when the guest exits or an assigned device is hot-removed.

Acked-by: Mark McLoughlin mar...@redhat.com
Signed-off-by: Weidong Han weidong@intel.com
---
 qemu/hw/device-assignment.c | 28 ++--
 qemu/hw/device-assignment.h |  1 +
 2 files changed, 27 insertions(+), 2 deletions(-)

0003-kvm-qemu-free-device-on-error-in-init_assigned_dev-v2.patch
Description: 0003-kvm-qemu-free-device-on-error-in-init_assigned_dev-v2.patch
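
The error-handling shape of the patch above (every failure path jumps to one label that frees the device, so callers no longer have to) can be illustrated outside QEMU. What follows is only a sketch under stated assumptions: struct dev_info, add_dev, free_dev, init_dev, the fail_at parameter and the live_devs counter are all hypothetical stand-ins invented for this example, not QEMU code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for AssignedDevInfo; not the real QEMU type. */
struct dev_info {
    int *regions;   /* a resource acquired part-way through init */
};

/* Number of dev_info objects currently allocated (illustration only). */
static int live_devs;

/* Plays the role of add_assigned_device(): the caller-side allocation. */
static struct dev_info *add_dev(void)
{
    struct dev_info *info = calloc(1, sizeof(*info));
    if (info)
        live_devs++;
    return info;
}

/* Plays the role of free_assigned_device(): safe on half-initialised objects. */
static void free_dev(struct dev_info *info)
{
    if (!info)
        return;
    free(info->regions);   /* free(NULL) is a no-op */
    free(info);
    live_devs--;
}

/*
 * Plays the role of init_assigned_device(). fail_at picks which step
 * fails (0 = none) so that each error path can be exercised.
 * Returns info on success; on failure it frees info and returns NULL.
 */
static struct dev_info *init_dev(struct dev_info *info, int fail_at)
{
    if (!info)
        return NULL;

    if (fail_at == 1) {                       /* e.g. registration failed */
        fprintf(stderr, "%s: step 1 failed\n", __func__);
        goto out;
    }

    info->regions = calloc(4, sizeof(*info->regions));
    if (!info->regions || fail_at == 2) {     /* e.g. region setup failed */
        fprintf(stderr, "%s: step 2 failed\n", __func__);
        goto out;
    }

    return info;

out:
    /* Single cleanup point shared by every error path. */
    free_dev(info);
    return NULL;
}
```

A caller in this sketch would write init_dev(add_dev(), 0) and only needs to check for NULL; it must not free the object itself after a failure, which mirrors the removal of the free_assigned_device() call from pci-hotplug.c.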
Re: With -vnc option, can I still use ctrl+alt + n?
Neo Jia wrote:
hi, I am trying kvm-84 and with the -vnc option I can't use the ctrl + alt + n key to get the qemu system console. Is there any way to make this work?

Use the Qemu/KVM monitor and its sendkey function. For example:

sendkey alt-f3

-- Tomasz Chmielewski http://wpkg.org
Re: kvm on G4 processors?
On Feb 18, 2009, at 3:21 AM, Alexander Graf wrote:
On 17.02.2009, at 09:32, Liu Yu-B13201 yu@freescale.com wrote:

-----Original Message-----
From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Roberto Innocenti
Sent: Tuesday, February 17, 2009 4:26 PM
To: kvm-ppc@vger.kernel.org
Subject: kvm on G4 processors?

I have tried to compile kernel 2.6.27 with kvm support on my PowerBook G4, but the kvm option is not visible because the kernel menu config permits compiling the kvm kernel module only if you have a PowerPC 440 architecture and not a G4. But does kvm really not work on G4 processors? Is the architecture so different? In case kvm does work, how do I compile the kernel module on my G4?

I'm afraid that KVM does not support G4 now. 440 belongs to the BOOKE architecture, which is much different from G4.

We are at the beginning of porting kvm to 970(fx) atm, which is a lot closer to a g4 than any booke.

Alex, which deployment of the 970 are you targeting:
1) IBM JS21/22 blades, that actually have a hypervisor already present
2) Apple G5, bare metal, but has most hypervisor features physically disabled
3) Any non-Book3E, which we call classic like 604, 750... and/or Book3S, G3, G4, G5, P3, P4

If you choose (1), then your work would be harder but it should apply to any IBM PPC64 or pSeries product.
If you choose (2), then your work could be much easier, but it would apply to G5s only.
If you choose (3), then it's about the same as (2).

Another question is, when you do create your virtual machine, do you intend for it to look exactly like a G5 machine (and support an unmodified MacOS), a pSeries machine (and emulate the pSeries Hypervisor), or some new machine that will require further modifications to the OSes you will support?

BTW: I do not intend to discourage, and would be thrilled to see _any_ of the above explored.
-JX

For now, there's no usable code yet though.

Alex

-- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: kvm on G4 processors?
On 18.02.2009, at 13:19, Jimi Xenidis wrote:
On Feb 18, 2009, at 3:21 AM, Alexander Graf wrote:
On 17.02.2009, at 09:32, Liu Yu-B13201 yu@freescale.com wrote:

-----Original Message-----
From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Roberto Innocenti
Sent: Tuesday, February 17, 2009 4:26 PM
To: kvm-ppc@vger.kernel.org
Subject: kvm on G4 processors?

I have tried to compile kernel 2.6.27 with kvm support on my PowerBook G4, but the kvm option is not visible because the kernel menu config permits compiling the kvm kernel module only if you have a PowerPC 440 architecture and not a G4. But does kvm really not work on G4 processors? Is the architecture so different? In case kvm does work, how do I compile the kernel module on my G4?

I'm afraid that KVM does not support G4 now. 440 belongs to the BOOKE architecture, which is much different from G4.

We are at the beginning of porting kvm to 970(fx) atm, which is a lot closer to a g4 than any booke.

Alex, which deployment of the 970 are you targeting:
1) IBM JS21/22 blades, that actually have a hypervisor already present
2) Apple G5, bare metal, but has most hypervisor features physically disabled
3) Any non-Book3E, which we call classic like 604, 750... and/or Book3S, G3, G4, G5, P3, P4

If you choose (1), then your work would be harder but it should apply to any IBM PPC64 or pSeries product.
If you choose (2), then your work could be much easier, but it would apply to G5s only.
If you choose (3), then it's about the same as (2).

Right now we're targeting the PS3, as that's the platform we have most free machines of here ;-). But the code as is should work for any bare metal 970. I haven't really looked into the hypervisor bits yet, but targeting iSeries is definitely on the list. AFAIK we only need to take a deeper look at that when we get to implement the MMU bits.

Another question is, when you do create your virtual machine, do you intend for it to look exactly like a G5 machine (and support an unmodified MacOS), a pSeries machine (and emulate the pSeries Hypervisor), or some new machine that will require further modifications to the OSes you will support?

I thought pSeries were the ones without Hypervisor? Basically the idea is to expose a random bare-metal CPU to the userspace, with qemu implementing the rest. One thing I was thinking of was even to go as far as implementing a G3 guest on a POWER4+ host, but for now the plan is 970 on 970.

Alex

BTW: I do not intend to discourage, and would be thrilled to see _any_ of the above explored.
-JX
Re: kvm on G4 processors?
On 18.02.2009, at 14:10, Jimi Xenidis wrote:
On Feb 18, 2009, at 6:53 AM, Alexander Graf wrote:
On 18.02.2009, at 13:19, Jimi Xenidis wrote:
On Feb 18, 2009, at 3:21 AM, Alexander Graf wrote:
On 17.02.2009, at 09:32, Liu Yu-B13201 yu@freescale.com wrote:

-----Original Message-----
From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Roberto Innocenti
Sent: Tuesday, February 17, 2009 4:26 PM
To: kvm-ppc@vger.kernel.org
Subject: kvm on G4 processors?

I have tried to compile kernel 2.6.27 with kvm support on my PowerBook G4, but the kvm option is not visible because the kernel menu config permits compiling the kvm kernel module only if you have a PowerPC 440 architecture and not a G4. But does kvm really not work on G4 processors? Is the architecture so different? In case kvm does work, how do I compile the kernel module on my G4?

I'm afraid that KVM does not support G4 now. 440 belongs to the BOOKE architecture, which is much different from G4.

We are at the beginning of porting kvm to 970(fx) atm, which is a lot closer to a g4 than any booke.

Alex, which deployment of the 970 are you targeting:
1) IBM JS21/22 blades, that actually have a hypervisor already present
2) Apple G5, bare metal, but has most hypervisor features physically disabled
3) Any non-Book3E, which we call classic like 604, 750... and/or Book3S, G3, G4, G5, P3, P4

If you choose (1), then your work would be harder but it should apply to any IBM PPC64 or pSeries product.
If you choose (2), then your work could be much easier, but it would apply to G5s only.
If you choose (3), then it's about the same as (2).

Right now we're targeting the PS3, as that's the platform we have most free machines of here ;-).

Do you mean Cell blade, or a PS3?

Currently PS3. Though I did test stuff on a 970 PowerStation and a QS22 in parallel.

But the code as is should work for any bare metal 970.

PS3s come with Sony's Hypervisor, which is different from the pSeries Hypervisor and far different from a bare metal 970, a name for which only Apple G5s qualify. If your intention is to work entirely above the PPC abstracted Linux environment then that should be interesting.

I don't really see how we need to work around anything. Basically the guest in these hypervisors still sees things as if they were bare metal, no?

I haven't really looked into the hypervisor bits yet, but targeting iSeries is definitely on the list.

This has little to do with iSeries LPAR and more to do with the Hypervisor introduced to all pSeries products on IBM 970 and P5 and beyond.

Hm - no idea on that one. I haven't really looked into all possible combinations yet. But so far our code doesn't do too much different from a real OS supervisor-unprivileged context switch.

AFAIK we only need to take a deeper look at that when we get to implement the MMU bits.

I expect exception handlers to be your first big worry.

Yes. We're at that right now. Actually hijacking the host's handlers does work for most cases already, jumping into the guest worked too and jumping out is what we're at atm.

Alex

The MMU will _indeed_ be interesting.

Another question is, when you do create your virtual machine, do you intend for it to look exactly like a G5 machine (and support an unmodified MacOS), a pSeries machine (and emulate the pSeries Hypervisor), or some new machine that will require further modifications to the OSes you will support?

I thought pSeries were the ones without Hypervisor?

As of 970 and P5, _everything_ produced has a hypervisor present regardless of whether it supports multiple LPARs or not. This is also the case with Sony's PS/3.

Basically the idea is to expose a random bare-metal CPU to the userspace, with qemu implementing the rest. One thing I was thinking of was even to go as far as implementing a G3 guest on a POWER4+ host, but for now the plan is 970 on 970.

970 on 970 should work nicely, and if you restrict yourself to the basic architecture then what you do should work well on anything.
-JX

Alex

BTW: I do not intend to discourage, and would be thrilled to see _any_ of the above explored.
-JX