[KVM-AUTOTEST][COMMIT] Merge branch 'master' of git://github.com/ehabkost/autotest
From: Uri Lublin u...@redhat.com -- To unsubscribe from this list: send the line unsubscribe kvm-commits in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[COMMIT master] Define CONFIG_KVM_APIC_ARCHITECTURE
From: Avi Kivity a...@redhat.com

Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/ia64/external-module-compat.h b/ia64/external-module-compat.h
index 8ccad90..60a83a1 100644
--- a/ia64/external-module-compat.h
+++ b/ia64/external-module-compat.h
@@ -24,6 +24,10 @@ typedef u64 phys_addr_t;
 #error KVM/IA-64 depends on preempt notifiers in kernel.
 #endif
 
+#ifndef CONFIG_KVM_APIC_ARCHITECTURE
+#define CONFIG_KVM_APIC_ARCHITECTURE
+#endif
+
 /* smp_call_function() lost an argument in 2.6.27. */
 #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,27)
 
diff --git a/x86/external-module-compat.h b/x86/external-module-compat.h
index 273bfee..f7aa151 100644
--- a/x86/external-module-compat.h
+++ b/x86/external-module-compat.h
@@ -22,6 +22,10 @@ typedef u64 phys_addr_t;
 #define CONFIG_HAVE_KVM_EVENTFD 1
 #endif
 
+#ifndef CONFIG_KVM_APIC_ARCHITECTURE
+#define CONFIG_KVM_APIC_ARCHITECTURE
+#endif
+
 #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,25)
 
 #ifdef CONFIG_X86_64
[COMMIT master] KVM: Use pointer to vcpu instead of vcpu_id in timer code.
From: Gleb Natapov g...@redhat.com

Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 5d5cfd3..26c29cb 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -291,7 +291,7 @@ static void create_pit_timer(struct kvm_kpit_state *ps, u32 val, int is_period)
 	pt->timer.function = kvm_timer_fn;
 	pt->t_ops = &kpit_ops;
 	pt->kvm = ps->pit->kvm;
-	pt->vcpu_id = 0;
+	pt->vcpu = pt->kvm->bsp_vcpu;
 
 	atomic_set(&pt->pending, 0);
 	ps->irq_ack = 1;
diff --git a/arch/x86/kvm/kvm_timer.h b/arch/x86/kvm/kvm_timer.h
index 26bd6ba..55c7524 100644
--- a/arch/x86/kvm/kvm_timer.h
+++ b/arch/x86/kvm/kvm_timer.h
@@ -6,7 +6,7 @@ struct kvm_timer {
 	bool reinject;
 	struct kvm_timer_ops *t_ops;
 	struct kvm *kvm;
-	int vcpu_id;
+	struct kvm_vcpu *vcpu;
 };
 
 struct kvm_timer_ops {
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index b066130..b1694dc 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -950,7 +950,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu)
 	apic->lapic_timer.timer.function = kvm_timer_fn;
 	apic->lapic_timer.t_ops = &lapic_timer_ops;
 	apic->lapic_timer.kvm = vcpu->kvm;
-	apic->lapic_timer.vcpu_id = vcpu->vcpu_id;
+	apic->lapic_timer.vcpu = vcpu;
 
 	apic->base_address = APIC_DEFAULT_PHYS_BASE;
 	vcpu->arch.apic_base = APIC_DEFAULT_PHYS_BASE;
diff --git a/arch/x86/kvm/timer.c b/arch/x86/kvm/timer.c
index 86dbac0..85cc743 100644
--- a/arch/x86/kvm/timer.c
+++ b/arch/x86/kvm/timer.c
@@ -33,7 +33,7 @@ enum hrtimer_restart kvm_timer_fn(struct hrtimer *data)
 	struct kvm_vcpu *vcpu;
 	struct kvm_timer *ktimer = container_of(data, struct kvm_timer, timer);
 
-	vcpu = ktimer->kvm->vcpus[ktimer->vcpu_id];
+	vcpu = ktimer->vcpu;
 	if (!vcpu)
 		return HRTIMER_NORESTART;
 
[COMMIT master] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
From: Avi Kivity a...@redhat.com

Signed-off-by: Avi Kivity a...@redhat.com
[COMMIT master] KVM: Introduce kvm_vcpu_is_bsp() function.
From: Gleb Natapov g...@redhat.com

Use it instead of open code "vcpu_id zero is BSP" assumption.

Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 3199221..3924591 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -1216,7 +1216,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 	if (IS_ERR(vmm_vcpu))
 		return PTR_ERR(vmm_vcpu);
 
-	if (vcpu->vcpu_id == 0) {
+	if (kvm_vcpu_is_bsp(vcpu)) {
 		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 
 		/*Set entry address for first run.*/
diff --git a/arch/ia64/kvm/vcpu.c b/arch/ia64/kvm/vcpu.c
index a2c6c15..7e7391d 100644
--- a/arch/ia64/kvm/vcpu.c
+++ b/arch/ia64/kvm/vcpu.c
@@ -830,7 +830,7 @@ static void vcpu_set_itc(struct kvm_vcpu *vcpu, u64 val)
 
 	kvm = (struct kvm *)KVM_VM_BASE;
 
-	if (vcpu->vcpu_id == 0) {
+	if (kvm_vcpu_is_bsp(vcpu)) {
 		for (i = 0; i < kvm->arch.online_vcpus; i++) {
 			v = (struct kvm_vcpu *)((char *)vcpu +
 					sizeof(struct kvm_vcpu_data) * i);
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 9749ec3..5d5cfd3 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -228,7 +228,7 @@ int pit_has_pending_timer(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pit *pit = vcpu->kvm->arch.vpit;
 
-	if (pit && vcpu->vcpu_id == 0 && pit->pit_state.irq_ack)
+	if (pit && kvm_vcpu_is_bsp(vcpu) && pit->pit_state.irq_ack)
 		return atomic_read(&pit->pit_state.pit_timer.pending);
 	return 0;
 }
@@ -249,7 +249,7 @@ void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu)
 	struct kvm_pit *pit = vcpu->kvm->arch.vpit;
 	struct hrtimer *timer;
 
-	if (vcpu->vcpu_id != 0 || !pit)
+	if (!kvm_vcpu_is_bsp(vcpu) || !pit)
 		return;
 
 	timer = &pit->pit_state.pit_timer.timer;
diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index bf94a45..148c52a 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -57,7 +57,7 @@ static void pic_unlock(struct kvm_pic *s)
 	}
 
 	if (wakeup) {
-		vcpu = s->kvm->vcpus[0];
+		vcpu = s->kvm->bsp_vcpu;
 		if (vcpu)
 			kvm_vcpu_kick(vcpu);
 	}
@@ -254,7 +254,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s)
 {
 	int irq, irqbase, n;
 	struct kvm *kvm = s->pics_state->irq_request_opaque;
-	struct kvm_vcpu *vcpu0 = kvm->vcpus[0];
+	struct kvm_vcpu *vcpu0 = kvm->bsp_vcpu;
 
 	if (s == &s->pics_state->pics[0])
 		irqbase = 0;
@@ -512,7 +512,7 @@ static void picdev_read(struct kvm_io_device *this,
 static void pic_irq_request(void *opaque, int level)
 {
 	struct kvm *kvm = opaque;
-	struct kvm_vcpu *vcpu = kvm->vcpus[0];
+	struct kvm_vcpu *vcpu = kvm->bsp_vcpu;
 	struct kvm_pic *s = pic_irqchip(kvm);
 	int irq = pic_get_irq(&s->pics[0]);
 
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 44f20cd..b066130 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -793,7 +793,8 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
 		vcpu->arch.apic_base = value;
 		return;
 	}
-	if (apic->vcpu->vcpu_id)
+
+	if (!kvm_vcpu_is_bsp(apic->vcpu))
 		value &= ~MSR_IA32_APICBASE_BSP;
 
 	vcpu->arch.apic_base = value;
@@ -844,7 +845,7 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu)
 	}
 	update_divide_count(apic);
 	atomic_set(&apic->lapic_timer.pending, 0);
-	if (vcpu->vcpu_id == 0)
+	if (kvm_vcpu_is_bsp(vcpu))
 		vcpu->arch.apic_base |= MSR_IA32_APICBASE_BSP;
 	apic_update_ppr(apic);
 
@@ -985,7 +986,7 @@ int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
 	u32 lvt0 = apic_get_reg(vcpu->arch.apic, APIC_LVT0);
 	int r = 0;
 
-	if (vcpu->vcpu_id == 0) {
+	if (kvm_vcpu_is_bsp(vcpu)) {
 		if (!apic_hw_enabled(vcpu->arch.apic))
 			r = 1;
 		if ((lvt0 & APIC_LVT_MASKED) == 0 &&
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 37397f6..13f6f7d 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -645,7 +645,7 @@ static int svm_vcpu_reset(struct kvm_vcpu *vcpu)
 
 	init_vmcb(svm);
 
-	if (vcpu->vcpu_id != 0) {
+	if (!kvm_vcpu_is_bsp(vcpu)) {
 		kvm_rip_write(vcpu, 0);
 		svm->vmcb->save.cs.base = svm->vcpu.arch.sipi_vector << 12;
 		svm->vmcb->save.cs.selector = svm->vcpu.arch.sipi_vector << 8;
@@ -709,7 +709,7 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
 	fx_init(&svm->vcpu);
 	svm->vcpu.fpu_active = 1;
 	svm->vcpu.arch.apic_base = 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
-	if (svm->vcpu.vcpu_id == 0)
+	if (kvm_vcpu_is_bsp(&svm->vcpu))
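
The patch above is cut off before the definition of kvm_vcpu_is_bsp() appears, so the following is only a minimal userspace sketch of the idea, not the kernel's code. The types struct vm and struct vcpu and the name vcpu_is_bsp() are hypothetical stand-ins, and the pointer comparison is an assumption based on the kvm->bsp_vcpu field used in the hunks above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct vcpu;

/* Hypothetical stand-in for struct kvm: only the field the sketch needs. */
struct vm {
	struct vcpu *bsp_vcpu;	/* boot processor, NULL until one exists */
};

/* Hypothetical stand-in for struct kvm_vcpu. */
struct vcpu {
	struct vm *vm;
	int vcpu_id;
};

/*
 * Assumed helper body: a vcpu is the BSP when the VM points at it,
 * whatever its vcpu_id happens to be.
 */
static inline bool vcpu_is_bsp(const struct vcpu *vcpu)
{
	assert(vcpu != NULL && vcpu->vm != NULL);
	return vcpu->vm->bsp_vcpu == vcpu;
}
```

With a helper like this, callers stop comparing vcpu_id against 0, so the boot processor no longer has to be the vcpu with id zero.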
[COMMIT master] KVM: Break dependency between vcpu index in vcpus array and vcpu_id.
From: Gleb Natapov g...@redhat.com

Archs are free to use vcpu_id as they see fit. For x86 it is used as
vcpu's apic id. New ioctl is added to configure boot vcpu id that was
assumed to be 0 till now.

Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/ia64/include/asm/kvm_host.h b/arch/ia64/include/asm/kvm_host.h
index 5f43697..9cf1c4b 100644
--- a/arch/ia64/include/asm/kvm_host.h
+++ b/arch/ia64/include/asm/kvm_host.h
@@ -465,7 +465,6 @@ struct kvm_arch {
 	unsigned long metaphysical_rr4;
 	unsigned long vmm_init_rr;
 
-	int online_vcpus;
 	int is_sn2;
 
 	struct kvm_ioapic *vioapic;
diff --git a/arch/ia64/kvm/Kconfig b/arch/ia64/kvm/Kconfig
index f922bbb..cbadd8a 100644
--- a/arch/ia64/kvm/Kconfig
+++ b/arch/ia64/kvm/Kconfig
@@ -25,6 +25,7 @@ config KVM
 	select PREEMPT_NOTIFIERS
 	select ANON_INODES
 	select HAVE_KVM_IRQCHIP
+	select KVM_APIC_ARCHITECTURE
 	---help---
 	  Support hosting fully virtualized guest machines using hardware
 	  virtualization extensions.  You will need a fairly recent
diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 3924591..cbda5db 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -338,7 +338,7 @@ static struct kvm_vcpu *lid_to_vcpu(struct kvm *kvm, unsigned long id,
 	union ia64_lid lid;
 	int i;
 
-	for (i = 0; i < kvm->arch.online_vcpus; i++) {
+	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++) {
 		if (kvm->vcpus[i]) {
 			lid.val = VCPU_LID(kvm->vcpus[i]);
 			if (lid.id == id && lid.eid == eid)
@@ -412,7 +412,7 @@ static int handle_global_purge(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	call_data.ptc_g_data = p->u.ptc_g_data;
 
-	for (i = 0; i < kvm->arch.online_vcpus; i++) {
+	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++) {
 		if (!kvm->vcpus[i] || kvm->vcpus[i]->arch.mp_state ==
 						KVM_MP_STATE_UNINITIALIZED ||
 					vcpu == kvm->vcpus[i])
@@ -852,8 +852,6 @@ struct kvm *kvm_arch_create_vm(void)
 
 	kvm_init_vm(kvm);
 
-	kvm->arch.online_vcpus = 0;
-
 	return kvm;
 
 }
@@ -1356,8 +1354,6 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm,
 		goto fail;
 	}
 
-	kvm->arch.online_vcpus++;
-
 	return vcpu;
 fail:
 	return ERR_PTR(r);
diff --git a/arch/ia64/kvm/vcpu.c b/arch/ia64/kvm/vcpu.c
index 7e7391d..2334eac 100644
--- a/arch/ia64/kvm/vcpu.c
+++ b/arch/ia64/kvm/vcpu.c
@@ -831,7 +831,7 @@ static void vcpu_set_itc(struct kvm_vcpu *vcpu, u64 val)
 	kvm = (struct kvm *)KVM_VM_BASE;
 
 	if (kvm_vcpu_is_bsp(vcpu)) {
-		for (i = 0; i < kvm->arch.online_vcpus; i++) {
+		for (i = 0; i < atomic_read(&kvm->online_vcpus); i++) {
 			v = (struct kvm_vcpu *)((char *)vcpu +
 					sizeof(struct kvm_vcpu_data) * i);
 			VMX(v, itc_offset) = itc_offset;
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 8cd2a4e..7fbedfd 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -27,6 +27,7 @@ config KVM
 	select ANON_INODES
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_EVENTFD
+	select KVM_APIC_ARCHITECTURE
 	---help---
 	  Support hosting fully virtualized guest machines using hardware
 	  virtualization extensions.  You will need a fairly recent
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 7ed9de1..c7611ef 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -430,6 +430,7 @@ struct kvm_trace_rec {
 #ifdef __KVM_HAVE_PIT
 #define KVM_CAP_PIT2 33
 #endif
+#define KVM_CAP_SET_BOOT_CPU_ID 34
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -535,6 +536,7 @@ struct kvm_irqfd {
 #define KVM_DEASSIGN_DEV_IRQ _IOW(KVMIO, 0x75, struct kvm_assigned_irq)
 #define KVM_IRQFD _IOW(KVMIO, 0x76, struct kvm_irqfd)
 #define KVM_CREATE_PIT2 _IOW(KVMIO, 0x77, struct kvm_pit_config)
+#define KVM_SET_BOOT_CPU_ID _IO(KVMIO, 0x78)
 
 /*
  * ioctls for vcpu fds
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index b55d427..1478b8f 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -129,8 +129,12 @@ struct kvm {
 	int nmemslots;
 	struct kvm_memory_slot memslots[KVM_MEMORY_SLOTS +
 					KVM_PRIVATE_MEM_SLOTS];
+#ifdef CONFIG_KVM_APIC_ARCHITECTURE
+	u32 bsp_vcpu_id;
 	struct kvm_vcpu *bsp_vcpu;
+#endif
 	struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
+	atomic_t online_vcpus;
 	struct list_head vm_list;
 	struct mutex lock;
 	struct kvm_io_bus mmio_bus;
@@ -550,8 +554,10 @@ static inline void kvm_irqfd_release(struct kvm *kvm) {}
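
To illustrate what "breaking the dependency between array index and vcpu_id" can look like, here is a small self-contained sketch. It is not the kernel code (the rest of the patch is truncated above); struct vm, struct vcpu, MAX_VCPUS, vm_add_vcpu() and vm_find_vcpu() are hypothetical names, and a plain int stands in for the atomic_t online_vcpus counter.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VCPUS 8	/* hypothetical limit for the sketch */

struct vcpu {
	unsigned int vcpu_id;	/* arch-defined id, e.g. an APIC id on x86 */
};

struct vm {
	struct vcpu *vcpus[MAX_VCPUS];	/* filled in creation order */
	int online_vcpus;		/* number of valid slots */
	unsigned int bsp_vcpu_id;	/* configurable, 0 by default */
	struct vcpu *bsp_vcpu;
};

/* Store the vcpu in the next free slot; the slot index is unrelated to its id. */
static int vm_add_vcpu(struct vm *vm, struct vcpu *vcpu)
{
	assert(vm != NULL && vcpu != NULL);
	if (vm->online_vcpus >= MAX_VCPUS)
		return -1;
	for (int i = 0; i < vm->online_vcpus; i++)
		if (vm->vcpus[i]->vcpu_id == vcpu->vcpu_id)
			return -1;	/* duplicate id */
	vm->vcpus[vm->online_vcpus++] = vcpu;
	if (vcpu->vcpu_id == vm->bsp_vcpu_id)
		vm->bsp_vcpu = vcpu;
	return 0;
}

/* Look a vcpu up by id by scanning only the populated slots. */
static struct vcpu *vm_find_vcpu(struct vm *vm, unsigned int id)
{
	for (int i = 0; i < vm->online_vcpus; i++)
		if (vm->vcpus[i]->vcpu_id == id)
			return vm->vcpus[i];
	return NULL;
}
```

In this model the boot vcpu id would have to be chosen before any vcpu is added, which is presumably why the real interface is a VM-level ioctl (KVM_SET_BOOT_CPU_ID) rather than a per-vcpu one.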
[COMMIT master] Merge branch 'for-avi' of git://git.et.redhat.com/qemu-net
From: Avi Kivity a...@redhat.com

* 'for-avi' of git://git.et.redhat.com/qemu-net: (69 commits)
  Fix build breakage when using VDE introduced by 4f1c942
  Fix defined but not used warning
  monitor: Introduce get_command_name()
  monitor: Remove unused variable
  monitor: Remove uneeded 'return' statement
  monitor: Remove uneeded goto
  Use snprintf to avoid OpenBSD warning
  Fix Sparse warning
  Clean up generated qemu-img-cmds.h
  Fix Sparse warning
  microblaze-dis.c does not need to be executable
  Fix warning
  Remove unused and misnamed field and variable
  Update irqs on reset and device load
  Register reset functions for e1000 and rtl8139
  virtio-net: Increase filter and control limits
  virtio-net: Add new RX filter controls
  virtio-net: MAC filter optimization
  virtio-net: Fix MAC filter overflow handling
  virtio-net: reorganize receive_filter()
  ...
[COMMIT master] Pull qemu headers into libkvm
From: Glauber Costa glom...@redhat.com

Those headers define qemu specific things like ram_addr_t.
This will allow us to start using them in libkvm.

Signed-off-by: Glauber Costa glom...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/libkvm-all.c b/libkvm-all.c
index dd56498..45679fb 100644
--- a/libkvm-all.c
+++ b/libkvm-all.c
@@ -26,6 +26,7 @@
 #error libkvm: userspace and kernel version mismatch
 #endif
 
+#include "sysemu.h"
 #include <unistd.h>
 #include <fcntl.h>
 #include <stdio.h>
@@ -47,7 +48,6 @@
 #define DPRINTF(fmt, args...) do {} while (0)
 #endif
 
-#define MIN(x,y) ((x) < (y) ? (x) : (y))
 #define ALIGN(x, y) (((x)+(y)-1) & ~((y)-1))
 
 int kvm_abi = EXPECTED_KVM_API_VERSION;
diff --git a/libkvm-all.h b/libkvm-all.h
index 03b98df..d647ef1 100644
--- a/libkvm-all.h
+++ b/libkvm-all.h
@@ -82,6 +82,7 @@ struct kvm_vcpu_context
 typedef struct kvm_context *kvm_context_t;
 typedef struct kvm_vcpu_context *kvm_vcpu_context_t;
 
+#include "kvm.h"
 int kvm_alloc_kernel_memory(kvm_context_t kvm, unsigned long memory,
 								void **vm_mem);
 int kvm_alloc_userspace_memory(kvm_context_t kvm, unsigned long memory,
diff --git a/target-i386/libkvm.c b/target-i386/libkvm.c
index f88102e..0f4e009 100644
--- a/target-i386/libkvm.c
+++ b/target-i386/libkvm.c
@@ -1,3 +1,5 @@
+#include "sysemu.h"
+
 #include "libkvm-all.h"
 #include "libkvm.h"
 #include <errno.h>
Re: Network throughput limits for local VM - VM communication
On Wednesday 10 June 2009, Fischer, Anna wrote:
> > Have you tried eliminating VLAN to simplify the setup?
>
> No - but there is a related bug in the tun/tap interface (well, it is
> not really a bug but simply the way tun/tap works) that will cause
> packets to be replicated on all the tap interfaces (across all bridges
> attached to those) if I do not configure VLANs. This will result in a
> system that is even more overloaded. I had discovered this a while
> back when running UDP stress tests under 10G.

Not sure I understand. Do you mean you have all three guests connected
to the same bridge? If you want the router guest to be the only
connection, you should not connect the two bridges anywhere else, so I
don't see how packets can go from one bridge to the other one, except
through the router.

> > Does it change when the guests communicate over a -net socket
> > interface with your router instead of the -net tap + bridge in the
> > host?
>
> I have not tried this - I need the bridge in the network data path for
> some testing, so using the -net socket interface would not solve my
> problem.

I did not mean this to solve your problem but to hunt down the bug. If
the problem only exists with the host bridge device, we should look
there, but if it persists, we can probably rule out the tap, bridge and
vlan code in the host as the problem source.

> However, I have just today managed to get around this bug by using the
> e1000 QEMU emulated NIC model and this seems to do the trick. Now the
> throughput is still very low, but that might simply be because my
> system is too weak. When using the e1000 model instead of rtl8139 or
> virtio, I do not have any network crashes any more.

That could either indicate a bug in rtl8139 and virtio, or that the
specific timing of the e1000 model hides this bug. What happens if only
one side uses e1000 while the other still uses virtio? What about any
of the other models?

	Arnd
--
To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Network throughput limits for local VM - VM communication
> Subject: Re: Network throughput limits for local VM - VM communication
>
> On Wednesday 10 June 2009, Fischer, Anna wrote:
> > > Have you tried eliminating VLAN to simplify the setup?
> >
> > No - but there is a related bug in the tun/tap interface (well, it
> > is not really a bug but simply the way tun/tap works) that will
> > cause packets to be replicated on all the tap interfaces (across all
> > bridges attached to those) if I do not configure VLANs. This will
> > result in a system that is even more overloaded. I had discovered
> > this a while back when running UDP stress tests under 10G.
>
> Not sure I understand. Do you mean you have all three guests connected
> to the same bridge? If you want the router guest to be the only
> connection, you should not connect the two bridges anywhere else, so I
> don't see how packets can go from one bridge to the other one, except
> through the router.

I am using two bridges, and yes, in theory, the router should be the
only connection between the two guests. However, without VLANs, the tun
interface will pass packets to all tap interfaces. It has to, as it
doesn't know which one the packet has to go to. It does not look at
packets, it simply copies buffers from userspace to the tap interface in
the kernel. The tap interface then eventually drops the packet if the
MAC address does not match its own. So packets will not actually go
across both bridges, because the tap interface that should not receive
the packet does drop it. However, it does receive the packet and
processes it to some extent, which causes some overhead. As I was told
by someone at KVM/RedHat, this does not happen when using VLANs, as then
there will be a direct mapping between any tun-tap device and so no
packet replication across multiple tap devices.

> > > Does it change when the guests communicate over a -net socket
> > > interface with your router instead of the -net tap + bridge in the
> > > host?
> >
> > I have not tried this - I need the bridge in the network data path
> > for some testing, so using the -net socket interface would not solve
> > my problem.
>
> I did not mean this to solve your problem but to hunt down the bug. If
> the problem only exists with the host bridge device, we should look
> there, but if it persists, we can probably rule out the tap, bridge
> and vlan code in the host as the problem source.

Yes, I understand you were trying to help, and using the -net socket
interface would help to narrow down where the problem could be. I just
have not yet managed to set this up, but I might do so if I find the
time in the next few days. I was hoping that other people might have
seen the same issues I see, but unfortunately I did not get many
replies or suggestions on this issue from the list at all.

> > However, I have just today managed to get around this bug by using
> > the e1000 QEMU emulated NIC model and this seems to do the trick.
> > Now the throughput is still very low, but that might simply be
> > because my system is too weak. When using the e1000 model instead of
> > rtl8139 or virtio, I do not have any network crashes any more.
>
> That could either indicate a bug in rtl8139 and virtio, or that the
> specific timing of the e1000 model hides this bug. What happens if
> only one side uses e1000 while the other still uses virtio? What about
> any of the other models?

Good question. I will try this out and post the results.

Cheers,
Anna
[PATCHv2] [APIC] Optimize searching for highest IRR
Most of the time IRR is empty, so instead of scanning the whole IRR on
each VM entry keep a variable that tells us if IRR is not empty. IRR
will have to be scanned twice on each IRQ delivery, but this is much
more rare than VM entry.

v2: The only difference from v1 is the comment describing possible race
and how it is solved. The race is not created by the patch BTW.

Signed-off-by: Gleb Natapov g...@redhat.com

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 44f20cd..38a7fa0 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -165,29 +165,45 @@ static int find_highest_vector(void *bitmap)
 
 static inline int apic_test_and_set_irr(int vec, struct kvm_lapic *apic)
 {
+	apic->irr_pending = true;
 	return apic_test_and_set_vector(vec, apic->regs + APIC_IRR);
 }
 
-static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
+static inline int apic_search_irr(struct kvm_lapic *apic)
 {
-	apic_clear_vector(vec, apic->regs + APIC_IRR);
+	return find_highest_vector(apic->regs + APIC_IRR);
 }
 
 static inline int apic_find_highest_irr(struct kvm_lapic *apic)
 {
 	int result;
 
-	result = find_highest_vector(apic->regs + APIC_IRR);
+	if (!apic->irr_pending)
+		return -1;
+
+	result = apic_search_irr(apic);
 	ASSERT(result == -1 || result >= 16);
 
 	return result;
 }
 
+static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
+{
+	apic->irr_pending = false;
+	apic_clear_vector(vec, apic->regs + APIC_IRR);
+	if (apic_search_irr(apic) != -1)
+		apic->irr_pending = true;
+}
+
 int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
 	int highest_irr;
 
+	/* This may race with setting of irr in __apic_accept_irq() and
+	   value returned may be wrong, but kvm_vcpu_kick() in __apic_accept_irq
+	   will cause vmexit immediately and the value will be recalculated
+	   on the next vmentry.
+	 */
 	if (!apic)
 		return 0;
 	highest_irr = apic_find_highest_irr(apic);
@@ -842,6 +858,7 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu)
 		apic_set_reg(apic, APIC_ISR + 0x10 * i, 0);
 		apic_set_reg(apic, APIC_TMR + 0x10 * i, 0);
 	}
+	apic->irr_pending = false;
 	update_divide_count(apic);
 	atomic_set(&apic->lapic_timer.pending, 0);
 	if (vcpu->vcpu_id == 0)
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index a587f83..3f3ecc6 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -12,6 +12,7 @@ struct kvm_lapic {
 	struct kvm_timer lapic_timer;
 	u32 divide_count;
 	struct kvm_vcpu *vcpu;
+	bool irr_pending;
 	struct page *regs_page;
 	void *regs;
 	gpa_t vapic_addr;
--
			Gleb.
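
The caching idea in this patch can be tried outside the kernel. Below is a minimal single-threaded sketch, assuming a plain 256-bit bitmap in place of the APIC register page; struct irr_state and the irr_* function names are hypothetical, and the race discussed in the patch comment is deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS 256

struct irr_state {
	uint32_t irr[NR_VECTORS / 32];	/* one bit per interrupt vector */
	bool irr_pending;		/* false only when irr is known empty */
};

/* Slow path: scan the whole bitmap, return highest set vector or -1. */
static int irr_search(const struct irr_state *s)
{
	for (int word = NR_VECTORS / 32 - 1; word >= 0; word--) {
		if (s->irr[word] == 0)
			continue;
		for (int bit = 31; bit >= 0; bit--)
			if (s->irr[word] & (UINT32_C(1) << bit))
				return word * 32 + bit;
	}
	return -1;
}

static void irr_set(struct irr_state *s, int vec)
{
	assert(vec >= 0 && vec < NR_VECTORS);
	s->irr_pending = true;	/* hint is raised before the bit, as in the patch */
	s->irr[vec / 32] |= UINT32_C(1) << (vec % 32);
}

/* Clearing rescans once so the hint stays accurate. */
static void irr_clear(struct irr_state *s, int vec)
{
	assert(vec >= 0 && vec < NR_VECTORS);
	s->irr_pending = false;
	s->irr[vec / 32] &= ~(UINT32_C(1) << (vec % 32));
	if (irr_search(s) != -1)
		s->irr_pending = true;
}

/* Fast path: skip the scan entirely when nothing is pending. */
static int irr_find_highest(const struct irr_state *s)
{
	if (!s->irr_pending)
		return -1;
	return irr_search(s);
}
```

The trade-off is the one the commit message states: delivery pays for an extra scan in irr_clear(), while the much more frequent irr_find_highest() call usually returns after a single flag test.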
[GIT PULL] KVM updates for 2.6.31
Linus, please pull the 2.6.31 KVM batch from

  git://git.kernel.org/pub/scm/virt/kvm/kvm.git kvm-updates/2.6.31

Changes include MSI support, a rework of the interrupt code, improved
smp performance, and architecture code updates.

Amit Shah (1):
      KVM: x86: Ignore reads to EVNTSEL MSRs

Andi Kleen (1):
      KVM: Add VT-x machine check support

Andre Przywara (1):
      KVM: SVM: Fix cross vendor migration issue in segment segment descriptor

Avi Kivity (18):
      KVM: VMX: Don't use highmem pages for the msr and pio bitmaps
      KVM: VMX: Don't intercept MSR_KERNEL_GS_BASE
      KVM: VMX: Make module parameters readable
      KVM: VMX: Rename kvm_handle_exit() to vmx_handle_exit()
      KVM: VMX: Simplify module parameter names
      KVM: VMX: Annotate module parameters as __read_mostly
      KVM: VMX: Zero the vpid module parameter if vpid is not supported
      KVM: VMX: Zero ept module parameter if ept is not present
      KVM: VMX: Fold vm_need_ept() into callers
      KVM: VMX: Make flexpriority module parameter reflect hardware capability
      KVM: MMU: Use different shadows when EFER.NXE changes
      KVM: Replace kvmclock open-coded get_cpu_var() with the real thing
      KVM: Fix cpuid feature misreporting
      KVM: Add AMD cpuid bit: cr8_legacy, abm, misaligned sse, sse4, 3dnow prefetch
      x86: Add cpu features MOVBE and POPCNT
      KVM: Update cpuid 1.ecx reporting
      KVM: Disable large pages on misaligned memory slots
      KVM: Prevent overflow in largepages calculation

Carsten Otte (4):
      KVM: s390: Fix memory slot versus run - v3
      KVM: s390: Unlink vcpu on destroy - v2
      KVM: s390: Sanity check on validity intercept
      KVM: s390: Verify memory in kvm run

Chris Wright (1):
      KVM: Trivial format fix in setup_routing_entry()

Christian Borntraeger (3):
      KVM: declare ioapic functions only on affected hardware
      KVM: s390: use hrtimer for clock wakeup from idle - v2
      KVM: s390: optimize float int lock: spin_lock_bh --> spin_lock

Dong, Eddie (2):
      KVM: MMU: Emulate #PF error code of reserved bits violation
      KVM: Use rsvd_bits_mask in load_pdptrs()

Eddie Dong (1):
      KVM: MMU: Fix comment in page_fault()

Glauber Costa (3):
      KVM: fix apic_debug instances
      KVM: Replace ->drop_interrupt_shadow() by ->set_interrupt_shadow()
      KVM: Deal with interrupt shadow state for emulated instructions

Gleb Natapov (51):
      KVM: APIC: kvm_apic_set_irq deliver all kinds of interrupts
      KVM: ioapic/msi interrupt delivery consolidation
      KVM: consolidate ioapic/ipi interrupt delivery logic
      KVM: change the way how lowest priority vcpu is calculated
      KVM: APIC: get rid of deliver_bitmask
      KVM: MMU: do not free active mmu pages in free_mmu_pages()
      KVM: SVM: Remove duplicate code in svm_do_inject_vector()
      KVM: reuse (pop|push)_irq from svm.c in vmx.c
      KVM: Timer event should not unconditionally unhalt vcpu.
      KVM: Fix interrupt unhalting a vcpu when it shouldn't
      KVM: VMX: Fix handling of a fault during NMI unblocked due to IRET
      KVM: VMX: Rewrite vmx_complete_interrupt()'s twisted maze of if() statements
      KVM: VMX: Do not zero idt_vectoring_info in vmx_complete_interrupts().
      KVM: Fix task switch back link handling.
      KVM: Fix unneeded instruction skipping during task switching.
      KVM: x86 emulator: fix call near emulation
      KVM: x86 emulator: Add decoding of 16bit second immediate argument
      KVM: x86 emulator: Add lcall decoding
      KVM: x86 emulator: Complete ljmp decoding at decode stage
      KVM: x86 emulator: Complete short/near jcc decoding in decode stage
      KVM: x86 emulator: Complete decoding of call near in decode stage
      KVM: x86 emulator: Add unsigned byte immediate decode
      KVM: x86 emulator: Completely decode in/out at decoding stage
      KVM: x86 emulator: Decode soft interrupt instructions
      KVM: x86 emulator: Add new mode of instruction emulation: skip
      KVM: SVM: Skip instruction on a task switch only when appropriate
      KVM: Make kvm_cpu_(has|get)_interrupt() work for userspace irqchip too
      KVM: VMX: Consolidate userspace and kernel interrupt injection for VMX
      KVM: VMX: Cleanup vmx_intr_assist()
      KVM: Use kvm_arch_interrupt_allowed() instead of checking interrupt_window_open directly
      KVM: SVM: Coalesce userspace/kernel irqchip interrupt injection logic
      KVM: Remove exception_injected() callback.
      KVM: Remove inject_pending_vectors() callback
      KVM: Remove kvm_push_irq()
      KVM: sync_lapic_to_cr8() should always sync cr8 to V_TPR
      KVM: Do not report TPR write to userspace if new value bigger or equal to a previous one.
      KVM: Get rid of arch.interrupt_window_open & arch.nmi_window_open
      KVM: SVM: Add NMI injection support
      KVM: Fix userspace IRQ chip migration
      KVM: Get rid of get_irq() callback
      KVM: SVM: Don't reinject event that caused a task
Re: Network throughput limits for local VM - VM communication
Fischer, Anna wrote:
> I am using two bridges, and yes, in theory, the router should be the
> only connection between the two guests. However, without VLANs, the
> tun interface will pass packets to all tap interfaces. It has to, as
> it doesn't know which one the packet has to go to. It does not look
> at packets, it simply copies buffers from userspace to the tap
> interface in the kernel. The tap interface then eventually drops the
> packet if the MAC address does not match its own. So packets will not
> actually go across both bridges, because the tap interface that should
> not receive the packet does drop it. However, it does receive the
> packet and processes it to some extent, which causes some overhead. As
> I was told by someone at KVM/RedHat, this does not happen when using
> VLANs, as then there will be a direct mapping between any tun-tap
> device and so no packet replication across multiple tap devices.

This only happens if the receiving tap never sends out packets. If the
tap interface does send out packets, the bridge will associate their MAC
address with that interface, and future packets will only be forwarded
there.

Is this your scenario?

--
error compiling committee.c: too many arguments to function
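
The learning behaviour described in this reply can be modelled in a few lines. This is a toy sketch under stated assumptions, not the Linux bridge code: struct bridge, bridge_learn(), bridge_forward(), FDB_SIZE and PORT_FLOOD are all hypothetical names, and ageing, broadcast and multicast handling are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FDB_SIZE 16	/* hypothetical forwarding-table size */
#define PORT_FLOOD (-1)	/* "send to every port except the ingress one" */

struct fdb_entry {
	uint8_t mac[6];
	int port;
	int used;
};

struct bridge {
	struct fdb_entry fdb[FDB_SIZE];
};

static struct fdb_entry *fdb_lookup(struct bridge *br, const uint8_t mac[6])
{
	for (int i = 0; i < FDB_SIZE; i++)
		if (br->fdb[i].used && memcmp(br->fdb[i].mac, mac, 6) == 0)
			return &br->fdb[i];
	return NULL;
}

/* Learn: remember which port a source MAC was last seen on. */
static void bridge_learn(struct bridge *br, const uint8_t src[6], int port)
{
	struct fdb_entry *e = fdb_lookup(br, src);

	for (int i = 0; !e && i < FDB_SIZE; i++)
		if (!br->fdb[i].used)
			e = &br->fdb[i];
	if (!e)
		return;		/* table full: this MAC keeps being flooded */
	memcpy(e->mac, src, 6);
	e->port = port;
	e->used = 1;
}

/* Forward: a known destination goes to one port, an unknown one is flooded. */
static int bridge_forward(struct bridge *br, const uint8_t src[6],
			  const uint8_t dst[6], int in_port)
{
	struct fdb_entry *e;

	assert(in_port >= 0);
	bridge_learn(br, src, in_port);
	e = fdb_lookup(br, dst);
	return e ? e->port : PORT_FLOOD;
}
```

In this model a tap that never transmits is never learned, so every frame addressed to it is flooded to all ports, which matches the replication Anna observed; once it sends a single frame, later traffic to it goes to one port only.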
Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity
Izik Eidus wrote:
> +	if (!kvm_x86_ops->dirty_bit_support()) {
> +		spin_lock(&kvm->mmu_lock);
> +		/* remove_write_access() flush the tlb */
> +		kvm_mmu_slot_remove_write_access(kvm, log->slot);
> +		spin_unlock(&kvm->mmu_lock);
> +	} else {
> +		kvm_flush_remote_tlbs(kvm);

It might not correspond to the common style, but I think a callback
function ->dirty_bit_support is overkill. This is a function pointer
the compiler cannot see through. Hence it's an indirect function call.
But the implementation is always a simple yes/no (it seems). Indirect
calls are rather expensive (most of the time they cannot be predicted
right).

Why not instead have a read-only data constant and have an inline
function test that value? It means no function call and only one data
access.

Also, you're inconsistent in the use of integers and true/false in the
implementations of this function. Either use 0/1 or false/true.

--
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
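
For illustration, the two approaches compared in this review can be placed side by side. This is only a sketch of the suggestion: struct arch_ops stands in for kvm_x86_ops, and struct arch_caps, caps and arch_has_dirty_bit() are hypothetical names that do not appear in the patch under review.

```c
#include <stdbool.h>

/* Approach in the patch: a callback in an ops table (indirect call). */
struct arch_ops {
	bool (*dirty_bit_support)(void);
};

static bool vendor_dirty_bit_support(void)
{
	return true;
}

static const struct arch_ops callback_ops = {
	.dirty_bit_support = vendor_dirty_bit_support,
};

/* Suggested alternative: publish the answer once as read-only data ... */
struct arch_caps {
	bool dirty_bit_support;
};

static const struct arch_caps caps = {
	.dirty_bit_support = true,
};

/* ... and test it through an inline helper: one load, no call. */
static inline bool arch_has_dirty_bit(void)
{
	return caps.dirty_bit_support;
}
```

Both forms give the same answer; the second lets the compiler inline the test, at the cost of fixing the value when the capability structure is initialised. Using bool throughout also addresses the 0/1 versus true/false inconsistency mentioned above.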
[PATCH] Replace pending exception by PF if it happens serially.
Replace previous exception with a new one in a hope that instruction
re-execution will regenerate lost exception.

Signed-off-by: Gleb Natapov g...@redhat.com

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 272e2e8..3150d06 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -181,16 +181,21 @@ void kvm_inject_page_fault(struct kvm_vcpu *vcpu, unsigned long addr,
 	++vcpu->stat.pf_guest;
 
 	if (vcpu->arch.exception.pending) {
-		if (vcpu->arch.exception.nr == PF_VECTOR) {
-			printk(KERN_DEBUG "kvm: inject_page_fault:"
-					" double fault 0x%lx\n", addr);
-			vcpu->arch.exception.nr = DF_VECTOR;
-			vcpu->arch.exception.error_code = 0;
-		} else if (vcpu->arch.exception.nr == DF_VECTOR) {
+		switch(vcpu->arch.exception.nr) {
+		case DF_VECTOR:
 			/* triple fault -> shutdown */
 			set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests);
+		case PF_VECTOR:
+			vcpu->arch.exception.nr = DF_VECTOR;
+			vcpu->arch.exception.error_code = 0;
+			return;
+		default:
+			/* replace previous exception with a new one in a hope
+			   that instruction re-execution will regenerate lost
+			   exception */
+			vcpu->arch.exception.pending = false;
+			break;
 		}
-		return;
 	}
 	vcpu->arch.cr2 = addr;
 	kvm_queue_exception_e(vcpu, PF_VECTOR, error_code);
--
			Gleb.
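
The decision table this patch implements can be exercised in isolation. The sketch below is a userspace model, not the kernel function: struct exc_state and its triple_fault flag are hypothetical stand-ins for vcpu->arch.exception and the KVM_REQ_TRIPLE_FAULT request bit, and the DF case returns directly instead of falling through, which leaves the same final state (DF_VECTOR with error code 0).

```c
#include <stdbool.h>

/* x86 exception vector numbers. */
enum { DF_VECTOR = 8, GP_VECTOR = 13, PF_VECTOR = 14 };

struct exc_state {
	bool pending;
	int nr;
	unsigned int error_code;
	bool triple_fault;	/* stand-in for the KVM_REQ_TRIPLE_FAULT request */
};

static void inject_page_fault(struct exc_state *e, unsigned int error_code)
{
	if (e->pending) {
		switch (e->nr) {
		case DF_VECTOR:
			/* fault while delivering a double fault: shutdown */
			e->triple_fault = true;
			return;
		case PF_VECTOR:
			/* page fault on top of a page fault: double fault */
			e->nr = DF_VECTOR;
			e->error_code = 0;
			return;
		default:
			/* drop the older exception; re-execution should raise it again */
			e->pending = false;
			break;
		}
	}
	e->pending = true;
	e->nr = PF_VECTOR;
	e->error_code = error_code;
}
```

Starting from a pending #GP, three page-fault injections in a row walk this model through the states #PF, #DF and then triple fault.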
Re: KVM won't compile on 2.6.29
Bike Snow wrote:
> Hello
>
> I've compiled and installed KVM on kernel 2.6.28-11. It worked perfectly. I'm using Ubuntu 9.04.
>
> I'm now trying to compile on kernel 2.6.29-4. It fails on compiling the kernel module with this error message:
>
> /usr/src/kvm-kmod-devel-86/x86/iommu.c: In function ‘kvm_iommu_map_pages’:
> /usr/src/kvm-kmod-devel-86/x86/iommu.c:90: error: ‘IOMMU_CACHE’ undeclared (first use in this function)
> /usr/src/kvm-kmod-devel-86/x86/iommu.c:90: error: (Each undeclared identifier is reported only once
> /usr/src/kvm-kmod-devel-86/x86/iommu.c:90: error: for each function it appears in.)
> /usr/src/kvm-kmod-devel-86/x86/iommu.c: In function ‘kvm_assign_device’:
> /usr/src/kvm-kmod-devel-86/x86/iommu.c:155: error: implicit declaration of function ‘iommu_domain_has_cap’
> /usr/src/kvm-kmod-devel-86/x86/iommu.c:156: error: ‘IOMMU_CAP_CACHE_COHERENCY’ undeclared (first use in this function)
> make[3]: *** [/usr/src/kvm-kmod-devel-86/x86/iommu.o] Error 1
> make[2]: *** [/usr/src/kvm-kmod-devel-86/x86] Error 2
> make[1]: *** [_module_/usr/src/kvm-kmod-devel-86] Error 2
> make[1]: Leaving directory `/usr/src/linux-headers-2.6.29-02062904-generic'
> make: *** [all] Error 2
>
> This happens if I compile with the kvm-86.tar.gz package or the smaller module only kvm-kmod-devel-86.tar.gz package.
>
> I have kernel headers installed. I've also installed the source for 2.6.29-4 (not necessary but tried it anyway).
>
> Any ideas?

This is already fixed in git.

-- 
error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 05/11] qemu: MSI-X support functions
On Wed, Jun 10, 2009 at 08:04:13PM +0100, Paul Brook wrote:
>>> If we can't start a new qemu with the same hardware configuration then we should not be allowing migration or loading of snapshots.
>>
>> OK, so I'll add an option in virtio-net to disable msi-x, and such an option will be added in any device with msi-x support. Will that address your concern?
>
> Yes, as long as migration fails when you try to migrate to the wrong kind of device.
>
> Paul

I think the right way to do this is to make sure that standard read-only registers in PCI config space are not modified in migration (device-specific registers could have changed as a result of guest actions, so we can't make assumptions).

-- 
MST
RE: Network throughput limits for local VM - VM communication
> Subject: Re: Network throughput limits for local VM - VM communication
>
> Fischer, Anna wrote:
>> I am using two bridges, and yes, in theory, the router should be the only connection between the two guests. However, without VLANs, the tun interface will pass packets to all tap interfaces. It has to, as it doesn't know which one the packet has to go to. It does not look at packets, it simply copies buffers from userspace to the tap interface in the kernel. The tap interface then eventually drops the packet, if the MAC address does not match its own. So packets will not actually go across both bridges, because the tap interface that should not receive the packet does drop it. However, it does receive the packet and processes it to some extent, which causes some overhead. As I was told by someone at KVM/RedHat, this does not happen when using VLANs as then there will be a direct mapping between any tun-tap device and so no packet replication across multiple tap devices.
>
> This only happens if the receiving tap never sends out packets. If the tap interface does send out packets, the bridge will associate their MAC address with that interface, and future packets will only be forwarded there.
>
> Is this your scenario?

Not sure I understand. As far as I can see the packets are replicated on the tun/tap interface before they actually enter the bridge. So this is not about the bridge learning MAC addresses and flooding frames to unknown destinations. So I think this is different.

Anna
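For readers unfamiliar with the bridge behaviour referred to in the quoted reply (learn the source MAC per port, flood only when the destination is unknown), here is a minimal sketch. It is not the Linux bridge implementation: `struct bridge_sketch`, `fdb_learn()`, `fdb_lookup()`, `bridge_forward()` and the fixed table size are all illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FDB_SIZE   16	/* illustrative table size */
#define PORT_FLOOD (-1)	/* destination unknown: send to all other ports */

struct fdb_entry {
	uint8_t mac[6];
	int port;
	int used;
};

struct bridge_sketch {
	struct fdb_entry fdb[FDB_SIZE];
};

/* Return the port a MAC was last seen on, or PORT_FLOOD if unknown. */
static int fdb_lookup(const struct bridge_sketch *br, const uint8_t mac[6])
{
	for (int i = 0; i < FDB_SIZE; i++)
		if (br->fdb[i].used && memcmp(br->fdb[i].mac, mac, 6) == 0)
			return br->fdb[i].port;
	return PORT_FLOOD;
}

/* Remember (or refresh) which port a source MAC lives behind. */
static void fdb_learn(struct bridge_sketch *br, const uint8_t mac[6], int port)
{
	int free_slot = -1;

	for (int i = 0; i < FDB_SIZE; i++) {
		if (br->fdb[i].used && memcmp(br->fdb[i].mac, mac, 6) == 0) {
			br->fdb[i].port = port;
			return;
		}
		if (!br->fdb[i].used && free_slot < 0)
			free_slot = i;
	}
	if (free_slot >= 0) {
		memcpy(br->fdb[free_slot].mac, mac, 6);
		br->fdb[free_slot].port = port;
		br->fdb[free_slot].used = 1;
	}
}

/* Handle one frame: learn the sender, then pick the output port. */
static int bridge_forward(struct bridge_sketch *br, int in_port,
			  const uint8_t src[6], const uint8_t dst[6])
{
	assert(br != NULL);
	fdb_learn(br, src, in_port);
	return fdb_lookup(br, dst);
}
```

A port that never transmits is never learned, so frames addressed to it keep being flooded; that is the case the quoted reply asks about, and it is distinct from replication that happens before frames reach the bridge.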
Re: [KVM-AUTOTEST PATCH] A test patch - Boot VMs until one of them becomes unresponsive
Yolkfull Chow <yz...@redhat.com> wrote:
> Michael, these are the backtrace messages:
>
> ...
> 20090611-064959 no_boundary.local_stg.RHEL.5.3-server-64.no_ksm.boot_vms.e1000.user.size_1024: ERROR: run_once: Test failed: [Errno 12] Cannot allocate memory
> 20090611-064959 no_boundary.local_stg.RHEL.5.3-server-64.no_ksm.boot_vms.e1000.user.size_1024: DEBUG: run_once: Postprocessing on error...
> 20090611-065000 no_boundary.local_stg.RHEL.5.3-server-64.no_ksm.boot_vms.e1000.user.size_1024: DEBUG: postprocess_vm: Postprocessing VM 'vm1'...
> 20090611-065000 no_boundary.local_stg.RHEL.5.3-server-64.no_ksm.boot_vms.e1000.user.size_1024: DEBUG: postprocess_vm: VM object found in environment
> 20090611-065000 no_boundary.local_stg.RHEL.5.3-server-64.no_ksm.boot_vms.e1000.user.size_1024: DEBUG: send_monitor_cmd: Sending monitor command: screendump /kvm-autotest/client/results/default/kvm_runtest_2.[RHEL-Server-5.3-64][None][1024][1][qcow2]no_boundary.local_stg.RHEL.5.3-server-64.no_ksm.boot_vms.e1000.user.size_1024/debug/post_vm1.ppm
> 20090611-065000 no_boundary.local_stg.RHEL.5.3-server-64.no_ksm.boot_vms.e1000.user.size_1024: DEBUG: run_once: Contents of environment: {'vm__vm1': <kvm_vm.VM instance at 0x92999a28>}
> post-test sysinfo error:
> Traceback (most recent call last):
>   File "/kvm-autotest/client/common_lib/log.py", line 58, in decorated_func
>     fn(*args, **dargs)
>   File "/kvm-autotest/client/bin/base_sysinfo.py", line 213, in log_after_each_test
>     log.run(test_sysinfodir)
>   File "/kvm-autotest/client/bin/base_sysinfo.py", line 112, in run
>     shell=True, env=env)
>   File "/usr/lib64/python2.4/subprocess.py", line 412, in call
>     return Popen(*args, **kwargs).wait()
>   File "/usr/lib64/python2.4/subprocess.py", line 542, in __init__
>     errread, errwrite)
>   File "/usr/lib64/python2.4/subprocess.py", line 902, in _execute_child
>     self.pid = os.fork()
> OSError: [Errno 12] Cannot allocate memory
> 2009-06-11 06:50:02,859 Configuring logger for client level
> FAIL kvm_runtest_2.[RHEL-Server-5.3-64][None][1024][1][qcow2]no_boundary.local_stg.RHEL.5.3-server-64.no_ksm.boot_vms.e1000.user.size_1024 kvm_runtest_2.[RHEL-Server-5.3-64][None][1024][1][qcow2]no_boundary.local_stg.RHEL.5.3-server-64.no_ksm.boot_vms.e1000.user.size_1024 timestamp=1244717402 localtime=Jun 11 06:50:02 Unhandled OSError: [Errno 12] Cannot allocate memory
> Traceback (most recent call last):
>   File "/kvm-autotest/client/common_lib/test.py", line 304, in _exec
>     self.execute(*p_args, **p_dargs)
>   File "/kvm-autotest/client/common_lib/test.py", line 187, in execute
>     self.run_once(*args, **dargs)
>   File "/kvm-autotest/client/tests/kvm_runtest_2/kvm_runtest_2.py", line 145, in run_once
>     routine_obj.routine(self, params, env)
>   File "/kvm-autotest/client/tests/kvm_runtest_2/kvm_tests.py", line 3071, in run_boot_vms
>     curr_vm_session = kvm_utils.wait_for(curr_vm.ssh_login, 240, 0, 2)
>   File "/kvm-autotest/client/tests/kvm_runtest_2/kvm_utils.py", line 797, in wait_for
>     output = func()
>   File "/kvm-autotest/client/tests/kvm_runtest_2/kvm_vm.py", line 728, in ssh_login
>     session = kvm_utils.ssh(address, port, username, password, prompt, timeout)
>   File "/kvm-autotest/client/tests/kvm_runtest_2/kvm_utils.py", line 553, in ssh
>     return remote_login(command, password, prompt, "\n", timeout)
>   File "/kvm-autotest/client/tests/kvm_runtest_2/kvm_utils.py", line 431, in remote_login
>     sub = kvm_spawn(command, linesep)
>   File "/kvm-autotest/client/tests/kvm_runtest_2/kvm_utils.py", line 114, in __init__
>     (pid, fd) = pty.fork()
>   File "/usr/lib64/python2.4/pty.py", line 108, in fork
>     pid = os.fork()
> OSError: [Errno 12] Cannot allocate memory
> Persistent state variable __group_level now set to 1
> END FAIL kvm_runtest_2.[RHEL-Server-5.3-64][None][1024][1][qcow2]no_boundary.local_stg.RHEL.5.3-server-64.no_ksm.boot_vms.e1000.user.size_1024 kvm_runtest_2.[RHEL-Server-5.3-64][None][1024][1][qcow2]no_boundary.local_stg.RHEL.5.3-server-64.no_ksm.boot_vms.e1000.user.size_1024 timestamp=1244717403 localtime=Jun 11 06:50:03
> Dropping caches
> 2009-06-11 06:50:03,409 running: sync
> JOB ERROR: Unhandled OSError: [Errno 12] Cannot allocate memory
> Traceback (most recent call last):
>   File "/kvm-autotest/client/bin/job.py", line 978, in step_engine
>     execfile(self.control, global_control_vars, global_control_vars)
>   File "/kvm-autotest/client/control", line 1030, in ?
>     cfg_to_test("kvm_tests.cfg")
>   File "/kvm-autotest/client/control", line 1013, in cfg_to_test
>     current_status = job.run_test("kvm_runtest_2", params=dict, tag=tagname)
>   File "/kvm
Re: [PATCH 0/4] qemu-kvm cleanup
Glauber Costa wrote:
> This series do some more cleanups in qemu-kvm.c
>
> I decided it is better to clean it up in place a little bit before merging it to kvm-all.c
>
> it is dependant on my previous patch: move libkvm-all.c code to qemu-kvm.c

I don't see that patch. Where is it?

-- 
error compiling committee.c: too many arguments to function
Re: [PATCH 2/4] cleanup mmio coalescing functions
Glauber Costa wrote:
> remove wrappers that existed only due to qemu/libkvm separation.
> Use qemu types for function definitions.
>
> -int kvm_register_coalesced_mmio(kvm_context_t kvm, uint64_t addr, uint32_t size)
> +int kvm_coalesce_mmio_region(target_phys_addr_t addr, ram_addr_t size)
>  {
>  #ifdef KVM_CAP_COALESCED_MMIO
> +    kvm_context_t kvm = kvm_context;

While all this code is doomed, please maintain consistent indentation while it lives.

>  	struct kvm_coalesced_mmio_zone zone;
>  	int r;
> @@ -1121,9 +1122,10 @@ int kvm_register_coalesced_mmio(kvm_context_t kvm, uint64_t addr, uint32_t size)
>  	return -ENOSYS;
>  }
>
> -int kvm_unregister_coalesced_mmio(kvm_context_t kvm, uint64_t addr, uint32_t size)
> +int kvm_uncoalesce_mmio_region(target_phys_addr_t addr, ram_addr_t size)
>  {
>  #ifdef KVM_CAP_COALESCED_MMIO
> +    kvm_context_t kvm = kvm_context;

Here too.

-- 
error compiling committee.c: too many arguments to function
Re: [PATCH 0/4] qemu-kvm cleanup
Avi Kivity wrote:
> Glauber Costa wrote:
>> This series do some more cleanups in qemu-kvm.c
>>
>> I decided it is better to clean it up in place a little bit before merging it to kvm-all.c
>>
>> it is dependant on my previous patch: move libkvm-all.c code to qemu-kvm.c
>
> I don't see that patch. Where is it?

Apart from my little comment and the missing prerequisite, this series looks good.

-- 
error compiling committee.c: too many arguments to function
Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity
Avi Kivity wrote:
> Izik Eidus wrote:
>> change the dirty page tracking to work with dirty bity instead of page fault.
>> right now the dirty page tracking work with the help of page faults, when we want to track a page for being dirty, we write protect it and we mark it dirty when we have write page fault, this code move into looking at the dirty bit of the spte.
>
> I'm concerned about performance during the later stages of live migration. Even if only 1000 pages are dirty, you still have to look at 2,000,000 or more ptes (for an 8GB guest). That's a lot of overhead.
>
> I think we need to use the page table hierarchy, write protect the upper page table so we know which page tables we need to look at.

Great idea, so I add another bitmap for the page directory?

>> +static int vmx_dirty_bit_support(void)
>> +{
>> +    return false;
>> +}
>
> It's false only when ept is enabled.

Yea, that I found out already.
Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity
Izik Eidus wrote:
> Avi Kivity wrote:
>> Izik Eidus wrote:
>>> change the dirty page tracking to work with dirty bity instead of page fault.
>>> right now the dirty page tracking work with the help of page faults, when we want to track a page for being dirty, we write protect it and we mark it dirty when we have write page fault, this code move into looking at the dirty bit of the spte.
>>
>> I'm concerned about performance during the later stages of live migration. Even if only 1000 pages are dirty, you still have to look at 2,000,000 or more ptes (for an 8GB guest). That's a lot of overhead.
>>
>> I think we need to use the page table hierarchy, write protect the upper page table so we know which page tables we need to look at.
>
> Great idea, so i add another bitmap for the page directory?

No, why?

You need to drop write access to the shadow root ptes. When you get a fault, restore write access to the root ptes, but drop access from the L3 ptes, and so on until you reach the L1 ptes. There you clear the dirty bits, and add the page to a list of pages that need to be checked for dirty bits.

This way you only check ptes that have a chance to be dirty.

I'm not sure that will be faster, but there's a good chance.

-- 
error compiling committee.c: too many arguments to function
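The "2,000,000 or more ptes" figure quoted in this thread can be checked with a little arithmetic. The sketch below assumes 4 KiB pages and 512-entry page tables (the usual x86-64 layout), and it reads "list of pages" as a list of leaf page-table pages, which is an interpretation rather than something the thread states; the helper names are made up.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_4K  12	/* assumed: 4 KiB guest pages */
#define PTES_PER_TABLE 512	/* assumed: 8-byte ptes in a 4 KiB table */

/* Number of leaf ptes needed to map the whole guest. */
static uint64_t ptes_for_guest(uint64_t guest_bytes)
{
	return guest_bytes >> PAGE_SHIFT_4K;
}

/* Upper bound on ptes scanned if only the listed page tables are visited. */
static uint64_t ptes_scanned_upper_bound(uint64_t listed_tables)
{
	return listed_tables * PTES_PER_TABLE;
}
```

For an 8 GiB guest `ptes_for_guest(8ULL << 30)` gives 2,097,152 ptes for a full scan. If 1000 dirty pages each fell into a different page table, visiting only those tables would touch at most 512,000 ptes, and fewer when the dirty pages cluster.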
pcidevice: failed to assign irq / hang on Intel nic boot message
Hello,

I'm trying to set up a virtual machine with my onboard nic passed through. Unfortunately I get the message:

Failed to assign irq for 2:00.0: Input/output error
Perhaps you are assigning a device that shares an IRQ with another device?

I'm using a clean Ubuntu 9.04 installation (64 bits version) which comes with kernel `2.6.28-11-generic'. The kvm version is 84 and I'm using an AMD cpu.

The relevant section from `dmesg' shows that IRQ 18 is used, but that IRQ 2300 is also assigned by MSI/MSI-X. IRQ 18 is also used for onboard usb devices and the graphics card. I need these, so I can't shut them down unfortunately.

# relevant snippet from dmesg after booting the machine:
[4.123477] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
[4.123498] r8169 :02:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[4.123514] r8169 :02:00.0: setting latency timer to 64
[4.123593] r8169 :02:00.0: irq 2300 for MSI/MSI-X

The procedure I followed was:

# unbind the onboard nic
cd /sys/bus/pci/devices/:02:00.0/driver
echo -n :02:00.0 > unbind

# try to run kvm
kvm -m 512 -hda /dev/vg/vm1 -cdrom ubuntu-9.04-server-amd64.iso -pcidevice host=2:00.0 -boot d
Failed to assign irq for 2:00.0: Input/output error
Perhaps you are assigning a device that shares an IRQ with another device?

# The following appeared in dmesg:
[ 1220.744178] r8169 :02:00.0: PCI INT A disabled
[ 1242.066374] pci :02:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[ 1242.02] pci :02:00.0: Invalid ROM contents
[ 1242.761820] kvm: 5930: cpu0 unhandled wrmsr: 0xc0010117 data 0
[ 1243.389097] pci :02:00.0: PCI INT A disabled

Does this mean that for my setup it isn't possible to use the onboard nic as a pcidevice for kvm? Anything I missed, or suggestions to try?

Another thing I tried was using another nic (an intel pro/100) as pcidevice. This seemed to work, I didn't get any complaints (that nic didn't share any irq's with other devices). The QEMU window appears and the virtual machine boots. But the Intel nic shows a message during booting:

Initializing Intel PRO/100 Boot Agent Version 2.0
Press Ctrl+S to enter the Setup Program..

While the progress dots (behind the word `Program' in the last sentence) appeared, QEMU didn't advance beyond this point. Pressing Ctrl+S brought me into the setup program, but disabling the boot message didn't make any difference (and while the message was disabled it was still possible to enter the setup program using Ctrl+S).

Is this a known problem? Any workaround for this?

Thanks for any help/insights,
Heiko
[GIT PULL] Merge latest qemu.git
Hi Avi, The conflicts with the networking changes just pushed to qemu.git are fairly involved to resolve, so I thought I'd try and save you the pain. Below is a pull request which merges in the latest, and does it in the way we recently discussed - each conflict is resolved in a separate merge commit. I've build tested the x86_64-softmmu target and done some light networking testing. HTH, Mark. The following changes since commit b6810dec0ea5c9e90e90404424458918972853d8: Avi Kivity (1): Regenerate bios for MADT/RSDT fixes are available in the git repository at: git://git.et.redhat.com/qemu-net.git for-avi Alex Williamson (7): virtio-net: Add version_id 7 placeholder for vnet header support virtio-net: Use a byte to store RX mode flags virtio-net: reorganize receive_filter() virtio-net: Fix MAC filter overflow handling virtio-net: MAC filter optimization virtio-net: Add new RX filter controls virtio-net: Increase filter and control limits Anthony Liguori (2): Merge branch 'net-queue' Fix build breakage when using VDE introduced by 4f1c942 Blue Swirl (11): Use hxtool to generate monitor documentation and C structures Fix generation of CONFIG_KVM Register reset functions for e1000 and rtl8139 Update irqs on reset and device load Remove unused and misnamed field and variable Fix warning microblaze-dis.c does not need to be executable Fix Sparse warning Clean up generated qemu-img-cmds.h Fix Sparse warning Use snprintf to avoid OpenBSD warning Edgar E. Iglesias (2): microblaze: Fix loading of petalogix s3adsp1800 dtb. CRIS: Remove duplicated flag defines. Gerd Hoffmann (6): qdev: kill DeviceState-name qdev: add monitor command to dump the tree. xen: net backend doesn't need linux headers. xen nic: use qemu_malloc xen nic: use XC_PAGE_SIZE instead of PAGE_SIZE. 
qdev: c99 initilaizers for bus_type_names Jan Kiszka (7): kvm: Improve upgrade notes when facing unsupported kernels net: Don't deliver to disabled interfaces in qemu_sendv_packet net: Fix and improved ordered packet delivery slirp: Avoid zombie processes after fork_exec net: Real fix for check_params users net: Improve parameter error reporting slirp: Reorder initialization Kevin Wolf (2): qemu-img: Print available options with -o ? Document changes in qemu-img interface Luiz Capitulino (5): monitor: Remove uneeded goto monitor: Remove uneeded 'return' statement monitor: Remove unused variable monitor: Introduce get_command_name() Fix defined but not used warning Mark McLoughlin (22): Revert Fix output of uninitialized strings net: fix error reporting for some net parameter checks net: factor tap_read_packet() out of tap_send() net: move the tap buffer into TAPState net: vlan clients with no fd_can_read() can always receive net: only read from tapfd when we can send net: add fd_readv() handler to qemu_new_vlan_client() args net: re-name vc-fd_read() to vc-receive() net: pass VLANClientState* as first arg to receive handlers net: add return value to packet receive handler net: return status from qemu_deliver_packet() net: split out packet queueing and flushing into separate functions net: add qemu_send_packet_async() net: make use of async packet sending API in tap client virtio-net: implement rx packet queueing Merge branch 'master' of git://git.sv.gnu.org/qemu Merge branch 'master' of git://git.sv.gnu.org/qemu Merge branch 'master' of git://git.sv.gnu.org/qemu Merge branch 'master' of git://git.sv.gnu.org/qemu Merge branch 'master' of git://git.sv.gnu.org/qemu Merge branch 'master' of git://git.sv.gnu.org/qemu Merge branch 'master' of git://git.sv.gnu.org/qemu Nathan Froyd (1): fix gdbstub support for multiple threads in usermode, v3 Paul Brook (9): Use relative path for bios Implement multiple samplers on stellaris ADC Stellaris qdev conversion Add 
--enable-debug Remove ARM NVIC initialization hack Fix elf loader range checking Record device property types Fix typo Use correct type for SPARC cpu_cc_op Stefan Weil (2): Fix spelling in comment. doc: Update information on supported network adapters. Stuart Brady (1): Use hxtool for qemu-img command list .gitignore |2 + Makefile | 22 +- Makefile.target |8 +- block/cow.c | 12 +- block/qcow.c | 18 +- block/qcow2.c| 30 ++- block/raw-posix.c|6 +- block/raw-win32.c|6 +- block/vmdk.c | 18 +-
[KVM-AUTOTEST][PATCH] Enable running test(s) multiple times (iterations)
The following patch did not make it in the merge. I've been waiting for the merge to stabilize first.
[KVM-AUTOTEST][PATCH] Enable running test(s) multiple times (iterations)
From: Supriya Kannery <supri...@in.ibm.com>

Default is to run each test once. Just add iterations = N in kvm_tests.cfg to the test(s) you want to run multiple times.

Signed-off-by: Supriya Kannery <supri...@in.ibm.com>
Cc: Michael Goldish <mgold...@redhat.com>
Signed-off-by: Uri Lublin <u...@redhat.com>
---
 client/tests/kvm/control |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/client/tests/kvm/control b/client/tests/kvm/control
index b3543ee..c030a14 100644
--- a/client/tests/kvm/control
+++ b/client/tests/kvm/control
@@ -145,8 +145,10 @@ for dict in list:
                 dependencies_satisfied = False
                 break
     if dependencies_satisfied:
+        test_iterations=int(dict.get("iterations", 1))
         current_status = job.run_test("kvm", params=dict,
-                                      tag=dict.get("shortname"))
+                                      tag=dict.get("shortname"),
+                                      iterations=test_iterations)
     else:
         current_status = False
     status_dict[dict.get("name")] = current_status
-- 
1.6.2.2
Re: [GIT PULL] Merge latest qemu.git
Mark McLoughlin wrote:
> Hi Avi,
>
> The conflicts with the networking changes just pushed to qemu.git are fairly involved to resolve, so I thought I'd try and save you the pain.
>
> Below is a pull request which merges in the latest, and does it in the way we recently discussed - each conflict is resolved in a separate merge commit.
>
> I've build tested the x86_64-softmmu target and done some light networking testing.

Pulled, thanks.

> HTH,

Very much.

-- 
error compiling committee.c: too many arguments to function
qemu-kvm broken after ./configure --disable-kvm
Building latest git with ./configure --disable-kvm breaks with errors in pcspk.c
[KVM-AUTOTEST] New test module: iperf
Hello KVM-Autotest users & developers,

I want to present a new KVM-Autotest module here: kvm_iperf. Basically it tests networking functionality, stability and performance of guest OSes. This test is cross-platform, i.e. it works on both Linux and Windows VMs. It has been under development for some time, and now I feel it is mature enough for a release.

The test depends on Python and the KVM-Autotest framework. Basically the module consists of kvm_iperf.py, and small modifications to kvm_runtest_2.py and kvm_tests.py.

You will also need to create a new misc subdirectory inside your kvm_runtest_2/. (# mkdir kvm_runtest_2/misc) And put two files there:
1. iperf -- this one must be a Linux i586 binary
2. iperf.exe -- this one must be a Win32 binary
and optionally a third file:
3. iperf64 -- this one must be a Linux x64 binary

On Linux we could compile on the fly, but I decided not to (for now), because some Linux guests do not have a compiler. Also, using the i586 binary ensures that it can be used on both i586 and x64 guests. But you can use both, if you modify kvm_tests.py accordingly.

Optional parameters:
iperf_duration -- allows specifying long test durations (default = 5 sec)
iperf_parallel_threads -- allows testing multiple network threads in parallel (default = 1 thread)

In theory we could support other UNIXes as well in future (BSD and Solaris). For this you will only need to add the respective binaries in the misc/ folder, and modify kvm_tests.py for the respective OSes.

iperf: http://sourceforge.net/projects/iperf - source for latest.
iperf for Windows: http://dast.nlanr.net/Projects/Iperf/iperf-1.7.0-win32.exe - stable binary.
http://noc.pregi.net/iperf.html

To commit this module we may need to commit iPerf binaries, which I hate to do, but I have no better idea for now.

Please review it & commit it.

-Alexey Eromenko

Attachments: kvm_iperf.py, kvm_runtest_2.py.patch, kvm_tests.cfg.sample.2009-04-26, kvm_tests.cfg.sample.2009-04-26.iperf.patch, kvm_tests.cfg.sample.2009-04-26.iperf.patched
Re: [libvirt] Re: [CentOS-devel] Latest kvm packages for CentOS 5.3
On Wed, Jun 10, 2009 at 04:50:25PM +0200, Dag Wieers wrote:
> On Wed, 10 Jun 2009, Federico Simoncelli wrote:
>> I've been working quite extensively with kvm on CentOS 5.3 lately. If you are interested in the latest rpm of kvm-kmod-2.6.30-rc8, qemu-kvm-0.10.5 and libvirt-0.6.4 you can temporary find them here:
>>
>> http://update.nethesis.it/kvm/
>>
>> I've had no problem so far using these packages. Feedback is welcome.
>
> RHEL5.4 is expected to have KVM support, so it would be nice to know in advance which version is being included with RHEL 5.4. Then we can update our own CentOS kvm kmod for testing and reporting upstream the issue(s) we still find.

That version info will become available when RHEL-5.4 beta ships in the not too distant future...

Regards,
Daniel
-- 
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
Re: [KVM PATCH v10] kvm: add support for irqfd
Going over this code again, I seem to see a minor error handling issue here:

On Wed, May 20, 2009 at 10:30:49AM -0400, Gregory Haskins wrote:
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> new file mode 100644
> index 0000000..72a282e
> --- /dev/null
> +++ b/virt/kvm/eventfd.c
> @@ -0,0 +1,228 @@
> +/*
> + * kvm eventfd support - use eventfd objects to signal various KVM events
> + *
> + * Copyright 2009 Novell.  All Rights Reserved.
> + *
> + * Author:
> + *	Gregory Haskins <ghask...@novell.com>
> + *
> + * This file is free software; you can redistribute it and/or modify
> + * it under the terms of version 2 of the GNU General Public License
> + * as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software Foundation,
> + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
> + */
> +
> +#include <linux/kvm_host.h>
> +#include <linux/workqueue.h>
> +#include <linux/syscalls.h>
> +#include <linux/wait.h>
> +#include <linux/poll.h>
> +#include <linux/file.h>
> +#include <linux/list.h>
> +
> +/*
> + * --------------------------------------------------------------------
> + * irqfd: Allows an fd to be used to inject an interrupt to the guest
> + *
> + * Credit goes to Avi Kivity for the original idea.
> + * --------------------------------------------------------------------
> + */
> +struct _irqfd {
> +	struct kvm               *kvm;
> +	int                       gsi;
> +	struct file              *file;
> +	struct list_head          list;
> +	poll_table                pt;
> +	wait_queue_head_t        *wqh;
> +	wait_queue_t              wait;
> +	struct work_struct        work;
> +};
> +
> +static void
> +irqfd_inject(struct work_struct *work)
> +{
> +	struct _irqfd *irqfd = container_of(work, struct _irqfd, work);
> +	struct kvm *kvm = irqfd->kvm;
> +
> +	mutex_lock(&kvm->lock);
> +	kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1);
> +	kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 0);
> +	mutex_unlock(&kvm->lock);
> +}
> +
> +static int
> +irqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key)
> +{
> +	struct _irqfd *irqfd = container_of(wait, struct _irqfd, wait);
> +
> +	/*
> +	 * The wake_up is called with interrupts disabled.  Therefore we need
> +	 * to defer the IRQ injection until later since we need to acquire the
> +	 * kvm->lock to do so.
> +	 */
> +	schedule_work(&irqfd->work);
> +
> +	return 0;
> +}
> +
> +static void
> +irqfd_ptable_queue_proc(struct file *file, wait_queue_head_t *wqh,
> +			poll_table *pt)
> +{
> +	struct _irqfd *irqfd = container_of(pt, struct _irqfd, pt);
> +
> +	irqfd->wqh = wqh;
> +	add_wait_queue(wqh, &irqfd->wait);
> +}
> +
> +static int
> +kvm_assign_irqfd(struct kvm *kvm, int fd, int gsi)
> +{
> +	struct _irqfd *irqfd;
> +	struct file *file = NULL;
> +	int ret;
> +
> +	irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
> +	if (!irqfd)
> +		return -ENOMEM;
> +
> +	irqfd->kvm = kvm;
> +	irqfd->gsi = gsi;
> +	INIT_LIST_HEAD(&irqfd->list);
> +	INIT_WORK(&irqfd->work, irqfd_inject);
> +
> +	/*
> +	 * Embed the file* lifetime in the irqfd.
> +	 */
> +	file = fget(fd);
> +	if (IS_ERR(file)) {
> +		ret = PTR_ERR(file);
> +		goto fail;
> +	}
> +
> +	/*
> +	 * Install our own custom wake-up handling so we are notified via
> +	 * a callback whenever someone signals the underlying eventfd
> +	 */
> +	init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
> +	init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc);
> +
> +	ret = file->f_op->poll(file, &irqfd->pt);
> +	if (ret < 0)
> +		goto fail;
> +
> +	irqfd->file = file;
> +
> +	mutex_lock(&kvm->lock);
> +	list_add_tail(&irqfd->list, &kvm->irqfds);
> +	mutex_unlock(&kvm->lock);
> +
> +	return 0;
> +
> +fail:
> +	if (irqfd->wqh)
> +		remove_wait_queue(irqfd->wqh, &irqfd->wait);

Why are these 2 lines here? Either we might get a callback even though
poll failed - and then this test without the lock is probably racy - or
we can't, and then we can replace the above with BUG_ON(irqfd->wqh).

Which is it? I think the latter ...

> +
> +	if (file && !IS_ERR(file))
> +		fput(file);
> +
> +	kfree(irqfd);
> +	return ret;
> +}
> +
> +static void
> +irqfd_release(struct _irqfd *irqfd)
> +{
> +	/*
> +	 * The ordering is important.  We must remove ourselves from the wqh
> +	 * first to ensure no more event callbacks are issued, and then flush
> +	 * any previously scheduled work prior to freeing the memory
> +	 */
> +	remove_wait_queue(irqfd->wqh, &irqfd->wait);
> +
> +
Re: [KVM PATCH v10] kvm: add support for irqfd
On Thu, Jun 11, 2009 at 04:16:47PM +0300, Michael S. Tsirkin wrote:
> > +
> > +	ret = file->f_op->poll(file, &irqfd->pt);
> > +	if (ret < 0)
> > +		goto fail;

Looking at it some more, we have:

struct file_operations {
	unsigned int (*poll) (struct file *, struct poll_table_struct *);

So the comparison above does not seem to make sense:
it seems that the return value from poll cannot be negative.

Will the callback be executed if someone did a write to the eventfd
before we attached it? If not, maybe we should call it here
if ret != 0.

> > +
> > +	irqfd->file = file;
> > +
> > +	mutex_lock(&kvm->lock);
> > +	list_add_tail(&irqfd->list, &kvm->irqfds);
> > +	mutex_unlock(&kvm->lock);
> > +
> > +	return 0;
> > +
> > +fail:
> > +	if (irqfd->wqh)
> > +		remove_wait_queue(irqfd->wqh, &irqfd->wait);
>
> Why are these 2 lines here? Either we might get a callback even though
> poll failed - and then this test without the lock is probably racy - or
> we can't, and then we can replace the above with BUG_ON(irqfd->wqh).
>
> Which is it? I think the latter ...
>
> > +
> > +	if (file && !IS_ERR(file))
> > +		fput(file);
> > +
> > +	kfree(irqfd);
> > +	return ret;
> > +}
> > +
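
The point about the poll return type can be illustrated with a small userspace sketch. It is not kernel code: `fake_poll`, `check_events`, and the `FAKE_POLL*` bits are hypothetical stand-ins for `f_op->poll` and the real event flags, chosen only to show that an event mask is tested with bit operations rather than compared against zero as if it were an errno.

```c
#include <assert.h>

/* Hypothetical event bits; stand-ins for the kernel's POLLIN/POLLERR. */
#define FAKE_POLLIN  0x0001u
#define FAKE_POLLERR 0x0008u

/*
 * Stand-in for f_op->poll(): it returns a mask of ready events as an
 * unsigned int, so the result can never be a negative errno value.
 */
unsigned int fake_poll(int already_signalled)
{
	return already_signalled ? FAKE_POLLIN : 0u;
}

/*
 * Returns 1 if the fd was already readable when we attached, 0 if not,
 * and -1 if the mask itself reports an error condition.
 */
int check_events(int already_signalled)
{
	unsigned int events = fake_poll(already_signalled);

	if (events & FAKE_POLLERR)
		return -1;
	return (events & FAKE_POLLIN) ? 1 : 0;
}
```

Under these assumptions, a nonzero mask at attach time is what would tell the caller that a write happened before the wait-queue entry was installed.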
Re: qemu-kvm broken after ./configure --disable-kvm
Beth Kon wrote:
> Building latest git with ./configure --disable-kvm breaks with errors in
> pcspk.c

With latest git, things break much earlier in case your host does not
provide linux/kvm.h because libkvm-all.h includes it unconditionally.

I would like to push this task to Glauber as he is already shuffling
around most of the involved code: Could you have a look at --disable-kvm
too while you are at it? My basic idea would be to get rid of direct
qemu-kvm.h includes so that you always obtain the required [proto]types
by including kvm.h, independent of CONFIG_KVM and already prepared for
upstream where there is no qemu-kvm.h.

Regarding the bugs I left behind in pcspk.c, I would suggest something like

diff --git a/hw/pcspk.c b/hw/pcspk.c
index 9e1b59a..5b624d1 100644
--- a/hw/pcspk.c
+++ b/hw/pcspk.c
@@ -51,10 +51,9 @@ static const char *s_spk = "pcspk";
 static PCSpkState pcspk_state;
 
 #ifdef USE_KVM_PIT
-static void kvm_get_pit_ch2(PITState *pit,
-                            struct kvm_pit_state *inkernel_state)
+static void kvm_get_pit_ch2(PITState *pit, KVMPITState *inkernel_state)
 {
-    struct kvm_pit_state pit_state;
+    KVMPITState pit_state;
 
     if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
         kvm_get_pit(kvm_context, &pit_state);
@@ -68,8 +67,7 @@ static void kvm_get_pit_ch2(PITState *pit,
     }
 }
 
-static void kvm_set_pit_ch2(PITState *pit,
-                            struct kvm_pit_state *inkernel_state)
+static void kvm_set_pit_ch2(PITState *pit, KVMPITState *inkernel_state)
 {
     if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
         inkernel_state->channels[2].mode = pit->channels[2].mode;
@@ -82,9 +80,9 @@ static void kvm_set_pit_ch2(PITState *pit,
 }
 #else
 static inline void kvm_get_pit_ch2(PITState *pit,
-                                   kvm_pit_state *inkernel_state) { }
+                                   KVMPITState *inkernel_state) { }
 static inline void kvm_set_pit_ch2(PITState *pit,
-                                   kvm_pit_state *inkernel_state) { }
+                                   KVMPITState *inkernel_state) { }
 #endif
 
 static inline void generate_samples(PCSpkState *s)
@@ -168,7 +166,7 @@ static uint32_t pcspk_ioport_read(void *opaque, uint32_t addr)
 static void pcspk_ioport_write(void *opaque, uint32_t addr, uint32_t val)
 {
-    struct kvm_pit_state inkernel_state;
+    KVMPITState inkernel_state;
     PCSpkState *s = opaque;
     const int gate = val & 1;
 
where KVMPITState is defined as

#ifdef KVM_CAP_PIT
typedef struct kvm_pit_state KVMPITState;
#else
typedef struct { } KVMPITState;
#endif

Thanks,
Jan
[patch 5/5] KVM: VMX: conditionally disable 2M pages
Disable usage of 2M pages if VMX_EPT_2MB_PAGE_BIT (bit 16) is clear
in MSR_IA32_VMX_EPT_VPID_CAP and EPT is enabled.

Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

Index: kvm/arch/x86/kvm/vmx.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx.c
+++ kvm/arch/x86/kvm/vmx.c
@@ -1393,6 +1393,9 @@ static __init int hardware_setup(void)
 	if (!cpu_has_vmx_tpr_shadow())
 		kvm_x86_ops->update_cr8_intercept = NULL;
 
+	if (enable_ept && !cpu_has_vmx_ept_2m_page())
+		kvm_disable_largepages();
+
 	return alloc_kvm_area();
 }
 
Index: kvm/include/linux/kvm_host.h
===================================================================
--- kvm.orig/include/linux/kvm_host.h
+++ kvm/include/linux/kvm_host.h
@@ -219,6 +219,7 @@ int kvm_arch_set_memory_region(struct kv
 				struct kvm_userspace_memory_region *mem,
 				struct kvm_memory_slot old,
 				int user_alloc);
+void kvm_disable_largepages(void);
 void kvm_arch_flush_shadow(struct kvm *kvm);
 gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn);
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
Index: kvm/virt/kvm/kvm_main.c
===================================================================
--- kvm.orig/virt/kvm/kvm_main.c
+++ kvm/virt/kvm/kvm_main.c
@@ -85,6 +85,8 @@ static long kvm_vcpu_ioctl(struct file *
 
 static bool kvm_rebooting;
 
+static bool largepages_disabled = false;
+
 #ifdef KVM_CAP_DEVICE_ASSIGNMENT
 static struct kvm_assigned_dev_kernel *kvm_find_assigned_dev(struct list_head *head,
 						      int assigned_dev_id)
@@ -1171,9 +1173,11 @@ int __kvm_set_memory_region(struct kvm *
 		ugfn = new.userspace_addr >> PAGE_SHIFT;
 		/*
 		 * If the gfn and userspace address are not aligned wrt each
-		 * other, disable large page support for this slot
+		 * other, or if explicitly asked to, disable large page
+		 * support for this slot
 		 */
-		if ((base_gfn ^ ugfn) & (KVM_PAGES_PER_HPAGE - 1))
+		if ((base_gfn ^ ugfn) & (KVM_PAGES_PER_HPAGE - 1) ||
+		    largepages_disabled)
 			for (i = 0; i < largepages; ++i)
 				new.lpage_info[i].write_count = 1;
 	}
@@ -1286,6 +1290,12 @@ out:
 	return r;
 }
 
+void kvm_disable_largepages(void)
+{
+	largepages_disabled = true;
+}
+EXPORT_SYMBOL_GPL(kvm_disable_largepages);
+
 int is_error_page(struct page *page)
 {
 	return page == bad_page;
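
The alignment test in `__kvm_set_memory_region` can be tried out in userspace. The sketch below is an illustration only: it assumes 4 KiB base pages and 2 MiB large pages (512 base pages per large page), assumes the operators in the patch are `>>` and `&`, and uses the hypothetical names `slot_allows_largepages`, `SKETCH_PAGE_SHIFT`, and `SKETCH_PAGES_PER_HPAGE`.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT       12   /* assumed 4 KiB base pages */
#define SKETCH_PAGES_PER_HPAGE  512  /* assumed 2 MiB / 4 KiB */

/*
 * Returns 1 if a slot may use large pages: the guest frame number and the
 * userspace frame number must sit at the same offset within a large page,
 * and large pages must not have been disabled globally.
 */
int slot_allows_largepages(uint64_t base_gfn, uint64_t userspace_addr,
			   int largepages_disabled)
{
	uint64_t ugfn = userspace_addr >> SKETCH_PAGE_SHIFT;

	if (largepages_disabled)
		return 0;
	/* XOR exposes differing low bits; the mask keeps the in-hpage offset. */
	return ((base_gfn ^ ugfn) & (SKETCH_PAGES_PER_HPAGE - 1)) == 0;
}
```

The XOR makes the check independent of where inside a large page the slot starts: only a mismatch between the two offsets rules out a 2 MiB mapping.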
[patch 3/5] KVM: MMU: add kvm_mmu_get_spte_hierarchy helper
Required by EPT misconfiguration handler.

Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

Index: kvm/arch/x86/kvm/mmu.c
===================================================================
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -3013,6 +3013,24 @@ out:
 	return r;
 }
 
+int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes[4])
+{
+	struct kvm_shadow_walk_iterator iterator;
+	int nr_sptes = 0;
+
+	spin_lock(&vcpu->kvm->mmu_lock);
+	for_each_shadow_entry(vcpu, addr, iterator) {
+		sptes[iterator.level-1] = iterator.sptep;
+		nr_sptes++;
+		if (!is_shadow_present_pte(*iterator.sptep))
+			break;
+	}
+	spin_unlock(&vcpu->kvm->mmu_lock);
+
+	return nr_sptes;
+}
+EXPORT_SYMBOL_GPL(kvm_mmu_get_spte_hierarchy);
+
 #ifdef AUDIT
 
 static const char *audit_msg;
Index: kvm/arch/x86/kvm/mmu.h
===================================================================
--- kvm.orig/arch/x86/kvm/mmu.h
+++ kvm/arch/x86/kvm/mmu.h
@@ -37,6 +37,8 @@
 #define PT32_ROOT_LEVEL 2
 #define PT32E_ROOT_LEVEL 3
 
+int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes[4]);
+
 static inline void kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu)
 {
 	if (unlikely(vcpu->kvm->arch.n_free_mmu_pages < KVM_MIN_FREE_MMU_PAGES))
Re: [patch 2/5] KVM: MMU: make for_each_shadow_entry aware of largepages
On Wed, Jun 10, 2009 at 12:21:05PM +0300, Avi Kivity wrote:
> Avi Kivity wrote:
>> Marcelo Tosatti wrote:
>>> This way there is no need to add explicit checks in every
>>> for_each_shadow_entry user.
>>>
>>> Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>
>>>
>>> Index: kvm/arch/x86/kvm/mmu.c
>>> ===================================================================
>>> --- kvm.orig/arch/x86/kvm/mmu.c
>>> +++ kvm/arch/x86/kvm/mmu.c
>>> @@ -1273,6 +1273,11 @@ static bool shadow_walk_okay(struct kvm_
>>>  {
>>>  	if (iterator->level < PT_PAGE_TABLE_LEVEL)
>>>  		return false;
>>> +
>>> +	if (iterator->level == PT_PAGE_TABLE_LEVEL)
>>> +		if (is_large_pte(*iterator->sptep))
>>> +			return false;
>>
>> s/==/>=/?
>
> Ah, it's actually fine.  But changing == to >= will make it 1GBpage-ready.

Humpf, better to check the level explicitly before interpreting bit 7, so
let's skip this for 1GB pages.
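
The idea of checking the level before interpreting bit 7 can be sketched as below. This is a generic userspace illustration, not the `shadow_walk_okay()` logic itself: `entry_maps_large_page`, `SKETCH_PS_BIT`, and the `SKETCH_LEVEL_*` constants are hypothetical names, and the function deliberately ignores 1GB mappings, in line with the decision to skip them for now.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PS_BIT    7  /* page-size bit in a directory-level entry */
#define SKETCH_LEVEL_4K  1  /* page-table level */
#define SKETCH_LEVEL_2M  2  /* page-directory level */

/*
 * Interpret bit 7 as "maps a large page" only for an entry read at the
 * 2M (directory) level. At the 4K level the same bit position has a
 * different meaning, and higher levels are intentionally not handled.
 */
int entry_maps_large_page(uint64_t entry, int entry_level)
{
	if (entry_level != SKETCH_LEVEL_2M)
		return 0;
	return (int)((entry >> SKETCH_PS_BIT) & 1);
}
```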
[patch 4/5] KVM: VMX: EPT misconfiguration handler
Handler for EPT misconfiguration which checks for valid state
in the shadow pagetables, printing the spte on each level.

The separate WARN_ONs are useful for kerneloops.org.

Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

Index: kvm/arch/x86/kvm/vmx.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx.c
+++ kvm/arch/x86/kvm/vmx.c
@@ -3233,6 +3233,90 @@ static int handle_ept_violation(struct k
 	return kvm_mmu_page_fault(vcpu, gpa & PAGE_MASK, 0);
 }
 
+static u64 ept_rsvd_mask(u64 *sptep, int level)
+{
+	int i;
+	u64 mask = 0;
+
+	for (i = 51; i > boot_cpu_data.x86_phys_bits; i--)
+		mask |= (1ULL << i);
+
+	if (level > 2)
+		/* bits 7:3 reserved */
+		mask |= 0xf8;
+	else if (level == 2) {
+		if (*sptep & (1ULL << 7))
+			/* 2MB ref, bits 20:12 reserved */
+			mask |= 0x1ff000;
+		else
+			/* bits 6:3 reserved */
+			mask |= 0x78;
+	}
+
+	return mask;
+}
+
+static void ept_misconfig_inspect_spte(struct kvm_vcpu *vcpu, u64 *sptep,
+				       int level)
+{
+	printk(KERN_ERR "%s: sptep %p spte 0x%llx level %d\n",
+	       __func__, sptep, *sptep, level);
+
+	/* 010b (write-only) */
+	WARN_ON((*sptep & 0x7) == 0x2);
+
+	/* 110b (write/execute) */
+	WARN_ON((*sptep & 0x7) == 0x6);
+
+	/* 100b (execute-only) and value not supported by logical processor */
+	if (!cpu_has_vmx_ept_execute_only())
+		WARN_ON((*sptep & 0x7) == 0x4);
+
+	/* not 000b */
+	if ((*sptep & 0x7)) {
+		u64 rsvd_bits = *sptep & ept_rsvd_mask(sptep, level);
+
+		if (rsvd_bits != 0) {
+			printk(KERN_ERR "%s: rsvd_bits = 0x%llx\n",
+			       __func__, rsvd_bits);
+			WARN_ON(1);
+		}
+
+		if (level == 1 || (level == 2 && (*sptep & (1ULL << 7)))) {
+			u64 ept_mem_type = (*sptep & 0x38) >> 3;
+
+			if (ept_mem_type == 2 || ept_mem_type == 3 ||
+			    ept_mem_type == 7) {
+				printk(KERN_ERR "%s: ept_mem_type=0x%llx\n",
+				       __func__, ept_mem_type);
+				WARN_ON(1);
+			}
+		}
+	}
+}
+
+static int handle_ept_misconfig(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+{
+	u64 *sptes[4];
+	int nr_sptes, i;
+	gpa_t gpa;
+
+	gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
+
+	printk(KERN_ERR "EPT: Misconfiguration.\n");
+	printk(KERN_ERR "EPT: GPA: 0x%llx\n", gpa);
+
+	nr_sptes = kvm_mmu_get_spte_hierarchy(vcpu, gpa, sptes);
+
+	for (i = PT64_ROOT_LEVEL; i > PT64_ROOT_LEVEL - nr_sptes; --i)
+		ept_misconfig_inspect_spte(vcpu, sptes[i-1], i);
+
+	kvm_run->exit_reason = KVM_EXIT_UNKNOWN;
+	kvm_run->hw.hardware_exit_reason = EXIT_REASON_EPT_MISCONFIG;
+
+	return 0;
+}
+
 static int handle_nmi_window(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	u32 cpu_based_vm_exec_control;
@@ -3303,8 +3387,9 @@ static int (*kvm_vmx_exit_handlers[])(st
 	[EXIT_REASON_APIC_ACCESS]             = handle_apic_access,
 	[EXIT_REASON_WBINVD]                  = handle_wbinvd,
 	[EXIT_REASON_TASK_SWITCH]             = handle_task_switch,
-	[EXIT_REASON_EPT_VIOLATION]	      = handle_ept_violation,
 	[EXIT_REASON_MCE_DURING_VMENTRY]      = handle_machine_check,
+	[EXIT_REASON_EPT_VIOLATION]	      = handle_ept_violation,
+	[EXIT_REASON_EPT_MISCONFIG]	      = handle_ept_misconfig,
 };
 
 static const int kvm_vmx_max_exit_handlers =
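
The reserved-bit computation from this patch can be exercised outside the kernel. The sketch below is a userspace approximation under stated assumptions: the operators are taken to be `>`, `<<`, `&`, and `>>` (they are missing in the archived text), the physical address width is passed in as a hypothetical `phys_bits` parameter instead of being read from `boot_cpu_data`, and the names `ept_rsvd_mask_sketch` and `ept_mem_type_is_reserved` are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace approximation of the patch's ept_rsvd_mask(): returns the
 * bits that must be zero in an EPT entry at the given level.
 */
uint64_t ept_rsvd_mask_sketch(uint64_t spte, int level, int phys_bits)
{
	uint64_t mask = 0;
	int i;

	/* High address bits, using the same loop bound as the patch. */
	for (i = 51; i > phys_bits; i--)
		mask |= 1ULL << i;

	if (level > 2) {
		mask |= 0xf8;             /* bits 7:3 reserved */
	} else if (level == 2) {
		if (spte & (1ULL << 7))
			mask |= 0x1ff000; /* 2MB mapping: bits 20:12 reserved */
		else
			mask |= 0x78;     /* bits 6:3 reserved */
	}
	return mask;
}

/* Memory type lives in bits 5:3; values 2, 3 and 7 are flagged by the patch. */
int ept_mem_type_is_reserved(uint64_t spte)
{
	uint64_t type = (spte & 0x38) >> 3;

	return type == 2 || type == 3 || type == 7;
}
```

With `phys_bits` set to 40, the high part of the mask covers bits 51 down to 41, and the low part depends only on the level and on bit 7 of the entry.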
[patch 1/5] KVM: VMX: more MSR_IA32_VMX_EPT_VPID_CAP capability bits
Required for EPT misconfiguration handler.

Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

Index: kvm/arch/x86/include/asm/vmx.h
===================================================================
--- kvm.orig/arch/x86/include/asm/vmx.h
+++ kvm/arch/x86/include/asm/vmx.h
@@ -352,9 +352,16 @@ enum vmcs_field {
 #define VMX_EPT_EXTENT_INDIVIDUAL_ADDR		0
 #define VMX_EPT_EXTENT_CONTEXT			1
 #define VMX_EPT_EXTENT_GLOBAL			2
+
+#define VMX_EPT_EXECUTE_ONLY_BIT		(1ull)
+#define VMX_EPT_PAGE_WALK_4_BIT			(1ull << 6)
+#define VMX_EPTP_UC_BIT				(1ull << 8)
+#define VMX_EPTP_WB_BIT				(1ull << 14)
+#define VMX_EPT_2MB_PAGE_BIT			(1ull << 16)
 #define VMX_EPT_EXTENT_INDIVIDUAL_BIT		(1ull << 24)
 #define VMX_EPT_EXTENT_CONTEXT_BIT		(1ull << 25)
 #define VMX_EPT_EXTENT_GLOBAL_BIT		(1ull << 26)
+
 #define VMX_EPT_DEFAULT_GAW			3
 #define VMX_EPT_MAX_GAW				0x4
 #define VMX_EPT_MT_EPTE_SHIFT			3
Index: kvm/arch/x86/kvm/vmx.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx.c
+++ kvm/arch/x86/kvm/vmx.c
@@ -270,6 +270,26 @@ static inline bool cpu_has_vmx_flexprior
 		cpu_has_vmx_virtualize_apic_accesses();
 }
 
+static inline bool cpu_has_vmx_ept_execute_only(void)
+{
+	return !!(vmx_capability.ept & VMX_EPT_EXECUTE_ONLY_BIT);
+}
+
+static inline bool cpu_has_vmx_eptp_uncacheable(void)
+{
+	return !!(vmx_capability.ept & VMX_EPTP_UC_BIT);
+}
+
+static inline bool cpu_has_vmx_eptp_writeback(void)
+{
+	return !!(vmx_capability.ept & VMX_EPTP_WB_BIT);
+}
+
+static inline bool cpu_has_vmx_ept_2m_page(void)
+{
+	return !!(vmx_capability.ept & VMX_EPT_2MB_PAGE_BIT);
+}
+
 static inline int cpu_has_vmx_invept_individual_addr(void)
 {
 	return !!(vmx_capability.ept & VMX_EPT_EXTENT_INDIVIDUAL_BIT);
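
The capability helpers in this patch all follow one pattern: mask the capability word with a single-bit constant and normalise the result to 0 or 1. A minimal userspace sketch, assuming the shifts in the patch are `<<` and using hypothetical `SKETCH_*` names and a made-up `cap_has` helper rather than the kernel's `vmx_capability` structure:

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as they appear in the patch (shift operators assumed). */
#define SKETCH_EPT_EXECUTE_ONLY_BIT  (1ULL)
#define SKETCH_EPT_PAGE_WALK_4_BIT   (1ULL << 6)
#define SKETCH_EPT_2MB_PAGE_BIT      (1ULL << 16)

/* Returns 1 if the capability bit is set in ept_cap, 0 otherwise. */
int cap_has(uint64_t ept_cap, uint64_t bit)
{
	/* !! collapses any nonzero masked value to exactly 1. */
	return !!(ept_cap & bit);
}
```

The double negation matters when the helper returns a narrower type than the mask: without it, a set bit above the width of the return type could be truncated away.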
[patch 2/5] KVM: MMU: make for_each_shadow_entry aware of largepages
This way there is no need to add explicit checks in every
for_each_shadow_entry user.

Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

Index: kvm/arch/x86/kvm/mmu.c
===================================================================
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -1273,6 +1273,11 @@ static bool shadow_walk_okay(struct kvm_
 {
 	if (iterator->level < PT_PAGE_TABLE_LEVEL)
 		return false;
+
+	if (iterator->level == PT_PAGE_TABLE_LEVEL)
+		if (is_large_pte(*iterator->sptep))
+			return false;
+
 	iterator->index = SHADOW_PT_INDEX(iterator->addr, iterator->level);
 	iterator->sptep	= ((u64 *)__va(iterator->shadow_addr)) + iterator->index;
 	return true;
[patch 0/5] VMX EPT misconfiguration handler v2
Addressing comments.
Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity
On Thu, Jun 11, 2009 at 02:27:46PM +0300, Izik Eidus wrote:
> Marcelo Tosatti wrote:
>> What I'm saying is that with shadow and NPT (I believe) you can mark a
>> spte writable but not dirty, which gives you the ability to know whether
>> certain pages have been dirtied.
>
> Isn't this what this patch is doing?

Yes, I was confused for some reason I don't remember.

So making the dirty bit available to the host is a good idea, but we would
have to check things like faults on out-of-sync pagetables (where the
guest dirty bit might be cleared in parallel; maybe it's OK, but I'm not
sure), verify that the transfer of the dirty bit when zapping is
consistent everywhere, etc.

So it would be nicer to introduce an optimization to the way dirty bit
info is acquired, and then use that to optimize kvm's dirty log ioctl.

The link with KSM was that you can consult this dirty info, which is
fast, to know if the content of pages has changed. But it may be useless,
I don't know.
[ kvm-Bugs-2353510 ] Fedora 10 and F11 failures
Bugs item #2353510, was opened at 2008-11-27 14:46
Message generated for change (Settings changed) made by technologov
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2353510&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 9
Private: No
Submitted By: Technologov (technologov)
Assigned to: Nobody/Anonymous (nobody)
Summary: Fedora 10 and F11 failures

Initial Comment:
Description:
Fedora 10 fails to install on KVM (KVM-79).

The DVD version gets stuck near the end of setup, when trying to install
the GRUB bootloader onto the HDD. It did not proceed within one hour, which
indicates a stuck VM. Sometimes it gets stuck earlier, during init or
during early setup.

The Live CD (32-bit) started fine on both Intel and AMD (except for a minor
rendering bug in the top menu).

Guest(s): Fedora 10 64-bit
Guest(s): Fedora 10 32-bit

Host(s): Fedora 7 64-bit, Intel, KVM-79
Host(s): Fedora 7 64-bit, AMD, KVM-79

Command: (for DVD)
qemu-kvm -cdrom /isos/linux/Fedora-10-x86_64-DVD.iso -m 512 -hda /vm/f10-64.qcow2 -boot d
*and* (for LiveCD)
qemu-kvm -cdrom /isos/linux/F10-i686-Live.iso -m 512

-Alexey, 27.11.2008.

----------------------------------------------------------------------

Comment By: Technologov (technologov)
Date: 2009-06-11 17:18

Message:
Not only Fedora 10, but also Fedora 11 fails in the same way.

Raising bug priority.

Guest(s): Fedora 10 64-bit DVD

Tested on KVM-86, Intel CPU.

----------------------------------------------------------------------

Comment By: Technologov (technologov)
Date: 2008-12-02 12:39

Message:
I have opened a similar bug in the Fedora 10 bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=474116

-Alexey

----------------------------------------------------------------------
Re: [patch 0/6] mmu audit update v4
Marcelo Tosatti wrote:
> Addressing comments, introducing a new helper, handling largepages.

Applied, thanks.

--
error compiling committee.c: too many arguments to function
Re: [PATCH] move libkvm-all.c code to qemu-kvm.c
Glauber Costa wrote:
> Ultimately, goal is to put it in kvm-all.c, so we can start sharing
> things. This is put here first to allow for preparation. It is almost
> a cut and paste. Only needed adaptation goes with kvm_has_sync_mmu(),
> which had a conflicting definition.

Applied, thanks.

--
error compiling committee.c: too many arguments to function
Re: [patch 3/5] KVM: MMU: add kvm_mmu_get_spte_hierarchy helper
Marcelo Tosatti wrote:
> Required by EPT misconfiguration handler.
>
> Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>
>
> Index: kvm/arch/x86/kvm/mmu.c
> ===================================================================
> --- kvm.orig/arch/x86/kvm/mmu.c
> +++ kvm/arch/x86/kvm/mmu.c
> @@ -3013,6 +3013,24 @@ out:
>  	return r;
>  }
>
> +int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes[4])
> +{
> +	struct kvm_shadow_walk_iterator iterator;
> +	int nr_sptes = 0;
> +
> +	spin_lock(&vcpu->kvm->mmu_lock);
> +	for_each_shadow_entry(vcpu, addr, iterator) {
> +		sptes[iterator.level-1] = iterator.sptep;

Returning a pointer...

> +		nr_sptes++;
> +		if (!is_shadow_present_pte(*iterator.sptep))
> +			break;
> +	}
> +	spin_unlock(&vcpu->kvm->mmu_lock);

... and unlocking the lock that protects it. True, this is called in
extreme cases, but I think you can dereference the pointer in the
function just as easily.

--
error compiling committee.c: too many arguments to function
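
The suggestion to dereference inside the function can be sketched in userspace as follows. This is a toy model and not KVM code: `toy_lock`, `toy_walk`, and `toy_get_spte_hierarchy` are hypothetical names, the "lock" is a flag that only asserts correct pairing, and bit 0 is assumed to mean "present". What it shows is the shape of the fix: the caller receives copies of the entries taken while the lock is held, rather than pointers that outlive it.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_LEVELS 4

/* Toy lock: a flag that asserts acquire/release are correctly paired. */
struct toy_lock { int held; };

void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = 0; }

/* Toy page-table walk state: entries[0] is level 1, entries[3] is level 4. */
struct toy_walk {
	struct toy_lock lock;
	uint64_t entries[TOY_LEVELS];
};

/*
 * Walk from the top level down, copying entry VALUES into out[] while the
 * lock is held. Stops after the first non-present entry (bit 0 clear).
 * Returns the number of entries copied.
 */
int toy_get_spte_hierarchy(struct toy_walk *w, uint64_t out[TOY_LEVELS])
{
	int level, nr = 0;

	toy_lock_acquire(&w->lock);
	for (level = TOY_LEVELS; level >= 1; level--) {
		out[level - 1] = w->entries[level - 1];
		nr++;
		if (!(out[level - 1] & 1))
			break;
	}
	toy_lock_release(&w->lock);

	return nr;
}
```

Because the snapshot is taken under the lock, later changes to the table do not affect what the caller inspects, which is sufficient for a diagnostic path that only prints the entries.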
[PATCH 2/4] cleanup mmio coalescing functions
Remove wrappers that existed only due to qemu/libkvm separation.
Use qemu types for function definitions.

Signed-off-by: Glauber Costa <glom...@redhat.com>
---
 qemu-kvm.c |   27 ++++-----------------------
 qemu-kvm.h |    5 -----
 2 files changed, 4 insertions(+), 28 deletions(-)

diff --git a/qemu-kvm.c b/qemu-kvm.c
index 2c2d46f..7b25d9e 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -1099,9 +1099,10 @@ int kvm_init_coalesced_mmio(kvm_context_t kvm)
 	return r;
 }
 
-int kvm_register_coalesced_mmio(kvm_context_t kvm, uint64_t addr, uint32_t size)
+int kvm_coalesce_mmio_region(target_phys_addr_t addr, ram_addr_t size)
 {
 #ifdef KVM_CAP_COALESCED_MMIO
+	kvm_context_t kvm = kvm_context;
 	struct kvm_coalesced_mmio_zone zone;
 	int r;
 
@@ -1121,9 +1122,10 @@ int kvm_register_coalesced_mmio(kvm_context_t kvm, uint64_t addr, uint32_t size)
 	return -ENOSYS;
 }
 
-int kvm_unregister_coalesced_mmio(kvm_context_t kvm, uint64_t addr, uint32_t size)
+int kvm_uncoalesce_mmio_region(target_phys_addr_t addr, ram_addr_t size)
 {
 #ifdef KVM_CAP_COALESCED_MMIO
+	kvm_context_t kvm = kvm_context;
 	struct kvm_coalesced_mmio_zone zone;
 	int r;
 
@@ -2773,27 +2775,6 @@ void kvm_mutex_lock(void)
     cpu_single_env = NULL;
 }
 
-int qemu_kvm_register_coalesced_mmio(target_phys_addr_t addr, unsigned int size)
-{
-    return kvm_register_coalesced_mmio(kvm_context, addr, size);
-}
-
-int qemu_kvm_unregister_coalesced_mmio(target_phys_addr_t addr,
-				       unsigned int size)
-{
-    return kvm_unregister_coalesced_mmio(kvm_context, addr, size);
-}
-
-int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size)
-{
-    return kvm_register_coalesced_mmio(kvm_context, start, size);
-}
-
-int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size)
-{
-    return kvm_unregister_coalesced_mmio(kvm_context, start, size);
-}
-
 #ifdef USE_KVM_DEVICE_ASSIGNMENT
 void kvm_add_ioperm_data(struct ioperm_data *data)
 {
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 0dfbcd1..4db1763 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -111,11 +111,6 @@ void kvm_tpr_access_report(CPUState *env, uint64_t rip, int is_write);
 void kvm_tpr_vcpu_start(CPUState *env);
 
 int qemu_kvm_get_dirty_pages(unsigned long phys_addr, void *buf);
-int qemu_kvm_register_coalesced_mmio(target_phys_addr_t addr,
-				     unsigned int size);
-int qemu_kvm_unregister_coalesced_mmio(target_phys_addr_t addr,
-				       unsigned int size);
-
 int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
 int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
 
--
1.5.6.6
[PATCH 3/4] remove callbacks structure
The purpose of that was only to allow the user of libkvm to register functions pointers that corresponded to possible actions. We don't need that anymore. Signed-off-by: Glauber Costa glom...@redhat.com --- libkvm-all.h |4 +- qemu-kvm.c | 380 +++--- 2 files changed, 175 insertions(+), 209 deletions(-) diff --git a/libkvm-all.h b/libkvm-all.h index 4f7b9a3..be8c855 100644 --- a/libkvm-all.h +++ b/libkvm-all.h @@ -177,12 +177,10 @@ struct kvm_callbacks { * holds information about the KVM instance that gets created by this call.\n * This should always be your first call to KVM. * - * \param callbacks Pointer to a valid kvm_callbacks structure * \param opaque Not used * \return NULL on failure */ -kvm_context_t kvm_init(struct kvm_callbacks *callbacks, - void *opaque); +kvm_context_t kvm_init(void *opaque); /*! * \brief Cleanup the KVM context diff --git a/qemu-kvm.c b/qemu-kvm.c index 7b25d9e..7a0fb83 100644 --- a/qemu-kvm.c +++ b/qemu-kvm.c @@ -10,6 +10,7 @@ #include assert.h #include string.h +#include signal.h #include hw/hw.h #include sysemu.h #include qemu-common.h @@ -192,6 +193,156 @@ int kvm_is_containing_region(kvm_context_t kvm, unsigned long phys_addr, unsigne return 1; } +#ifdef KVM_CAP_SET_GUEST_DEBUG +static int kvm_debug(void *opaque, void *data, + struct kvm_debug_exit_arch *arch_info) +{ +int handle = kvm_arch_debug(arch_info); +CPUState *env = data; + +if (handle) { + kvm_debug_cpu_requested = env; + env-kvm_cpu_state.stopped = 1; +} +return handle; +} +#endif + +static int kvm_inb(void *opaque, uint16_t addr, uint8_t *data) +{ +*data = cpu_inb(0, addr); +return 0; +} + +static int kvm_inw(void *opaque, uint16_t addr, uint16_t *data) +{ +*data = cpu_inw(0, addr); +return 0; +} + +static int kvm_inl(void *opaque, uint16_t addr, uint32_t *data) +{ +*data = cpu_inl(0, addr); +return 0; +} + +#define PM_IO_BASE 0xb000 + +static int kvm_outb(void *opaque, uint16_t addr, uint8_t data) +{ +if (addr == 0xb2) { + switch (data) { + case 0: { + cpu_outb(0, 
0xb3, 0); + break; + } + case 0xf0: { + unsigned x; + + /* enable acpi */ + x = cpu_inw(0, PM_IO_BASE + 4); + x = ~1; + cpu_outw(0, PM_IO_BASE + 4, x); + break; + } + case 0xf1: { + unsigned x; + + /* enable acpi */ + x = cpu_inw(0, PM_IO_BASE + 4); + x |= 1; + cpu_outw(0, PM_IO_BASE + 4, x); + break; + } + default: + break; + } + return 0; +} +cpu_outb(0, addr, data); +return 0; +} + +static int kvm_outw(void *opaque, uint16_t addr, uint16_t data) +{ +cpu_outw(0, addr, data); +return 0; +} + +static int kvm_outl(void *opaque, uint16_t addr, uint32_t data) +{ +cpu_outl(0, addr, data); +return 0; +} + +static int kvm_mmio_read(void *opaque, uint64_t addr, uint8_t *data, int len) +{ + cpu_physical_memory_rw(addr, data, len, 0); + return 0; +} + +static int kvm_mmio_write(void *opaque, uint64_t addr, uint8_t *data, int len) +{ + cpu_physical_memory_rw(addr, data, len, 1); + return 0; +} + +static int kvm_io_window(void *opaque) +{ +return 1; +} + +static int kvm_halt(void *opaque, kvm_vcpu_context_t vcpu) +{ +return kvm_arch_halt(opaque, vcpu); +} + +static int kvm_shutdown(void *opaque, void *data) +{ +CPUState *env = (CPUState *)data; + +/* stop the current vcpu from going back to guest mode */ +env-kvm_cpu_state.stopped = 1; + +qemu_system_reset_request(); +return 1; +} + +static int handle_unhandled(kvm_context_t kvm, kvm_vcpu_context_t vcpu, +uint64_t reason) +{ +fprintf(stderr, kvm: unhandled exit %PRIx64\n, reason); +return -EINVAL; +} + + +static int kvm_try_push_interrupts(void *opaque) +{ +return kvm_arch_try_push_interrupts(opaque); +} + +static void kvm_post_run(void *opaque, void *data) +{ +CPUState *env = (CPUState *)data; + +pthread_mutex_lock(qemu_mutex); +kvm_arch_post_kvm_run(opaque, env); +} + +static int kvm_pre_run(void *opaque, void *data) +{ +CPUState *env = (CPUState *)data; + +kvm_arch_pre_kvm_run(opaque, env); + +if (env-exit_request) + return 1; +pthread_mutex_unlock(qemu_mutex); +return 0; +} + + + /* * dirty pages logging control */ @@ 
-314,8 +465,7 @@ int kvm_dirty_pages_log_reset(kvm_context_t kvm) } -kvm_context_t kvm_init(struct kvm_callbacks *callbacks, - void *opaque) +kvm_context_t kvm_init(void *opaque) { int fd; kvm_context_t kvm; @@ -351,7 +501,6 @@ kvm_context_t kvm_init(struct kvm_callbacks *callbacks, memset(kvm, 0, sizeof(*kvm)); kvm-fd = fd; kvm-vm_fd = -1; - kvm-callbacks
Re: [PATCH 0/4] qemu-kvm cleanup
Glauber Costa wrote:
> Same series as before, but with avi's little comment addressed.

Applied, thanks.

--
error compiling committee.c: too many arguments to function
[PATCH 1/5] BIOS changes for configuring irq0-inti2 override (v4)
These patches resolve the irq0-inti2 override issue, and get the hpet working on kvm. Override and HPET changes are sent as a series because HPET depends on the override. Win2k8 expects the HPET interrupt on inti2, regardless of whether an override exists in the BIOS. And the HPET spec states that in legacy mode, timer interrupt is on inti2. The irq0-inti2 override will always be used unless the kernel cannot do irq routing (i.e., compatibility with old kernels). So if the kernel is capable, userspace sets up irq0-inti2 via the irq routing interface, and adds the irq0-inti2 override to the MADT interrupt source override table, and the mp table (for the no-acpi case). Changes from v3: - changes based on comments from Avi and Gleb. - corrected legacy enable/disable for in-kernel PIT. The code now best approximates a multiplexer that disables PIT interrupts when HPET is in legacy mode (as described by HPET spec). Any changes to the PIT that may occur while HPET is operating in legacy mode are saved, so if HPET leaves legacy mode, the PIT is just reenabled, with mode set to whatever the last setting from guest was. Legacy mode is disabled at least during crash and shutdown (in Linux), so this needs to be handled properly. 
--- kvm/bios/rombios32.c | 60 - 1 files changed, 44 insertions(+), 16 deletions(-) diff --git a/kvm/bios/rombios32.c b/kvm/bios/rombios32.c index 369cbef..9d6910e 100755 --- a/kvm/bios/rombios32.c +++ b/kvm/bios/rombios32.c @@ -444,6 +444,9 @@ uint32_t cpuid_features; uint32_t cpuid_ext_features; unsigned long ram_size; uint64_t ram_end; +#ifdef BX_QEMU +uint8_t irq0_override; +#endif #ifdef BX_USE_EBDA_TABLES unsigned long ebda_cur_addr; #endif @@ -485,6 +488,7 @@ void wrmsr_smp(uint32_t index, uint64_t val) #define QEMU_CFG_ARCH_LOCAL 0x8000 #define QEMU_CFG_ACPI_TABLES (QEMU_CFG_ARCH_LOCAL + 0) #define QEMU_CFG_SMBIOS_ENTRIES (QEMU_CFG_ARCH_LOCAL + 1) +#define QEMU_CFG_IRQ0_OVERRIDE (QEMU_CFG_ARCH_LOCAL + 2) int qemu_cfg_port; @@ -553,6 +557,17 @@ uint64_t qemu_cfg_get64 (void) } #endif +#ifdef BX_QEMU +void irq0_override_probe(void) +{ +if(qemu_cfg_port) { +qemu_cfg_select(QEMU_CFG_IRQ0_OVERRIDE); +qemu_cfg_read(irq0_override, 1); +return; +} +} +#endif + void cpu_probe(void) { uint32_t eax, ebx, ecx, edx; @@ -1195,6 +1210,13 @@ static void mptable_init(void) /* irqs */ for(i = 0; i 16; i++) { +#ifdef BX_QEMU +/* One entry per ioapic interrupt destination. Destination 2 is covered + * by irq0-inti2 override (i == 0). 
Source IRQ 2 is unused + */ +if (irq0_override i == 2) +continue; +#endif putb(q, 3); /* entry type = I/O interrupt */ putb(q, 0); /* interrupt type = vectored interrupt */ putb(q, 0); /* flags: po=0, el=0 */ @@ -1202,7 +1224,12 @@ static void mptable_init(void) putb(q, 0); /* source bus ID = ISA */ putb(q, i); /* source bus IRQ */ putb(q, ioapic_id); /* dest I/O APIC ID */ -putb(q, i); /* dest I/O APIC interrupt in */ +#ifdef BX_QEMU +if (irq0_override i == 0) +putb(q, 2); /* dest I/O APIC interrupt in */ +else +#endif +putb(q, i); /* dest I/O APIC interrupt in */ } /* patch length */ len = q - mp_config_table; @@ -1758,23 +1785,21 @@ void acpi_bios_init(void) io_apic-io_apic_id = smp_cpus; io_apic-address = cpu_to_le32(0xfec0); io_apic-interrupt = cpu_to_le32(0); -#ifdef BX_QEMU -#ifdef HPET_WORKS_IN_KVM io_apic++; - -int_override = (void *)io_apic; -int_override-type = APIC_XRUPT_OVERRIDE; -int_override-length = sizeof(*int_override); -int_override-bus = cpu_to_le32(0); -int_override-source = cpu_to_le32(0); -int_override-gsi = cpu_to_le32(2); -int_override-flags = cpu_to_le32(0); -#endif +int_override = (struct madt_int_override*)(io_apic); +#ifdef BX_QEMU +if (irq0_override) { +memset(int_override, 0, sizeof(*int_override)); +int_override-type = APIC_XRUPT_OVERRIDE; +int_override-length = sizeof(*int_override); +int_override-source = 0; +int_override-gsi = 2; +int_override-flags = 0; /* conforms to bus specifications */ +int_override++; +} #endif - -int_override = (struct madt_int_override*)(io_apic + 1); -for ( i = 0; i 16; i++ ) { -if ( PCI_ISA_IRQ_MASK (1U i) ) { +for (i = 0; i 16; i++) { +if (PCI_ISA_IRQ_MASK (1U i)) { memset(int_override, 0, sizeof(*int_override)); int_override-type = APIC_XRUPT_OVERRIDE; int_override-length = sizeof(*int_override); @@ -2697,6 +2722,9 @@ void rombios32_init(uint32_t *s3_resume_vector, uint8_t *shutdown_flag)
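
For readers following the MADT part of this patch, here is a minimal sketch of what an irq0 to GSI 2 interrupt source override entry could look like. It assumes the ACPI MADT layout for this entry type (type 2, 10 bytes) and a little-endian host, so it skips the endianness conversion. The struct and function names are illustrative only; rombios32.c uses its own `struct madt_int_override`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative layout of a MADT "Interrupt Source Override" entry.
 * All names here are hypothetical, not taken from rombios32.c. */
struct example_int_override {
    uint8_t  type;    /* 2 = interrupt source override */
    uint8_t  length;  /* size of this entry in bytes */
    uint8_t  bus;     /* 0 = ISA */
    uint8_t  source;  /* ISA IRQ number */
    uint32_t gsi;     /* global system interrupt the IRQ is wired to */
    uint16_t flags;   /* 0 = conforms to bus specifications */
} __attribute__((__packed__));

/* Fill one entry and return a pointer just past it, so further entries
 * can be appended (the patch does the same with int_override++). */
static struct example_int_override *
example_add_override(struct example_int_override *e, uint8_t source,
                     uint32_t gsi)
{
    memset(e, 0, sizeof(*e));
    e->type = 2;
    e->length = sizeof(*e);
    e->source = source;
    e->gsi = gsi;
    return e + 1;
}
```

Calling `example_add_override(entry, 0, 2)` would describe the irq0-inti2 override discussed above; the PCI/ISA overrides that follow in the patch would be appended at the returned pointer.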
[PATCH 3/5] BIOS changes for KVM HPET (v5)
Signed-off-by: Beth Kon e...@us.ibm.com
---
 kvm/bios/acpi-dsdt.dsl |    2 --
 kvm/bios/rombios32.c   |   11 +++--------
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/kvm/bios/acpi-dsdt.dsl b/kvm/bios/acpi-dsdt.dsl
index db57307..71d0a5e 100755
--- a/kvm/bios/acpi-dsdt.dsl
+++ b/kvm/bios/acpi-dsdt.dsl
@@ -296,7 +296,6 @@ DefinitionBlock (
             })
         }
 #ifdef BX_QEMU
-#ifdef HPET_WORKS_IN_KVM
         Device(HPET) {
             Name(_HID, EISAID("PNP0103"))
             Name(_UID, 0)
@@ -316,7 +315,6 @@ DefinitionBlock (
             })
         }
 #endif
-#endif
     }
 
     Scope(\_SB.PCI0) {
diff --git a/kvm/bios/rombios32.c b/kvm/bios/rombios32.c
index 9d6910e..1106f38 100755
--- a/kvm/bios/rombios32.c
+++ b/kvm/bios/rombios32.c
@@ -1518,8 +1518,8 @@ struct acpi_20_generic_address {
 } __attribute__((__packed__));
 
 /*
- * * HPET Description Table
- * */
+ * HPET Description Table
+ */
 struct acpi_20_hpet {
     ACPI_TABLE_HEADER_DEF /* ACPI common table header */
     uint32_t timer_block_id;
@@ -1703,13 +1703,11 @@ void acpi_bios_init(void)
     addr += madt_size;
 
 #ifdef BX_QEMU
-#ifdef HPET_WORKS_IN_KVM
     addr = (addr + 7) & ~7;
     hpet_addr = addr;
     hpet = (void *)(addr);
     addr += sizeof(*hpet);
 #endif
-#endif
 
     /* RSDP */
     memset(rsdp, 0, sizeof(*rsdp));
@@ -1883,7 +1881,6 @@ void acpi_bios_init(void)
     }
 
     /* HPET */
-#ifdef HPET_WORKS_IN_KVM
     memset(hpet, 0, sizeof(*hpet));
     /* Note timer_block_id value must be kept in sync with value advertised by
      * emulated hpet
@@ -1892,7 +1889,6 @@ void acpi_bios_init(void)
     hpet->addr.address = cpu_to_le32(ACPI_HPET_ADDRESS);
     acpi_build_table_header((struct acpi_table_header *)hpet,
                             "HPET", sizeof(*hpet), 1);
-#endif
 
     acpi_additional_tables(); /* resets cfg to required entry */
     for(i = 0; i < external_tables; i++) {
@@ -1912,8 +1908,7 @@ void acpi_bios_init(void)
     /* kvm has no ssdt (processors are in dsdt) */
 //  rsdt->table_offset_entry[nb_rsdt_entries++] = cpu_to_le32(ssdt_addr);
 #ifdef BX_QEMU
-    /* No HPET (yet) */
-//  rsdt->table_offset_entry[nb_rsdt_entries++] = cpu_to_le32(hpet_addr);
+    rsdt->table_offset_entry[nb_rsdt_entries++] = cpu_to_le32(hpet_addr);
     if (nb_numa_nodes > 0)
         rsdt->table_offset_entry[nb_rsdt_entries++] = cpu_to_le32(srat_addr);
 #endif
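
The HPET table address in this patch is rounded up with `(addr + 7) & ~7` before the table is placed. As a small worked sketch of that rounding (the function name is hypothetical, and the 32-bit address type is an assumption made for the example):

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of 8; an address that is already
 * 8-byte aligned is returned unchanged. */
static uint32_t example_align8(uint32_t addr)
{
    return (addr + 7) & ~(uint32_t)7;
}
```

For instance, 0x1001 through 0x1008 all round to 0x1008, while 0x1000 stays where it is.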
[PATCH 5/5] HPET interaction with in-kernel PIT
Signed-off-by: Beth Kon e...@us.ibm.com
---
 arch/x86/include/asm/kvm.h |    1 +
 arch/x86/kvm/i8254.c       |   24 +++++++++++++++++++-----
 arch/x86/kvm/i8254.h       |    3 ++-
 arch/x86/kvm/x86.c         |    5 ++++-
 4 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
index 708b9c3..3c44923 100644
--- a/arch/x86/include/asm/kvm.h
+++ b/arch/x86/include/asm/kvm.h
@@ -235,6 +235,7 @@ struct kvm_guest_debug_arch {
 
 struct kvm_pit_state {
 	struct kvm_pit_channel_state channels[3];
+	u8 hpet_legacy_mode;
 };
 
 struct kvm_reinject_control {
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 331705f..bb8382b 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -340,10 +340,20 @@ static void pit_load_count(struct kvm *kvm, int channel, u32 val)
 	}
 }
 
-void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val)
+void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val, int hpet_legacy_start)
 {
+	u8 saved_mode;
 	mutex_lock(&kvm->arch.vpit->pit_state.lock);
-	pit_load_count(kvm, channel, val);
+	if (hpet_legacy_start) {
+		/* save existing mode for later reenablement */
+		saved_mode = kvm->arch.vpit->pit_state.channels[0].mode;
+		kvm->arch.vpit->pit_state.channels[0].mode = 0xff; /* disable timer */
+		pit_load_count(kvm, channel, val);
+		kvm->arch.vpit->pit_state.channels[0].mode = saved_mode;
+	} else {
+		if (!(channel == 0 && kvm->arch.vpit->pit_state.hpet_legacy_mode))
+			pit_load_count(kvm, channel, val);
+	}
 	mutex_unlock(&kvm->arch.vpit->pit_state.lock);
 }
 
@@ -411,17 +421,20 @@ static void pit_ioport_write(struct kvm_io_device *this,
 		switch (s->write_state) {
 		default:
 		case RW_STATE_LSB:
-			pit_load_count(kvm, addr, val);
+			if (!(addr == 0 && pit_state->hpet_legacy_mode))
+				pit_load_count(kvm, addr, val);
 			break;
 		case RW_STATE_MSB:
-			pit_load_count(kvm, addr, val << 8);
+			if (!(addr == 0 && pit_state->hpet_legacy_mode))
+				pit_load_count(kvm, addr, val << 8);
 			break;
 		case RW_STATE_WORD0:
 			s->write_latch = val;
 			s->write_state = RW_STATE_WORD1;
 			break;
 		case RW_STATE_WORD1:
-			pit_load_count(kvm, addr, s->write_latch | (val << 8));
+			if (!(addr == 0 && pit_state->hpet_legacy_mode))
+				pit_load_count(kvm, addr, s->write_latch | (val << 8));
 			s->write_state = RW_STATE_WORD0;
 			break;
 		}
@@ -548,6 +561,7 @@ void kvm_pit_reset(struct kvm_pit *pit)
 	struct kvm_kpit_channel_state *c;
 
 	mutex_lock(&pit->pit_state.lock);
+	pit->pit_state.hpet_legacy_mode = 0;
 	for (i = 0; i < 3; i++) {
 		c = &pit->pit_state.channels[i];
 		c->mode = 0xff;
diff --git a/arch/x86/kvm/i8254.h b/arch/x86/kvm/i8254.h
index b267018..b5967ca 100644
--- a/arch/x86/kvm/i8254.h
+++ b/arch/x86/kvm/i8254.h
@@ -21,6 +21,7 @@ struct kvm_kpit_channel_state {
 
 struct kvm_kpit_state {
 	struct kvm_kpit_channel_state channels[3];
+	u8 hpet_legacy_mode;
 	struct kvm_timer pit_timer;
 	bool is_periodic;
 	u32 speaker_data_on;
@@ -49,7 +50,7 @@ struct kvm_pit {
 #define KVM_PIT_CHANNEL_MASK	0x3
 
 void kvm_inject_pit_timer_irqs(struct kvm_vcpu *vcpu);
-void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val);
+void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val, int hpet_legacy_start);
 struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags);
 void kvm_free_pit(struct kvm *kvm);
 void kvm_pit_reset(struct kvm_pit *pit);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1b91ea7..3c70545 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1948,9 +1948,12 @@ static int kvm_vm_ioctl_get_pit(struct kvm *kvm, struct kvm_pit_state *ps)
 static int kvm_vm_ioctl_set_pit(struct kvm *kvm, struct kvm_pit_state *ps)
 {
 	int r = 0;
+	int hpet_legacy_start = 0;
 
+	if (ps->hpet_legacy_mode && !kvm->arch.vpit->pit_state.hpet_legacy_mode)
+		hpet_legacy_start = 1;
 	memcpy(&kvm->arch.vpit->pit_state, ps, sizeof(struct kvm_pit_state));
-	kvm_pit_load_count(kvm, 0, ps->channels[0].count);
+	kvm_pit_load_count(kvm, 0, ps->channels[0].count, hpet_legacy_start);
 	return r;
 }
 
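
The gating this patch adds can be hard to see through the diff, so here is a deliberately simplified, user-space toy model of it. It only models the idea that channel 0 loads are suppressed while HPET legacy mode is on, and that the channel 0 timer is stopped on entry and restarted on exit. It does not model the mode save/restore or the locking in the real code, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only; not the kernel data structures. */
struct example_pit {
    uint8_t  hpet_legacy_mode;  /* nonzero while HPET owns the timer irq */
    uint32_t count[3];          /* last count applied to each channel */
    int      timer_running;     /* whether the channel 0 timer is armed */
};

/* Guest programs a PIT channel. Returns 1 if the load was applied. */
static int example_pit_guest_write(struct example_pit *pit, int channel,
                                   uint32_t val)
{
    if (channel == 0 && pit->hpet_legacy_mode)
        return 0;   /* channel 0 loads are skipped in legacy mode */
    pit->count[channel] = val;
    if (channel == 0)
        pit->timer_running = 1;
    return 1;
}

/* HPET enters (on != 0) or leaves (on == 0) legacy mode. */
static void example_set_legacy(struct example_pit *pit, int on)
{
    pit->hpet_legacy_mode = on ? 1 : 0;
    /* entering stops the channel 0 timer, leaving restarts it */
    pit->timer_running = on ? 0 : 1;
}
```

In this model a guest write to channel 0 during legacy mode is simply dropped, while channels 1 and 2 keep working, which is roughly the multiplexer behaviour the cover letter describes.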
[PATCH 2/5] Userspace changes for configuring irq0-inti2 override (v4)
Signed-off-by: Beth Kon e...@us.ibm.com --- hw/ioapic.c|6 +++--- hw/pc.c|2 ++ qemu-kvm-x86.c |6 +- qemu-kvm.h |2 ++ sysemu.h |1 + vl.c | 11 +-- 6 files changed, 22 insertions(+), 6 deletions(-) diff --git a/hw/ioapic.c b/hw/ioapic.c index 6c178c7..a67b766 100644 --- a/hw/ioapic.c +++ b/hw/ioapic.c @@ -23,6 +23,7 @@ #include hw.h #include pc.h +#include sysemu.h #include qemu-timer.h #include host-utils.h @@ -95,14 +96,13 @@ void ioapic_set_irq(void *opaque, int vector, int level) { IOAPICState *s = opaque; -#if 0 /* ISA IRQs map to GSI 1-1 except for IRQ0 which maps * to GSI 2. GSI maps to ioapic 1-1. This is not * the cleanest way of doing it but it should work. */ -if (vector == 0) +if (vector == 0 irq0override) { vector = 2; -#endif +} if (vector = 0 vector IOAPIC_NUM_PINS) { uint32_t mask = 1 vector; diff --git a/hw/pc.c b/hw/pc.c index 66f4635..1c068fb 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -55,6 +55,7 @@ #define BIOS_CFG_IOPORT 0x510 #define FW_CFG_ACPI_TABLES (FW_CFG_ARCH_LOCAL + 0) #define FW_CFG_SMBIOS_ENTRIES (FW_CFG_ARCH_LOCAL + 1) +#define FW_CFG_IRQ0_OVERRIDE (FW_CFG_ARCH_LOCAL + 2) #define MAX_IDE_BUS 2 @@ -476,6 +477,7 @@ static void bochs_bios_init(void) fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size); fw_cfg_add_bytes(fw_cfg, FW_CFG_ACPI_TABLES, (uint8_t *)acpi_tables, acpi_tables_len); +fw_cfg_add_bytes(fw_cfg, FW_CFG_IRQ0_OVERRIDE, irq0override, 1); smbios_table = smbios_get_table(smbios_len); if (smbios_table) diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c index 5526d8f..89337e9 100644 --- a/qemu-kvm-x86.c +++ b/qemu-kvm-x86.c @@ -909,7 +909,11 @@ int kvm_arch_init_irq_routing(void) return r; } for (i = 0; i 24; ++i) { -r = kvm_add_irq_route(kvm_context, i, KVM_IRQCHIP_IOAPIC, i); +if (i == 0) { +r = kvm_add_irq_route(kvm_context, i, KVM_IRQCHIP_IOAPIC, 2); +} else if (i != 2) { +r = kvm_add_irq_route(kvm_context, i, KVM_IRQCHIP_IOAPIC, i); +} if (r 0) return r; } diff --git a/qemu-kvm.h b/qemu-kvm.h index fa40542..6bbafbc 100644 --- 
a/qemu-kvm.h +++ b/qemu-kvm.h @@ -169,6 +169,7 @@ int handle_tpr_access(void *opaque, kvm_vcpu_context_t vcpu, #define kvm_enabled() (kvm_allowed) #define qemu_kvm_irqchip_in_kernel() kvm_irqchip_in_kernel(kvm_context) #define qemu_kvm_pit_in_kernel() kvm_pit_in_kernel(kvm_context) +#define qemu_kvm_has_gsi_routing() kvm_has_gsi_routing(kvm_context) #define kvm_has_sync_mmu() qemu_kvm_has_sync_mmu() void kvm_init_vcpu(CPUState *env); void kvm_load_tsc(CPUState *env); @@ -177,6 +178,7 @@ void kvm_load_tsc(CPUState *env); #define kvm_nested 0 #define qemu_kvm_irqchip_in_kernel() (0) #define qemu_kvm_pit_in_kernel() (0) +#define qemu_kvm_has_gsi_routing() (0) #define kvm_has_sync_mmu() (0) #define kvm_load_registers(env) do {} while(0) #define kvm_save_registers(env) do {} while(0) diff --git a/sysemu.h b/sysemu.h index 47d001e..f78e974 100644 --- a/sysemu.h +++ b/sysemu.h @@ -108,6 +108,7 @@ extern int xenfb_enabled; extern int graphic_width; extern int graphic_height; extern int graphic_depth; +extern uint8_t irq0override; extern DisplayType display_type; extern const char *keyboard_layout; extern int win2k_install_hack; diff --git a/vl.c b/vl.c index 2fda17b..9b1d1ab 100644 --- a/vl.c +++ b/vl.c @@ -253,6 +253,7 @@ int no_reboot = 0; int no_shutdown = 0; int cursor_hide = 1; int graphic_rotate = 0; +uint8_t irq0override = 1; #ifndef _WIN32 int daemonize = 0; #endif @@ -6054,8 +6055,14 @@ int main(int argc, char **argv, char **envp) module_call_init(MODULE_INIT_DEVICE); -if (kvm_enabled()) - kvm_init_ap(); +if (kvm_enabled()) { + kvm_init_ap(); +#ifdef USE_KVM +if (kvm_irqchip !qemu_kvm_has_gsi_routing()) { +irq0override = 0; +} +#endif +} machine-init(ram_size, boot_devices, kernel_filename, kernel_cmdline, initrd_filename, cpu_model); -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
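
As a rough sketch of the mapping this patch sets up, the function below combines the two pieces (the `ioapic_set_irq` remap and the routing loop in `kvm_arch_init_irq_routing`) into one lookup. The function name, the pin-count constant, and the choice to return -1 for "no route" are assumptions made for the example, not code from qemu-kvm.

```c
#include <assert.h>

#define EXAMPLE_NUM_PINS 24  /* assumed number of IOAPIC input pins */

/* Map a source IRQ to an IOAPIC input pin.
 * Returns -1 when no route exists for that source. */
static int example_irq_to_pin(int irq, int irq0_override)
{
    if (irq < 0 || irq >= EXAMPLE_NUM_PINS)
        return -1;
    if (!irq0_override)
        return irq;   /* identity mapping, e.g. kernels without routing */
    if (irq == 0)
        return 2;     /* irq0 is delivered on inti2 */
    if (irq == 2)
        return -1;    /* source IRQ 2 gets no route of its own */
    return irq;
}
```

With the override enabled, `example_irq_to_pin(0, 1)` yields pin 2 and every other source except 2 keeps its identity mapping.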
[PATCH 4/5] Userspace changes for KVM HPET (v4)
The big change here is handling of enabling/disabling of hpet legacy mode. When hpet enters legacy mode, the spec says that the pit stops generating interrupts. In practice, we want to stop the pit periodic timer from running because it is wasteful in a virtual environment. We also have to worry about the hpet leaving legacy mode (which, at least in linux, happens only during a shutdown or crash). At this point, according to the hpet spec, PIT interrupts need to be reenabled. For us, it means the PIT timer needs to be restarted. This patch handles this situation better than the previous version by coming closer to just disabling PIT interrupts. It allows the PIT state to change if the OS modifies it, even while PIT is disabled, but does not allow a pit timer to start. Then if HPET legacy mode is disabled, whatever the PIT state is at that point, the PIT timer is restarted accordingly. Signed-off-by: Beth Kon e...@us.ibm.com --- hw/hpet.c | 15 +++ hw/i8254.c| 43 ++- hw/i8254.h|2 ++ hw/pc.h |4 ++-- kvm/include/x86/asm/kvm.h |1 + qemu-kvm.c| 20 qemu-kvm.h|3 ++- vl.c |7 ++- 8 files changed, 74 insertions(+), 21 deletions(-) diff --git a/hw/hpet.c b/hw/hpet.c index 29db325..043b92b 100644 --- a/hw/hpet.c +++ b/hw/hpet.c @@ -206,6 +206,9 @@ static int hpet_load(QEMUFile *f, void *opaque, int version_id) qemu_get_timer(f, s-timer[i].qemu_timer); } } +if (hpet_in_legacy_mode()) { +hpet_disable_pit(); +} return 0; } @@ -475,9 +478,11 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr, } /* i8254 and RTC are disabled when HPET is in legacy mode */ if (activating_bit(old_val, new_val, HPET_CFG_LEGACY)) { -hpet_pit_disable(); +hpet_disable_pit(); +dprintf(qemu: hpet disabled pit\n); } else if (deactivating_bit(old_val, new_val, HPET_CFG_LEGACY)) { -hpet_pit_enable(); +hpet_enable_pit(); +dprintf(qemu: hpet enabled pit\n); } break; case HPET_CFG + 4: @@ -554,13 +559,15 @@ static void hpet_reset(void *opaque) { /* 64-bit main counter; 3 timers supported; 
LegacyReplacementRoute. */ s-capability = 0x8086a201ULL; s-capability |= ((HPET_CLK_PERIOD) 32); -if (count 0) +if (count 0) { /* we don't enable pit when hpet_reset is first called (by hpet_init) * because hpet is taking over for pit here. On subsequent invocations, * hpet_reset is called due to system reset. At this point control must * be returned to pit until SW reenables hpet. */ -hpet_pit_enable(); +hpet_enable_pit(); +dprintf(qemu: hpet enabled pit\n); +} count = 1; } diff --git a/hw/i8254.c b/hw/i8254.c index 2f229f9..8c8076f 100644 --- a/hw/i8254.c +++ b/hw/i8254.c @@ -25,6 +25,7 @@ #include pc.h #include isa.h #include qemu-timer.h +#include qemu-kvm.h #include i8254.h //#define DEBUG_PIT @@ -198,6 +199,9 @@ int pit_get_mode(PITState *pit, int channel) static inline void pit_load_count(PITChannelState *s, int val) { +if (s-channel == 0 pit_state.hpet_legacy_mode) { +return; +} if (val == 0) val = 0x1; s-count_load_time = qemu_get_clock(vm_clock); @@ -371,10 +375,11 @@ static void pit_irq_timer_update(PITChannelState *s, int64_t current_time) (double)(expire_time - current_time) / ticks_per_sec); #endif s-next_transition_time = expire_time; -if (expire_time != -1) +if (expire_time != -1) { qemu_mod_timer(s-irq_timer, expire_time); -else +} else { qemu_del_timer(s-irq_timer); +} } static void pit_irq_timer(void *opaque) @@ -451,6 +456,7 @@ void pit_reset(void *opaque) PITChannelState *s; int i; +pit-hpet_legacy_mode = 0; for(i = 0;i 3; i++) { s = pit-channels[i]; s-mode = 3; @@ -460,32 +466,43 @@ void pit_reset(void *opaque) } /* When HPET is operating in legacy mode, i8254 timer0 is disabled */ -void hpet_pit_disable(void) { -PITChannelState *s; -s = pit_state.channels[0]; -if (s-irq_timer) -qemu_del_timer(s-irq_timer); + +void hpet_disable_pit(void) +{ +PITChannelState *s = pit_state.channels[0]; +if (qemu_kvm_pit_in_kernel()) { +kvm_hpet_disable_kpit(); +} else { +if (s-irq_timer) { +qemu_del_timer(s-irq_timer); +} +} } /* When HPET is reset or leaving 
legacy mode, it must reenable i8254 * timer 0 */ -void hpet_pit_enable(void) +void hpet_enable_pit(void) {
[patch 1/5] KVM: VMX: more MSR_IA32_VMX_EPT_VPID_CAP capability bits
Required for EPT misconfiguration handler.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: kvm/arch/x86/include/asm/vmx.h
===
--- kvm.orig/arch/x86/include/asm/vmx.h
+++ kvm/arch/x86/include/asm/vmx.h
@@ -352,9 +352,16 @@ enum vmcs_field {
 #define VMX_EPT_EXTENT_INDIVIDUAL_ADDR		0
 #define VMX_EPT_EXTENT_CONTEXT			1
 #define VMX_EPT_EXTENT_GLOBAL			2
+
+#define VMX_EPT_EXECUTE_ONLY_BIT		(1ull)
+#define VMX_EPT_PAGE_WALK_4_BIT			(1ull << 6)
+#define VMX_EPTP_UC_BIT				(1ull << 8)
+#define VMX_EPTP_WB_BIT				(1ull << 14)
+#define VMX_EPT_2MB_PAGE_BIT			(1ull << 16)
 #define VMX_EPT_EXTENT_INDIVIDUAL_BIT		(1ull << 24)
 #define VMX_EPT_EXTENT_CONTEXT_BIT		(1ull << 25)
 #define VMX_EPT_EXTENT_GLOBAL_BIT		(1ull << 26)
+
 #define VMX_EPT_DEFAULT_GAW			3
 #define VMX_EPT_MAX_GAW				0x4
 #define VMX_EPT_MT_EPTE_SHIFT			3
Index: kvm/arch/x86/kvm/vmx.c
===
--- kvm.orig/arch/x86/kvm/vmx.c
+++ kvm/arch/x86/kvm/vmx.c
@@ -270,6 +270,26 @@ static inline bool cpu_has_vmx_flexprior
 		cpu_has_vmx_virtualize_apic_accesses();
 }
 
+static inline bool cpu_has_vmx_ept_execute_only(void)
+{
+	return !!(vmx_capability.ept & VMX_EPT_EXECUTE_ONLY_BIT);
+}
+
+static inline bool cpu_has_vmx_eptp_uncacheable(void)
+{
+	return !!(vmx_capability.ept & VMX_EPTP_UC_BIT);
+}
+
+static inline bool cpu_has_vmx_eptp_writeback(void)
+{
+	return !!(vmx_capability.ept & VMX_EPTP_WB_BIT);
+}
+
+static inline bool cpu_has_vmx_ept_2m_page(void)
+{
+	return !!(vmx_capability.ept & VMX_EPT_2MB_PAGE_BIT);
+}
+
 static inline int cpu_has_vmx_invept_individual_addr(void)
 {
 	return !!(vmx_capability.ept & VMX_EPT_EXTENT_INDIVIDUAL_BIT);
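
The helpers in this patch all follow the same pattern: test one bit of the EPT/VPID capability word. A stand-alone sketch of that pattern, using the bit positions from the patch but hypothetical `EXAMPLE_` names and a plain argument in place of `vmx_capability.ept`, might look like this:

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as in the patch; the EXAMPLE_ names are illustrative. */
#define EXAMPLE_EPT_EXECUTE_ONLY_BIT  (1ull)
#define EXAMPLE_EPT_PAGE_WALK_4_BIT   (1ull << 6)
#define EXAMPLE_EPTP_UC_BIT           (1ull << 8)
#define EXAMPLE_EPTP_WB_BIT           (1ull << 14)
#define EXAMPLE_EPT_2MB_PAGE_BIT      (1ull << 16)

/* Return 1 if the given capability bit is set in ept_cap, else 0. */
static int example_has_ept_cap(uint64_t ept_cap, uint64_t bit)
{
    return !!(ept_cap & bit);
}
```

A caller would pass the low 32 bits of the capability MSR as `ept_cap`; how that value is read is outside the scope of this sketch.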
[patch 4/5] KVM: VMX: EPT misconfiguration handler
Handler for EPT misconfiguration which checks for valid state in the
shadow pagetables, printing the spte on each level.

The separate WARN_ONs are useful for kerneloops.org.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: kvm/arch/x86/kvm/vmx.c
===
--- kvm.orig/arch/x86/kvm/vmx.c
+++ kvm/arch/x86/kvm/vmx.c
@@ -3233,6 +3233,89 @@ static int handle_ept_violation(struct k
 	return kvm_mmu_page_fault(vcpu, gpa & PAGE_MASK, 0);
 }
 
+static u64 ept_rsvd_mask(u64 spte, int level)
+{
+	int i;
+	u64 mask = 0;
+
+	for (i = 51; i > boot_cpu_data.x86_phys_bits; i--)
+		mask |= (1ULL << i);
+
+	if (level > 2)
+		/* bits 7:3 reserved */
+		mask |= 0xf8;
+	else if (level == 2) {
+		if (spte & (1ULL << 7))
+			/* 2MB ref, bits 20:12 reserved */
+			mask |= 0x1ff000;
+		else
+			/* bits 6:3 reserved */
+			mask |= 0x78;
+	}
+
+	return mask;
+}
+
+static void ept_misconfig_inspect_spte(struct kvm_vcpu *vcpu, u64 spte,
+				       int level)
+{
+	printk(KERN_ERR "%s: spte 0x%llx level %d\n", __func__, spte, level);
+
+	/* 010b (write-only) */
+	WARN_ON((spte & 0x7) == 0x2);
+
+	/* 110b (write/execute) */
+	WARN_ON((spte & 0x7) == 0x6);
+
+	/* 100b (execute-only) and value not supported by logical processor */
+	if (!cpu_has_vmx_ept_execute_only())
+		WARN_ON((spte & 0x7) == 0x4);
+
+	/* not 000b */
+	if ((spte & 0x7)) {
+		u64 rsvd_bits = spte & ept_rsvd_mask(spte, level);
+
+		if (rsvd_bits != 0) {
+			printk(KERN_ERR "%s: rsvd_bits = 0x%llx\n",
+					 __func__, rsvd_bits);
+			WARN_ON(1);
+		}
+
+		if (level == 1 || (level == 2 && (spte & (1ULL << 7)))) {
+			u64 ept_mem_type = (spte & 0x38) >> 3;
+
+			if (ept_mem_type == 2 || ept_mem_type == 3 ||
+			    ept_mem_type == 7) {
+				printk(KERN_ERR "%s: ept_mem_type=0x%llx\n",
+						__func__, ept_mem_type);
+				WARN_ON(1);
+			}
+		}
+	}
+}
+
+static int handle_ept_misconfig(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+{
+	u64 sptes[4];
+	int nr_sptes, i;
+	gpa_t gpa;
+
+	gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
+
+	printk(KERN_ERR "EPT: Misconfiguration.\n");
+	printk(KERN_ERR "EPT: GPA: 0x%llx\n", gpa);
+
+	nr_sptes = kvm_mmu_get_spte_hierarchy(vcpu, gpa, sptes);
+
+	for (i = PT64_ROOT_LEVEL; i > PT64_ROOT_LEVEL - nr_sptes; --i)
+		ept_misconfig_inspect_spte(vcpu, sptes[i-1], i);
+
+	kvm_run->exit_reason = KVM_EXIT_UNKNOWN;
+	kvm_run->hw.hardware_exit_reason = EXIT_REASON_EPT_MISCONFIG;
+
+	return 0;
+}
+
 static int handle_nmi_window(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	u32 cpu_based_vm_exec_control;
@@ -3303,8 +3386,9 @@ static int (*kvm_vmx_exit_handlers[])(st
 	[EXIT_REASON_APIC_ACCESS]             = handle_apic_access,
 	[EXIT_REASON_WBINVD]                  = handle_wbinvd,
 	[EXIT_REASON_TASK_SWITCH]             = handle_task_switch,
-	[EXIT_REASON_EPT_VIOLATION]	      = handle_ept_violation,
 	[EXIT_REASON_MCE_DURING_VMENTRY]      = handle_machine_check,
+	[EXIT_REASON_EPT_VIOLATION]	      = handle_ept_violation,
+	[EXIT_REASON_EPT_MISCONFIG]	      = handle_ept_misconfig,
 };
 
 static const int kvm_vmx_max_exit_handlers =
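
To make the checks in the handler easier to try out, here is a user-space sketch of the reserved-bit mask and the permission-bit test. It mirrors the logic of the patch as posted (including its loop bound), but takes the physical address width as a parameter instead of reading `boot_cpu_data`; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Reserved-bit mask for an EPT entry at the given level, following the
 * patch; phys_bits stands in for boot_cpu_data.x86_phys_bits. */
static uint64_t example_ept_rsvd_mask(uint64_t spte, int level, int phys_bits)
{
    uint64_t mask = 0;
    int i;

    /* same loop bound as the patch: bits 51 down to phys_bits + 1 */
    for (i = 51; i > phys_bits; i--)
        mask |= 1ULL << i;

    if (level > 2) {
        mask |= 0xf8;               /* bits 7:3 reserved */
    } else if (level == 2) {
        if (spte & (1ULL << 7))
            mask |= 0x1ff000;       /* 2MB page: bits 20:12 reserved */
        else
            mask |= 0x78;           /* bits 6:3 reserved */
    }
    return mask;
}

/* Return 1 if the low three permission bits are a combination the
 * handler warns about: 010b, 110b, or 100b without execute-only support. */
static int example_ept_perm_invalid(uint64_t spte, int exec_only_supported)
{
    uint64_t perm = spte & 0x7;

    if (perm == 0x2 || perm == 0x6)
        return 1;
    if (perm == 0x4 && !exec_only_supported)
        return 1;
    return 0;
}
```

An entry would then be flagged when `spte & example_ept_rsvd_mask(spte, level, phys_bits)` is nonzero or when `example_ept_perm_invalid()` returns 1; the memory-type check from the patch is left out of this sketch.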
[patch 3/5] KVM: MMU: add kvm_mmu_get_spte_hierarchy helper
Required by EPT misconfiguration handler.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: kvm/arch/x86/kvm/mmu.c
===
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -3013,6 +3013,24 @@ out:
 	return r;
 }
 
+int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4])
+{
+	struct kvm_shadow_walk_iterator iterator;
+	int nr_sptes = 0;
+
+	spin_lock(&vcpu->kvm->mmu_lock);
+	for_each_shadow_entry(vcpu, addr, iterator) {
+		sptes[iterator.level-1] = *iterator.sptep;
+		nr_sptes++;
+		if (!is_shadow_present_pte(*iterator.sptep))
+			break;
+	}
+	spin_unlock(&vcpu->kvm->mmu_lock);
+
+	return nr_sptes;
+}
+EXPORT_SYMBOL_GPL(kvm_mmu_get_spte_hierarchy);
+
 #ifdef AUDIT
 
 static const char *audit_msg;
Index: kvm/arch/x86/kvm/mmu.h
===
--- kvm.orig/arch/x86/kvm/mmu.h
+++ kvm/arch/x86/kvm/mmu.h
@@ -37,6 +37,8 @@
 #define PT32_ROOT_LEVEL 2
 #define PT32E_ROOT_LEVEL 3
 
+int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
+
 static inline void kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu)
 {
 	if (unlikely(vcpu->kvm->arch.n_free_mmu_pages < KVM_MIN_FREE_MMU_PAGES))
[patch 2/5] KVM: MMU: make for_each_shadow_entry aware of largepages
This way there is no need to add explicit checks in every
for_each_shadow_entry user.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: kvm/arch/x86/kvm/mmu.c
===
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -1273,6 +1273,11 @@ static bool shadow_walk_okay(struct kvm_
 {
 	if (iterator->level < PT_PAGE_TABLE_LEVEL)
 		return false;
+
+	if (iterator->level == PT_PAGE_TABLE_LEVEL)
+		if (is_large_pte(*iterator->sptep))
+			return false;
+
 	iterator->index = SHADOW_PT_INDEX(iterator->addr, iterator->level);
 	iterator->sptep	= ((u64 *)__va(iterator->shadow_addr)) + iterator->index;
 	return true;
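
The effect of this change on a walker such as kvm_mmu_get_spte_hierarchy can be illustrated with a toy model. The sketch below walks a fixed array instead of real shadow page tables, treats bit 0 as "present" and bit 7 as "large page" purely for illustration, and uses hypothetical names throughout; it is not the kernel iterator.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_ROOT_LEVEL 4
#define EXAMPLE_PRESENT    (1ULL << 0)  /* illustrative "present" bit */
#define EXAMPLE_LARGE      (1ULL << 7)  /* illustrative "large page" bit */

/* entries[level-1] is the entry the walk would read at that level.
 * Stores each visited entry in sptes[level-1] and returns the number of
 * levels visited. The walk stops after a non-present entry and does not
 * descend to level 1 below a large-page entry at level 2. */
static int example_collect_sptes(const uint64_t entries[4], uint64_t sptes[4])
{
    int level, n = 0;

    for (level = EXAMPLE_ROOT_LEVEL; level >= 1; level--) {
        /* the entry one level up maps a large page: nothing below it */
        if (level == 1 && (entries[1] & EXAMPLE_LARGE))
            break;
        sptes[level - 1] = entries[level - 1];
        n++;
        if (!(entries[level - 1] & EXAMPLE_PRESENT))
            break;
    }
    return n;
}
```

A caller can then visit exactly the collected levels with a loop of the form `for (i = EXAMPLE_ROOT_LEVEL; i > EXAMPLE_ROOT_LEVEL - n; --i)`, reading `sptes[i - 1]`, which is the indexing the misconfiguration handler uses.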
[patch 5/5] KVM: VMX: conditionally disable 2M pages
Disable usage of 2M pages if VMX_EPT_2MB_PAGE_BIT (bit 16) is clear
in MSR_IA32_VMX_EPT_VPID_CAP and EPT is enabled.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: kvm/arch/x86/kvm/vmx.c
===
--- kvm.orig/arch/x86/kvm/vmx.c
+++ kvm/arch/x86/kvm/vmx.c
@@ -1393,6 +1393,9 @@ static __init int hardware_setup(void)
 	if (!cpu_has_vmx_tpr_shadow())
 		kvm_x86_ops->update_cr8_intercept = NULL;
 
+	if (enable_ept && !cpu_has_vmx_ept_2m_page())
+		kvm_disable_largepages();
+
 	return alloc_kvm_area();
 }
 
Index: kvm/include/linux/kvm_host.h
===
--- kvm.orig/include/linux/kvm_host.h
+++ kvm/include/linux/kvm_host.h
@@ -219,6 +219,7 @@ int kvm_arch_set_memory_region(struct kv
 				struct kvm_userspace_memory_region *mem,
 				struct kvm_memory_slot old,
 				int user_alloc);
+void kvm_disable_largepages(void);
 void kvm_arch_flush_shadow(struct kvm *kvm);
 gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn);
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
Index: kvm/virt/kvm/kvm_main.c
===
--- kvm.orig/virt/kvm/kvm_main.c
+++ kvm/virt/kvm/kvm_main.c
@@ -85,6 +85,8 @@ static long kvm_vcpu_ioctl(struct file *
 
 static bool kvm_rebooting;
 
+static bool largepages_disabled = false;
+
 #ifdef KVM_CAP_DEVICE_ASSIGNMENT
 static struct kvm_assigned_dev_kernel *kvm_find_assigned_dev(struct list_head *head,
 						      int assigned_dev_id)
@@ -1171,9 +1173,11 @@ int __kvm_set_memory_region(struct kvm *
 		ugfn = new.userspace_addr >> PAGE_SHIFT;
 		/*
 		 * If the gfn and userspace address are not aligned wrt each
-		 * other, disable large page support for this slot
+		 * other, or if explicitly asked to, disable large page
+		 * support for this slot
 		 */
-		if ((base_gfn ^ ugfn) & (KVM_PAGES_PER_HPAGE - 1))
+		if ((base_gfn ^ ugfn) & (KVM_PAGES_PER_HPAGE - 1) ||
+		    largepages_disabled)
 			for (i = 0; i < largepages; ++i)
 				new.lpage_info[i].write_count = 1;
 	}
@@ -1286,6 +1290,12 @@ out:
 	return r;
 }
 
+void kvm_disable_largepages(void)
+{
+	largepages_disabled = true;
+}
+EXPORT_SYMBOL_GPL(kvm_disable_largepages);
+
 int is_error_page(struct page *page)
 {
 	return page == bad_page;
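
The per-slot condition changed in `__kvm_set_memory_region` can be tried out in isolation. The sketch below assumes 512 small pages per large page (2M over 4K, as on x86) and uses hypothetical names; it only reproduces the decision, not the `lpage_info` bookkeeping.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_PAGES_PER_HPAGE 512  /* assumed: 2M large page / 4K page */

/* Return 1 if a slot may use large pages: the guest frame number and
 * the userspace frame number must share the same offset within a large
 * page, and large pages must not have been disabled globally. */
static int example_slot_allows_largepages(uint64_t base_gfn, uint64_t ugfn,
                                          int largepages_disabled)
{
    if (((base_gfn ^ ugfn) & (EXAMPLE_PAGES_PER_HPAGE - 1)) ||
        largepages_disabled)
        return 0;
    return 1;
}
```

Note that the XOR test accepts two addresses that are misaligned by the same amount: what matters is that they agree in their low nine bits, not that those bits are zero.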
Re: [PATCH] msi-x: let drivers retry when not enough vectors
On Thu, 7 May 2009 11:28:41 +0300
Michael S. Tsirkin m...@redhat.com wrote:

> pci_enable_msix currently returns -EINVAL if you ask for more vectors
> than supported by the device, which would typically cause fallback to
> regular interrupts.
>
> It's better to return the table size, making the driver retry MSI-X
> with less vectors.
>
> Signed-off-by: Michael S. Tsirkin m...@redhat.com
> ---
>
> Hi Jesse,
> This came up when I was adding MSI-X support to virtio pci driver,
> which does not know the exact table size upfront.
> Could you consider this patch for 2.6.31 please?

Applied this one to my linux-next branch; hopefully Rusty won't mind too
much. :)

-- 
Jesse Barnes, Intel Open Source Technology Center
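
A driver-side retry loop built on the return convention described in the quoted patch could look roughly like the sketch below. The enable function here is a hypothetical stand-in written for the example (it is not the PCI core API and takes the table size as an argument so the sketch is self-contained): it returns 0 on success, a negative value on error, and the table size when too many vectors were requested.

```c
#include <assert.h>

/* Hypothetical stand-in for the enable call: 0 on success, negative on
 * error, or the table size when nvec exceeds it. */
static int example_enable_vectors(int table_size, int nvec)
{
    if (table_size <= 0 || nvec <= 0)
        return -22;             /* -EINVAL */
    if (nvec > table_size)
        return table_size;      /* tell the caller what would fit */
    return 0;
}

/* Ask for 'wanted' vectors, retrying with the reported table size.
 * Returns the number of vectors obtained, or a negative error. */
static int example_request_vectors(int table_size, int wanted)
{
    int nvec = wanted;
    int ret;

    while ((ret = example_enable_vectors(table_size, nvec)) > 0)
        nvec = ret;             /* retry with fewer vectors */
    return ret < 0 ? ret : nvec;
}
```

A driver that does not know the table size up front can then ask for its preferred count and settle for whatever the second attempt grants, instead of falling back to regular interrupts.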
Re: [PATCH 1/5] BIOS changes for configuring irq0-inti2 override (v4)
Beth Kon wrote:
Sebastian Herbszt wrote:
Beth Kon wrote:

These patches resolve the irq0-inti2 override issue, and get the hpet
working on kvm.

Override and HPET changes are sent as a series because HPET depends on
the override. Win2k8 expects the HPET interrupt on inti2, regardless of
whether an override exists in the BIOS. And the HPET spec states that in
legacy mode, timer interrupt is on inti2.

The irq0-inti2 override will always be used unless the kernel cannot do
irq routing (i.e., compatibility with old kernels). So if the kernel is
capable, userspace sets up irq0-inti2 via the irq routing interface, and
adds the irq0-inti2 override to the MADT interrupt source override table,
and the mp table (for the no-acpi case).

Changes from v3:

- changes based on comments from Avi and Gleb.
- corrected legacy enable/disable for in-kernel PIT. The code now best
  approximates a multiplexer that disables PIT interrupts when HPET is in
  legacy mode (as described by HPET spec). Any changes to the PIT that may
  occur while HPET is operating in legacy mode are saved, so if HPET leaves
  legacy mode, the PIT is just reenabled, with mode set to whatever the
  last setting from guest was. Legacy mode is disabled at least during
  crash and shutdown (in Linux), so this needs to be handled properly.

---
 kvm/bios/rombios32.c | 60 -
 1 files changed, 44 insertions(+), 16 deletions(-)

What about the mptable entry count? Think it would need something like

#ifdef BX_QEMU
    if (irq0_override)
        putle16(&q, smp_cpus + 17); /* entry count */
    else
        putle16(&q, smp_cpus + 18); /* entry count */
#else
    putle16(&q, smp_cpus + 18); /* entry count */
#endif

Your patch "Fix non-ACPI Timer Interrupt Routing - v3" [1] included such
a change.

[1] http://lists.gnu.org/archive/html/qemu-devel/2009-04/msg01396.html

Yes, I lost that somehow! Thanks (again!).

Actually, it isn't that simple. That patch that you referred to was a
qemu patch. But I still don't see it in qemu-patched bochs bios.
Apparently, I did neglect to add it to the kvm bios patches that I had
waiting. Anthony, do you know what happened to this patch?

- Sebastian
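
The arithmetic behind the 17 versus 18 in the suggestion above can be spelled out. The breakdown used here (one bus entry, one I/O APIC entry, and one entry per ISA IRQ on top of the processor entries) is an assumption about how the table is laid out, and the function name is hypothetical.

```c
#include <assert.h>

/* Number of MP configuration table entries for 'cpus' processors.
 * Assumed breakdown: one entry per processor, one ISA bus entry, one
 * I/O APIC entry, and one I/O interrupt entry per ISA IRQ. */
static int example_mptable_entry_count(int cpus, int irq0_override)
{
    int buses = 1;
    int ioapics = 1;
    int irqs = 16;

    if (irq0_override)
        irqs -= 1;   /* source IRQ 2 gets no entry when irq0 uses inti2 */
    return cpus + buses + ioapics + irqs;
}
```

With four processors this gives 22 entries without the override and 21 with it, matching the `+ 18` and `+ 17` constants in the snippet.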
[PATCH 1/5] Userspace changes for configuring irq0-inti2 override (v6)
These patches resolve the irq0-inti2 override issue, and get the hpet
working on kvm.

Override and HPET changes are sent as a series because HPET depends on
the override. Win2k8 expects the HPET interrupt on inti2, regardless of
whether an override exists in the BIOS. And the HPET spec states that in
legacy mode, timer interrupt is on inti2.

The irq0-inti2 override will always be used unless the kernel cannot do
irq routing (i.e., compatibility with old kernels). So if the kernel is
capable, userspace sets up irq0-inti2 via the irq routing interface, and
adds the irq0-inti2 override to the MADT interrupt source override table,
and the mp table (for the no-acpi case).

Changes from v3:

- changes based on comments from Avi and Gleb.
- corrected legacy enable/disable for in-kernel PIT. The code now best
  approximates a multiplexer that disables PIT interrupts when HPET is in
  legacy mode (as described by HPET spec). Any changes to the PIT that may
  occur while HPET is operating in legacy mode are saved, so if HPET leaves
  legacy mode, the PIT is just reenabled, with mode set to whatever the
  last setting from guest was. Legacy mode is disabled at least during
  crash and shutdown (in Linux), so this needs to be handled properly.

Changes from v4:

- Modify mp_table entry count depending on whether irq_override is
  enabled.

Signed-off-by: Beth Kon e...@us.ibm.com
---
 kvm/bios/rombios32.c |   67 ++
 1 files changed, 51 insertions(+), 16 deletions(-)

diff --git a/kvm/bios/rombios32.c b/kvm/bios/rombios32.c
index 7db91d8..d6886ee 100755
--- a/kvm/bios/rombios32.c
+++ b/kvm/bios/rombios32.c
@@ -446,6 +446,9 @@ uint32_t cpuid_features;
 uint32_t cpuid_ext_features;
 unsigned long ram_size;
 uint64_t ram_end;
+#ifdef BX_QEMU
+uint8_t irq0_override;
+#endif
 #ifdef BX_USE_EBDA_TABLES
 unsigned long ebda_cur_addr;
 #endif
@@ -487,6 +490,7 @@ void wrmsr_smp(uint32_t index, uint64_t val)
 #define QEMU_CFG_ARCH_LOCAL 0x8000
 #define QEMU_CFG_ACPI_TABLES (QEMU_CFG_ARCH_LOCAL + 0)
 #define QEMU_CFG_SMBIOS_ENTRIES (QEMU_CFG_ARCH_LOCAL + 1)
+#define QEMU_CFG_IRQ0_OVERRIDE (QEMU_CFG_ARCH_LOCAL + 2)
 
 int qemu_cfg_port;
 
@@ -555,6 +559,17 @@ uint64_t qemu_cfg_get64 (void)
 }
 #endif
 
+#ifdef BX_QEMU
+void irq0_override_probe(void)
+{
+    if(qemu_cfg_port) {
+        qemu_cfg_select(QEMU_CFG_IRQ0_OVERRIDE);
+        qemu_cfg_read(&irq0_override, 1);
+        return;
+    }
+}
+#endif
+
 void cpu_probe(void)
 {
     uint32_t eax, ebx, ecx, edx;
@@ -1153,7 +1168,14 @@ static void mptable_init(void)
     putstr(&q, "0.1 "); /* vendor id */
     putle32(&q, 0); /* OEM table ptr */
     putle16(&q, 0); /* OEM table size */
+#ifdef BX_QEMU
+    if (irq0_override)
+        putle16(&q, MAX_CPUS + 17); /* entry count */
+    else
+        putle16(&q, MAX_CPUS + 18); /* entry count */
+#else
     putle16(&q, MAX_CPUS + 18); /* entry count */
+#endif
     putle32(&q, 0xfee00000); /* local APIC addr */
     putle16(&q, 0); /* ext table length */
     putb(&q, 0); /* ext table checksum */
@@ -1197,6 +1219,13 @@ static void mptable_init(void)
 
     /* irqs */
     for(i = 0; i < 16; i++) {
+#ifdef BX_QEMU
+        /* One entry per ioapic interrupt destination. Destination 2 is covered
+         * by irq0->inti2 override (i == 0). Source IRQ 2 is unused
+         */
+        if (irq0_override && i == 2)
+            continue;
+#endif
         putb(&q, 3); /* entry type = I/O interrupt */
         putb(&q, 0); /* interrupt type = vectored interrupt */
         putb(&q, 0); /* flags: po=0, el=0 */
@@ -1204,7 +1233,12 @@ static void mptable_init(void)
         putb(&q, 0); /* source bus ID = ISA */
         putb(&q, i); /* source bus IRQ */
         putb(&q, ioapic_id); /* dest I/O APIC ID */
-        putb(&q, i); /* dest I/O APIC interrupt in */
+#ifdef BX_QEMU
+        if (irq0_override && i == 0)
+            putb(&q, 2); /* dest I/O APIC interrupt in */
+        else
+#endif
+            putb(&q, i); /* dest I/O APIC interrupt in */
     }
     /* patch length */
     len = q - mp_config_table;
@@ -1760,23 +1794,21 @@ void acpi_bios_init(void)
         io_apic->io_apic_id = smp_cpus;
         io_apic->address = cpu_to_le32(0xfec00000);
         io_apic->interrupt = cpu_to_le32(0);
-#ifdef BX_QEMU
-#ifdef HPET_WORKS_IN_KVM
         io_apic++;
-
-        int_override = (void *)io_apic;
-        int_override->type = APIC_XRUPT_OVERRIDE;
-        int_override->length = sizeof(*int_override);
-        int_override->bus = cpu_to_le32(0);
-        int_override->source = cpu_to_le32(0);
-        int_override->gsi = cpu_to_le32(2);
-        int_override->flags = cpu_to_le32(0);
-#endif
+        int_override = (struct madt_int_override*)(io_apic);
+#ifdef BX_QEMU
+        if (irq0_override) {
+            memset(int_override, 0, sizeof(*int_override));
+            int_override->type = APIC_XRUPT_OVERRIDE;
+            int_override->length = sizeof(*int_override);
+            int_override->source =
[PATCH 2/5] Userspace changes for configuring irq0->inti2 override (v6)
Signed-off-by: Beth Kon e...@us.ibm.com
---
 hw/ioapic.c    |    6 +++---
 hw/pc.c        |    2 ++
 qemu-kvm-x86.c |    6 +-
 qemu-kvm.h     |    2 ++
 sysemu.h       |    1 +
 vl.c           |   11 +--
 6 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/hw/ioapic.c b/hw/ioapic.c
index 6c178c7..a67b766 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -23,6 +23,7 @@
 
 #include "hw.h"
 #include "pc.h"
+#include "sysemu.h"
 #include "qemu-timer.h"
 #include "host-utils.h"
 
@@ -95,14 +96,13 @@ void ioapic_set_irq(void *opaque, int vector, int level)
 {
     IOAPICState *s = opaque;
 
-#if 0
     /* ISA IRQs map to GSI 1-1 except for IRQ0 which maps
      * to GSI 2. GSI maps to ioapic 1-1. This is not
      * the cleanest way of doing it but it should work. */
 
-    if (vector == 0)
+    if (vector == 0 && irq0override) {
         vector = 2;
-#endif
+    }
 
     if (vector >= 0 && vector < IOAPIC_NUM_PINS) {
         uint32_t mask = 1 << vector;
diff --git a/hw/pc.c b/hw/pc.c
index 66f4635..1c068fb 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -55,6 +55,7 @@
 #define BIOS_CFG_IOPORT 0x510
 #define FW_CFG_ACPI_TABLES (FW_CFG_ARCH_LOCAL + 0)
 #define FW_CFG_SMBIOS_ENTRIES (FW_CFG_ARCH_LOCAL + 1)
+#define FW_CFG_IRQ0_OVERRIDE (FW_CFG_ARCH_LOCAL + 2)
 
 #define MAX_IDE_BUS 2
 
@@ -476,6 +477,7 @@ static void bochs_bios_init(void)
     fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size);
     fw_cfg_add_bytes(fw_cfg, FW_CFG_ACPI_TABLES, (uint8_t *)acpi_tables,
                      acpi_tables_len);
+    fw_cfg_add_bytes(fw_cfg, FW_CFG_IRQ0_OVERRIDE, &irq0override, 1);
 
     smbios_table = smbios_get_table(&smbios_len);
     if (smbios_table)
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 5526d8f..89337e9 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -909,7 +909,11 @@ int kvm_arch_init_irq_routing(void)
             return r;
     }
     for (i = 0; i < 24; ++i) {
-        r = kvm_add_irq_route(kvm_context, i, KVM_IRQCHIP_IOAPIC, i);
+        if (i == 0) {
+            r = kvm_add_irq_route(kvm_context, i, KVM_IRQCHIP_IOAPIC, 2);
+        } else if (i != 2) {
+            r = kvm_add_irq_route(kvm_context, i, KVM_IRQCHIP_IOAPIC, i);
+        }
         if (r < 0)
             return r;
     }
diff --git a/qemu-kvm.h b/qemu-kvm.h
index fa40542..6bbafbc 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -169,6 +169,7 @@ int handle_tpr_access(void *opaque, kvm_vcpu_context_t vcpu,
 #define kvm_enabled() (kvm_allowed)
 #define qemu_kvm_irqchip_in_kernel() kvm_irqchip_in_kernel(kvm_context)
 #define qemu_kvm_pit_in_kernel() kvm_pit_in_kernel(kvm_context)
+#define qemu_kvm_has_gsi_routing() kvm_has_gsi_routing(kvm_context)
 #define kvm_has_sync_mmu() qemu_kvm_has_sync_mmu()
 void kvm_init_vcpu(CPUState *env);
 void kvm_load_tsc(CPUState *env);
@@ -177,6 +178,7 @@ void kvm_load_tsc(CPUState *env);
 #define kvm_nested 0
 #define qemu_kvm_irqchip_in_kernel() (0)
 #define qemu_kvm_pit_in_kernel() (0)
+#define qemu_kvm_has_gsi_routing() (0)
 #define kvm_has_sync_mmu() (0)
 #define kvm_load_registers(env) do {} while(0)
 #define kvm_save_registers(env) do {} while(0)
diff --git a/sysemu.h b/sysemu.h
index 47d001e..f78e974 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -108,6 +108,7 @@ extern int xenfb_enabled;
 extern int graphic_width;
 extern int graphic_height;
 extern int graphic_depth;
+extern uint8_t irq0override;
 extern DisplayType display_type;
 extern const char *keyboard_layout;
 extern int win2k_install_hack;
diff --git a/vl.c b/vl.c
index 2fda17b..9b1d1ab 100644
--- a/vl.c
+++ b/vl.c
@@ -253,6 +253,7 @@ int no_reboot = 0;
 int no_shutdown = 0;
 int cursor_hide = 1;
 int graphic_rotate = 0;
+uint8_t irq0override = 1;
 #ifndef _WIN32
 int daemonize = 0;
 #endif
@@ -6054,8 +6055,14 @@ int main(int argc, char **argv, char **envp)
 
     module_call_init(MODULE_INIT_DEVICE);
 
-    if (kvm_enabled())
-        kvm_init_ap();
+    if (kvm_enabled()) {
+        kvm_init_ap();
+#ifdef USE_KVM
+        if (kvm_irqchip && !qemu_kvm_has_gsi_routing()) {
+            irq0override = 0;
+        }
+#endif
+    }
 
     machine->init(ram_size, boot_devices,
                   kernel_filename, kernel_cmdline, initrd_filename, cpu_model);
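The mapping this patch configures can be summarised in a small standalone sketch. This is not code from the series: `isa_irq_to_gsi`, `isa_entry_count`, `ISA_NUM_IRQS` and `GSI_UNUSED` are hypothetical names used only for illustration, and the sketch covers just the 16 ISA IRQs. It assumes that with the override enabled IRQ0 is delivered on I/O APIC input 2 and source IRQ 2 gets no entry, which is what the BIOS and routing changes in this series appear to do.

```c
#include <assert.h>

#define ISA_NUM_IRQS 16
#define GSI_UNUSED   (-1)   /* hypothetical marker: no I/O APIC entry for this IRQ */

/* Hypothetical helper: map an ISA IRQ to an I/O APIC input pin (GSI). */
int isa_irq_to_gsi(int irq, int irq0_override)
{
    assert(irq >= 0 && irq < ISA_NUM_IRQS);
    if (!irq0_override)
        return irq;         /* identity mapping when the override is off */
    if (irq == 0)
        return 2;           /* irq0 -> inti2 */
    if (irq == 2)
        return GSI_UNUSED;  /* source IRQ 2 is unused when the override is on */
    return irq;
}

/* Count the ISA interrupt entries that would be emitted. */
int isa_entry_count(int irq0_override)
{
    int n = 0;
    for (int irq = 0; irq < ISA_NUM_IRQS; irq++)
        if (isa_irq_to_gsi(irq, irq0_override) != GSI_UNUSED)
            n++;
    return n;
}
```

With the override enabled the sketch yields one entry fewer than without it, which lines up with the mp table entry count dropping from MAX_CPUS + 18 to MAX_CPUS + 17 in the BIOS patch.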
[PATCH 3/5] BIOS changes for KVM HPET (v6)
Signed-off-by: Beth Kon e...@us.ibm.com
---
 kvm/bios/acpi-dsdt.dsl |    2 --
 kvm/bios/rombios32.c   |   11 +++
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/kvm/bios/acpi-dsdt.dsl b/kvm/bios/acpi-dsdt.dsl
index db57307..71d0a5e 100755
--- a/kvm/bios/acpi-dsdt.dsl
+++ b/kvm/bios/acpi-dsdt.dsl
@@ -296,7 +296,6 @@ DefinitionBlock (
             })
         }
 #ifdef BX_QEMU
-#ifdef HPET_WORKS_IN_KVM
         Device(HPET) {
             Name(_HID, EISAID("PNP0103"))
             Name(_UID, 0)
@@ -316,7 +315,6 @@ DefinitionBlock (
             })
         }
 #endif
-#endif
     }
 
     Scope(\_SB.PCI0) {
diff --git a/kvm/bios/rombios32.c b/kvm/bios/rombios32.c
index 9d6910e..1106f38 100755
--- a/kvm/bios/rombios32.c
+++ b/kvm/bios/rombios32.c
@@ -1518,8 +1518,8 @@ struct acpi_20_generic_address {
 } __attribute__((__packed__));
 
 /*
- * * HPET Description Table
- * */
+ * HPET Description Table
+ */
 struct acpi_20_hpet {
     ACPI_TABLE_HEADER_DEF /* ACPI common table header */
     uint32_t timer_block_id;
@@ -1703,13 +1703,11 @@ void acpi_bios_init(void)
     addr += madt_size;
 
 #ifdef BX_QEMU
-#ifdef HPET_WORKS_IN_KVM
     addr = (addr + 7) & ~7;
     hpet_addr = addr;
     hpet = (void *)(addr);
     addr += sizeof(*hpet);
 #endif
-#endif
 
     /* RSDP */
     memset(rsdp, 0, sizeof(*rsdp));
@@ -1883,7 +1881,6 @@ void acpi_bios_init(void)
     }
 
     /* HPET */
-#ifdef HPET_WORKS_IN_KVM
     memset(hpet, 0, sizeof(*hpet));
     /* Note timer_block_id value must be kept in sync with value advertised by
      * emulated hpet
@@ -1892,7 +1889,6 @@ void acpi_bios_init(void)
     hpet->addr.address = cpu_to_le32(ACPI_HPET_ADDRESS);
     acpi_build_table_header((struct acpi_table_header *)hpet,
                              "HPET", sizeof(*hpet), 1);
-#endif
 
     acpi_additional_tables(); /* resets cfg to required entry */
     for(i = 0; i < external_tables; i++) {
@@ -1912,8 +1908,7 @@ void acpi_bios_init(void)
     /* kvm has no ssdt (processors are in dsdt) */
 //  rsdt->table_offset_entry[nb_rsdt_entries++] = cpu_to_le32(ssdt_addr);
 #ifdef BX_QEMU
-    /* No HPET (yet) */
-//  rsdt->table_offset_entry[nb_rsdt_entries++] = cpu_to_le32(hpet_addr);
+    rsdt->table_offset_entry[nb_rsdt_entries++] = cpu_to_le32(hpet_addr);
     if (nb_numa_nodes > 0)
         rsdt->table_offset_entry[nb_rsdt_entries++] = cpu_to_le32(srat_addr);
 #endif
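The HPET table is placed at an 8-byte aligned address by rounding `addr` up with a mask. A minimal standalone sketch of that idiom follows; `align_up` is a hypothetical helper written for illustration and does not exist in rombios32.c, which open-codes the 8-byte case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round addr up to the next multiple of align.
 * align must be a non-zero power of two for the mask trick to work. */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

Calling `align_up(addr, 8)` computes the same value as the `(addr + 7) & ~7` expression in the BIOS code: an already aligned address is returned unchanged, anything else moves up to the next multiple of 8.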
[PATCH 4/5] Userspace changes for KVM HPET (v6)
The big change here is handling of enabling/disabling of hpet legacy mode. When hpet enters legacy mode, the spec says that the pit stops generating interrupts. In practice, we want to stop the pit periodic timer from running because it is wasteful in a virtual environment.

We also have to worry about the hpet leaving legacy mode (which, at least in linux, happens only during a shutdown or crash). At this point, according to the hpet spec, PIT interrupts need to be reenabled. For us, it means the PIT timer needs to be restarted.

This patch handles this situation better than the previous version by coming closer to just disabling PIT interrupts. It allows the PIT state to change if the OS modifies it, even while PIT is disabled, but does not allow a pit timer to start. Then if HPET legacy mode is disabled, whatever the PIT state is at that point, the PIT timer is restarted accordingly.

Signed-off-by: Beth Kon e...@us.ibm.com
---

diff --git a/hw/hpet.c b/hw/hpet.c
index 29db325..043b92b 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -206,6 +206,9 @@ static int hpet_load(QEMUFile *f, void *opaque, int version_id)
             qemu_get_timer(f, s->timer[i].qemu_timer);
         }
     }
+    if (hpet_in_legacy_mode()) {
+        hpet_disable_pit();
+    }
     return 0;
 }
 
@@ -475,9 +478,11 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
                 }
                 /* i8254 and RTC are disabled when HPET is in legacy mode */
                 if (activating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
-                    hpet_pit_disable();
+                    hpet_disable_pit();
+                    dprintf("qemu: hpet disabled pit\n");
                 } else if (deactivating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
-                    hpet_pit_enable();
+                    hpet_enable_pit();
+                    dprintf("qemu: hpet enabled pit\n");
                 }
                 break;
             case HPET_CFG + 4:
@@ -554,13 +559,15 @@ static void hpet_reset(void *opaque) {
     /* 64-bit main counter; 3 timers supported; LegacyReplacementRoute. */
     s->capability = 0x8086a201ULL;
     s->capability |= ((HPET_CLK_PERIOD) << 32);
-    if (count > 0)
+    if (count > 0) {
         /* we don't enable pit when hpet_reset is first called (by hpet_init)
          * because hpet is taking over for pit here. On subsequent invocations,
          * hpet_reset is called due to system reset. At this point control must
          * be returned to pit until SW reenables hpet.
          */
-        hpet_pit_enable();
+        hpet_enable_pit();
+        dprintf("qemu: hpet enabled pit\n");
+    }
     count = 1;
 }
 
diff --git a/hw/i8254.c b/hw/i8254.c
index 2f229f9..8c8076f 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -25,6 +25,7 @@
 #include "pc.h"
 #include "isa.h"
 #include "qemu-timer.h"
+#include "qemu-kvm.h"
 #include "i8254.h"
 
 //#define DEBUG_PIT
@@ -198,6 +199,9 @@ int pit_get_mode(PITState *pit, int channel)
 
 static inline void pit_load_count(PITChannelState *s, int val)
 {
+    if (s->channel == 0 && pit_state.hpet_legacy_mode) {
+        return;
+    }
     if (val == 0)
         val = 0x10000;
     s->count_load_time = qemu_get_clock(vm_clock);
@@ -371,10 +375,11 @@ static void pit_irq_timer_update(PITChannelState *s, int64_t current_time)
            (double)(expire_time - current_time) / ticks_per_sec);
 #endif
     s->next_transition_time = expire_time;
-    if (expire_time != -1)
+    if (expire_time != -1) {
         qemu_mod_timer(s->irq_timer, expire_time);
-    else
+    } else {
         qemu_del_timer(s->irq_timer);
+    }
 }
 
 static void pit_irq_timer(void *opaque)
@@ -451,6 +456,7 @@ void pit_reset(void *opaque)
     PITChannelState *s;
     int i;
 
+    pit->hpet_legacy_mode = 0;
     for(i = 0;i < 3; i++) {
         s = &pit->channels[i];
         s->mode = 3;
@@ -460,32 +466,43 @@ void pit_reset(void *opaque)
 }
 
 /* When HPET is operating in legacy mode, i8254 timer0 is disabled */
-void hpet_pit_disable(void) {
-    PITChannelState *s;
-    s = &pit_state.channels[0];
-    if (s->irq_timer)
-        qemu_del_timer(s->irq_timer);
+
+void hpet_disable_pit(void)
+{
+    PITChannelState *s = &pit_state.channels[0];
+    if (qemu_kvm_pit_in_kernel()) {
+        kvm_hpet_disable_kpit();
+    } else {
+        if (s->irq_timer) {
+            qemu_del_timer(s->irq_timer);
+        }
+    }
 }
 
 /* When HPET is reset or leaving legacy mode, it must reenable i8254
  * timer 0
  */
-void hpet_pit_enable(void)
+void hpet_enable_pit(void)
 {
     PITState *pit = &pit_state;
-    PITChannelState *s;
-    s = &pit->channels[0];
-    s->mode = 3;
-    s->gate = 1;
-    pit_load_count(s, 0);
+    PITChannelState *s = &pit->channels[0];
+    if (qemu_kvm_pit_in_kernel()) {
+        kvm_hpet_enable_kpit();
+    } else {
+        pit_load_count(s, s->count);
+    }
 }
 
 PITState *pit_init(int base, qemu_irq irq)
 {
     PITState *pit = &pit_state;
     PITChannelState *s;
+    int i;
[PATCH 5/5] HPET interaction with in-kernel PIT (v6)
Signed-off-by: Beth Kon e...@us.ibm.com
---
 arch/x86/include/asm/kvm.h |    1 +
 arch/x86/kvm/i8254.c       |   24 +++-
 arch/x86/kvm/i8254.h       |    3 ++-
 arch/x86/kvm/x86.c         |    5 -
 4 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
index 708b9c3..3c44923 100644
--- a/arch/x86/include/asm/kvm.h
+++ b/arch/x86/include/asm/kvm.h
@@ -235,6 +235,7 @@ struct kvm_guest_debug_arch {
 
 struct kvm_pit_state {
 	struct kvm_pit_channel_state channels[3];
+	u8 hpet_legacy_mode;
 };
 
 struct kvm_reinject_control {
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 331705f..bb8382b 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -340,10 +340,20 @@ static void pit_load_count(struct kvm *kvm, int channel, u32 val)
 	}
 }
 
-void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val)
+void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val, int hpet_legacy_start)
 {
+	u8 saved_mode;
 	mutex_lock(&kvm->arch.vpit->pit_state.lock);
-	pit_load_count(kvm, channel, val);
+	if (hpet_legacy_start) {
+		/* save existing mode for later reenablement */
+		saved_mode = kvm->arch.vpit->pit_state.channels[0].mode;
+		kvm->arch.vpit->pit_state.channels[0].mode = 0xff; /* disable timer */
+		pit_load_count(kvm, channel, val);
+		kvm->arch.vpit->pit_state.channels[0].mode = saved_mode;
+	} else {
+		if (!(channel == 0 && kvm->arch.vpit->pit_state.hpet_legacy_mode))
+			pit_load_count(kvm, channel, val);
+	}
 	mutex_unlock(&kvm->arch.vpit->pit_state.lock);
 }
 
@@ -411,17 +421,20 @@ static void pit_ioport_write(struct kvm_io_device *this,
 		switch (s->write_state) {
 		default:
 		case RW_STATE_LSB:
-			pit_load_count(kvm, addr, val);
+			if (!(addr == 0 && pit_state->hpet_legacy_mode))
+				pit_load_count(kvm, addr, val);
 			break;
 		case RW_STATE_MSB:
-			pit_load_count(kvm, addr, val << 8);
+			if (!(addr == 0 && pit_state->hpet_legacy_mode))
+				pit_load_count(kvm, addr, val << 8);
 			break;
 		case RW_STATE_WORD0:
 			s->write_latch = val;
 			s->write_state = RW_STATE_WORD1;
 			break;
 		case RW_STATE_WORD1:
-			pit_load_count(kvm, addr, s->write_latch | (val << 8));
+			if (!(addr == 0 && pit_state->hpet_legacy_mode))
+				pit_load_count(kvm, addr, s->write_latch | (val << 8));
 			s->write_state = RW_STATE_WORD0;
 			break;
 		}
@@ -548,6 +561,7 @@ void kvm_pit_reset(struct kvm_pit *pit)
 	struct kvm_kpit_channel_state *c;
 
 	mutex_lock(&pit->pit_state.lock);
+	pit->pit_state.hpet_legacy_mode = 0;
 	for (i = 0; i < 3; i++) {
 		c = &pit->pit_state.channels[i];
 		c->mode = 0xff;
diff --git a/arch/x86/kvm/i8254.h b/arch/x86/kvm/i8254.h
index b267018..b5967ca 100644
--- a/arch/x86/kvm/i8254.h
+++ b/arch/x86/kvm/i8254.h
@@ -21,6 +21,7 @@ struct kvm_kpit_channel_state {
 
 struct kvm_kpit_state {
 	struct kvm_kpit_channel_state channels[3];
+	u8 hpet_legacy_mode;
 	struct kvm_timer pit_timer;
 	bool is_periodic;
 	u32 speaker_data_on;
@@ -49,7 +50,7 @@ struct kvm_pit {
 #define KVM_PIT_CHANNEL_MASK 0x3
 
 void kvm_inject_pit_timer_irqs(struct kvm_vcpu *vcpu);
-void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val);
+void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val, int hpet_legacy_start);
 struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags);
 void kvm_free_pit(struct kvm *kvm);
 void kvm_pit_reset(struct kvm_pit *pit);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1b91ea7..3c70545 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1948,9 +1948,12 @@ static int kvm_vm_ioctl_get_pit(struct kvm *kvm, struct kvm_pit_state *ps)
 static int kvm_vm_ioctl_set_pit(struct kvm *kvm, struct kvm_pit_state *ps)
 {
 	int r = 0;
+	int hpet_legacy_start = 0;
 
+	if (ps->hpet_legacy_mode && !kvm->arch.vpit->pit_state.hpet_legacy_mode)
+		hpet_legacy_start = 1;
 	memcpy(&kvm->arch.vpit->pit_state, ps, sizeof(struct kvm_pit_state));
-	kvm_pit_load_count(kvm, 0, ps->channels[0].count, hpet_legacy_start);
+	kvm_pit_load_count(kvm, 0, ps->channels[0].count, hpet_legacy_start);
 	return r;
 }
 
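The multiplexer behaviour described in this series (PIT channel 0 is silenced while HPET is in legacy mode, guest changes are remembered, and the timer restarts from the last programmed state when legacy mode ends) can be modelled with a toy state machine. This is only a sketch under that reading of the description, not the code from either the userspace or the in-kernel PIT; `struct toy_pit`, `toy_pit_load_count` and `toy_hpet_set_legacy` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of PIT channel 0 gated by HPET legacy mode (hypothetical names). */
struct toy_pit {
    bool hpet_legacy_mode;  /* true while HPET has taken over the timer interrupt */
    bool timer_running;     /* whether channel 0 would currently raise interrupts */
    uint32_t count;         /* last count programmed by the guest */
};

/* Guest programs a new count into channel 0. */
void toy_pit_load_count(struct toy_pit *p, uint32_t val)
{
    assert(p != 0);
    p->count = (val == 0) ? 0x10000 : val;  /* a count of 0 means 65536 on an 8254 */
    if (!p->hpet_legacy_mode)
        p->timer_running = true;            /* arm the timer only when the PIT is live */
}

/* HPET enters or leaves legacy mode. */
void toy_hpet_set_legacy(struct toy_pit *p, bool enable)
{
    assert(p != 0);
    if (enable && !p->hpet_legacy_mode)
        p->timer_running = false;           /* PIT stops generating interrupts */
    else if (!enable && p->hpet_legacy_mode)
        p->timer_running = true;            /* restart from the remembered count */
    p->hpet_legacy_mode = enable;
}
```

In this model a count written while legacy mode is active is stored but does not start the timer, and leaving legacy mode resumes with whatever count the guest programmed last.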
Re: [KVM PATCH v2 0/2] irqfd: use POLLHUP notification for close()
On Thu, Jun 04, 2009 at 08:48:02AM -0400, Gregory Haskins wrote:
> (Applies to kvm.git/master:25deed73)
>
> Please see the header for 2/2 for a description. This patch series has been
> fully tested and appears to be working correctly.
>
> [Review notes:
> *) Paul has looked at the SRCU design and, to my knowledge, didn't find
> any holes.
> *) Michael, Avi, and myself agree that while the removal of the DEASSIGN
> vector is not desirable, the fix on close() is more important in
> the short-term. We can always add DEASSIGN support again in the
> future with a CAP bit.
> ]

So, I've been thinking about this, and this approach has another problem: it depends on pollhup on close which is AFAIK an eventfd-specific feature. This will prevent us from supporting polling other useful file types, such as sockets and pipes, down the road, with this interface. And there's DEASSIGN issue which is needed for migration and MSI vector remapping.

I didn't realise these implications when I suggested deassign on close. To me, it now looks like we are better off reverting this patch. We can later add 'deassign on close' support with CAP bit after all :)

Avi, Gregory, what's your take?

--
MST
Re: [KVM PATCH v2 0/2] irqfd: use POLLHUP notification for close()
[ Resending with correct address for Davide. Pls don't reply to the original one, you'll get bounces. ]

On Thu, Jun 04, 2009 at 08:48:02AM -0400, Gregory Haskins wrote:
> (Applies to kvm.git/master:25deed73)
>
> Please see the header for 2/2 for a description. This patch series has been
> fully tested and appears to be working correctly.
>
> [Review notes:
> *) Paul has looked at the SRCU design and, to my knowledge, didn't find
> any holes.
> *) Michael, Avi, and myself agree that while the removal of the DEASSIGN
> vector is not desirable, the fix on close() is more important in
> the short-term. We can always add DEASSIGN support again in the
> future with a CAP bit.
> ]

So, I've been thinking about this, and this approach has another problem: it depends on pollhup on close which is AFAIK an eventfd-specific feature. This will prevent us from supporting polling other useful file types, such as sockets and pipes, down the road, with this interface. And there's DEASSIGN issue which is needed for migration and MSI vector remapping.

I didn't realise these implications when I suggested deassign on close. To me, it now looks like we are better off reverting this patch. We can later add 'deassign on close' support with CAP bit after all :)

Avi, Gregory, what's your take?

--
MST