[COMMIT master] Merge branch 'upstream-merge'
From: Avi Kivity a...@redhat.com

* upstream-merge: (170 commits)
  Drop redundant pci_get_bus_devfn() declaration
  Fix old-style function definitions in kvm specific code
  Adjust pre_save()/post_load() vmstate callbacks to upstream usage
  Rename pci_create_noinit() to pci_create()
  Fix pci_add nic not to exit on bad model
  pci_create() is now unused, remove it
  Make it obvious that pci_nic_init() can't fail
  Fix pci_add storage not to exit on bad first argument
  Fix pci_vga_init() not to ignore bus argument
  set correct CS seg limit and flags on sipi
  Set revision in eeprom correctly for 82557 versions.
  restore CFLAGS check for conflict and fix recursive CFLAGS issue
  virtio-pci: return error if virtio_console_init fails
  qemu: clean up target page usage in msix
  qdev: show name of device that fails init
  vnc: Set invalid buffer pointers to NULL
  eepro100: Don't allow guests to fail assertions
  qcow2: Increase maximum cluster size to 2 MB
  qemu/virtio-pci: remove unnecessary check
  fix comment on cpu_register_physical_memory_offset
  ...

Signed-off-by: Avi Kivity a...@redhat.com
--
To unsubscribe from this list: send the line "unsubscribe kvm-commits" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[COMMIT master] Re-enable -Werror for kvm
From: Avi Kivity a...@redhat.com

Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/configure b/configure
index 6589dba..8866258 100755
--- a/configure
+++ b/configure
@@ -1833,8 +1833,6 @@ if test -z $werror ; then
 else
 werror=no
 fi
-# disable default werror for kvm
-werror=no
 fi
 
 if test $werror = yes ; then
[COMMIT master] test: vmexit: inl from pmtimer
From: Marcelo Tosatti mtosa...@redhat.com

Add inl(ACPI_PMTIMER_PORT) test.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/kvm/user/test/x86/vmexit.c b/kvm/user/test/x86/vmexit.c
index 5455919..c3a01e0 100644
--- a/kvm/user/test/x86/vmexit.c
+++ b/kvm/user/test/x86/vmexit.c
@@ -17,6 +17,13 @@ static inline unsigned long long rdtsc()
     return r;
 }
 
+static unsigned int inl(unsigned short port)
+{
+    unsigned int val;
+    asm volatile("inl %w1, %0" : "=a"(val) : "Nd"(port));
+    return val;
+}
+
 #define GOAL (1ull << 30)
 
 #ifdef __x86_64__
@@ -76,6 +83,11 @@ static void ipi_halt(void)
         ;
 }
 
+static void inl_pmtimer(void)
+{
+    inl(0xb008);
+}
+
 static struct test {
     void (*func)(void);
     const char *name;
@@ -86,6 +98,7 @@ static struct test {
     { vmcall, "vmcall", .parallel = 1, },
     { mov_from_cr8, "mov_from_cr8", .parallel = 1, },
     { mov_to_cr8, "mov_to_cr8" , .parallel = 1, },
+    { inl_pmtimer, "inl_from_pmtimer", .parallel = 1, },
     { ipi, "ipi", is_smp, .parallel = 0, },
     { ipi_halt, "ipi+halt", is_smp, .parallel = 0, },
 };
[COMMIT master] test: add on_cpu_async
From: Marcelo Tosatti mtosa...@redhat.com

Non-wait version of on_cpu.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/kvm/user/test/lib/x86/smp.c b/kvm/user/test/lib/x86/smp.c
index 9eface5..241f755 100644
--- a/kvm/user/test/lib/x86/smp.c
+++ b/kvm/user/test/lib/x86/smp.c
@@ -82,24 +82,38 @@ static void setup_smp_id(void *data)
     asm ("mov %0, %%gs:0" : : "r"(apic_id()) : "memory");
 }
 
-void on_cpu(int cpu, void (*function)(void *data), void *data)
+static void __on_cpu(int cpu, void (*function)(void *data), void *data,
+                     int wait)
 {
     spin_lock(&ipi_lock);
     if (cpu == smp_id())
         function(data);
     else {
+        ipi_done = 0;
         ipi_function = function;
         ipi_data = data;
         apic_icr_write(APIC_INT_ASSERT | APIC_DEST_PHYSICAL | APIC_DM_FIXED
                        | IPI_VECTOR,
                        cpu);
-        while (!ipi_done)
-            ;
-        ipi_done = 0;
+        if (wait) {
+            while (!ipi_done)
+                ;
+        }
     }
     spin_unlock(&ipi_lock);
 }
 
+void on_cpu(int cpu, void (*function)(void *data), void *data)
+{
+    __on_cpu(cpu, function, data, 1);
+}
+
+void on_cpu_async(int cpu, void (*function)(void *data), void *data)
+{
+    __on_cpu(cpu, function, data, 0);
+}
+
+
 void smp_init(void)
 {
     int i;
diff --git a/kvm/user/test/lib/x86/smp.h b/kvm/user/test/lib/x86/smp.h
index bac7e14..c2e7350 100644
--- a/kvm/user/test/lib/x86/smp.h
+++ b/kvm/user/test/lib/x86/smp.h
@@ -10,6 +10,7 @@ void smp_init(void);
 int cpu_count(void);
 int smp_id(void);
 void on_cpu(int cpu, void (*function)(void *data), void *data);
+void on_cpu_async(int cpu, void (*function)(void *data), void *data);
 void spin_lock(struct spinlock *lock);
 void spin_unlock(struct spinlock *lock);
[COMMIT master] Merge commit '1e7fbc6d3cfcffe1d490ab8851e712c6e98fa771'
From: Avi Kivity a...@redhat.com

* commit '1e7fbc6d3cfcffe1d490ab8851e712c6e98fa771':
  x86: fix miss merge

Signed-off-by: Avi Kivity a...@redhat.com
[COMMIT master] Revert "qemu/pci: reset device registers on bus reset"
From: Avi Kivity a...@redhat.com

This reverts commit c0b1905b285800cfd1a797347efeac8338bfa655.

It breaks the Windows XP install autotest: Windows seems to lose
hibernate support with this patch.

Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/hw/pci.c b/hw/pci.c
index e4d088e..4472910 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -95,20 +95,7 @@ static inline int pci_bar(int reg)
 
 static void pci_device_reset(PCIDevice *dev)
 {
-    int r;
-
     memset(dev->irq_state, 0, sizeof dev->irq_state);
-    dev->config[PCI_COMMAND] &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY |
-                                  PCI_COMMAND_MASTER);
-    dev->config[PCI_CACHE_LINE_SIZE] = 0x0;
-    dev->config[PCI_INTERRUPT_LINE] = 0x0;
-    for (r = 0; r < PCI_NUM_REGIONS; ++r) {
-        if (!dev->io_regions[r].size) {
-            continue;
-        }
-        pci_set_long(dev->config + pci_bar(r), dev->io_regions[r].type);
-    }
-    pci_update_mappings(dev);
 }
 
 static void pci_bus_reset(void *opaque)
[COMMIT master] Update source link
From: Jan Kiszka jan.kis...@siemens.com

This references KVM from stable 2.6.31.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/linux-2.6 b/linux-2.6
index 46c6cf6..abb015a 160000
--- a/linux-2.6
+++ b/linux-2.6
@@ -1 +1 @@
-Subproject commit 46c6cf63295e00af6092977800049a716757381f
+Subproject commit abb015ac65852287c7a7c243c8cdee966a38854d
[COMMIT master] x86: Pick up local arch trace headers
From: Jan Kiszka jan.kis...@siemens.com

This unbreaks 2.6.31 builds but also ensures that we always use the most
recent ones.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/include/arch/x86/kvm b/include/arch/x86/kvm
new file mode 120000
index 0000000..c635817
--- /dev/null
+++ b/include/arch/x86/kvm
@@ -0,0 +1 @@
+../../../x86
\ No newline at end of file
[COMMIT master] kvm_vma_kernel_pagesize support
From: Jan Kiszka jan.kis...@siemens.com

It was broken for !CONFIG_HUGETLB_PAGE and for kernel 2.6.31.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/external-module-compat-comm.h b/external-module-compat-comm.h
index c72fb86..47fdc86 100644
--- a/external-module-compat-comm.h
+++ b/external-module-compat-comm.h
@@ -954,9 +954,10 @@ static inline int kvm_eventfd_signal(struct eventfd_ctx *ctx, int n)
 
 #include <linux/hugetlb.h>
 
-/* vma_kernel_pagesize, 2.6.29 */
-#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,31)
+/* vma_kernel_pagesize, exported since 2.6.32 */
+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,32)
 
+#ifdef CONFIG_HUGETLB_PAGE
 static inline
 unsigned long kvm_vma_kernel_pagesize(struct vm_area_struct *vma)
 {
@@ -969,8 +970,11 @@ unsigned long kvm_vma_kernel_pagesize(struct vm_area_struct *vma)
     return 1UL << (hstate->order + PAGE_SHIFT);
 }
 
+#else /* !CONFIG_HUGETLB_SIZE */
+#define kvm_vma_kernel_pagesize(v) PAGE_SIZE
+#endif
 
-#else
+#else /* >= 2.6.32 */
 
 #define kvm_vma_kernel_pagesize vma_kernel_pagesize
 
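For readers following the arithmetic in the compat helper: the page size
is the base page size shifted left by the hstate order, and the
!CONFIG_HUGETLB_PAGE branch simply falls back to PAGE_SIZE. A minimal
user-space sketch of that calculation follows. The names sketch_hstate,
sketch_pagesize and SKETCH_PAGE_SHIFT are hypothetical stand-ins, and the
4 KiB base page is an assumption (true on x86), not something taken from
the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption for this sketch: 4 KiB base pages, as on x86. */
#define SKETCH_PAGE_SHIFT 12u

/* Hypothetical stand-in for the kernel's struct hstate; only the
 * order field matters for the size calculation. */
struct sketch_hstate {
    unsigned int order;
};

/* Returns the page size in bytes. A NULL hstate models the
 * !CONFIG_HUGETLB_PAGE fallback, which yields the base page size. */
static unsigned long sketch_pagesize(const struct sketch_hstate *h)
{
    unsigned int order = h ? h->order : 0;

    /* Guard against shifting past the width of unsigned long. */
    assert(order + SKETCH_PAGE_SHIFT < sizeof(unsigned long) * CHAR_BIT);
    return 1UL << (order + SKETCH_PAGE_SHIFT);
}
```

With order 9, as used for 2 MiB huge pages on x86, the helper returns
1UL << 21; with no hstate it returns the 4 KiB base page size.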
[COMMIT master] Fix warning in sync
From: Zachary Amsden zams...@redhat.com

Patch is self-explanatory

Signed-off-by: Zachary Amsden zams...@redhat.com
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/sync b/sync
index b09f629..0bbd488 100755
--- a/sync
+++ b/sync
@@ -97,6 +97,9 @@ def __hack(data):
             line = '#include <asm/types.h>'
         if match(r'\t\.change_pte.*kvm_mmu_notifier_change_pte,'):
             line = '#ifdef MMU_NOTIFIER_HAS_CHANGE_PTE\n' + line + '\n#endif'
+        if match(r'static void kvm_mmu_notifier_change_pte'):
+            line = sub(r'static ', '', line)
+            line = '#ifdef MMU_NOTIFIER_HAS_CHANGE_PTE\n' + 'static\n' + '#endif\n' + line
         line = sub(r'\bhrtimer_init\b', 'hrtimer_init_p', line)
         line = sub(r'\bhrtimer_start\b', 'hrtimer_start_p', line)
         line = sub(r'\bhrtimer_cancel\b', 'hrtimer_cancel_p', line)
[COMMIT master] Fix arch include for KVM trace headers
From: Jan Kiszka jan.kis...@siemens.com

Make sure recursive KVM trace header including works by adding the arch
source directory to the search path. This is at least required for
non-split kernel trees, but play safe and add it to both.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/Makefile b/Makefile
index ad08c45..37a14e1 100644
--- a/Makefile
+++ b/Makefile
@@ -29,7 +29,7 @@ all:: prerequisite
     LINUXINCLUDE=-I`pwd`/include -Iinclude \
         $(if $(KERNELSOURCEDIR),\
             -Iinclude2 -I$(KERNELSOURCEDIR)/include -I$(KERNELSOURCEDIR)/arch/${ARCH_DIR}/include, \
-            -Iarch/${ARCH_DIR}/include) -I`pwd`/include-compat \
+            -Iarch/${ARCH_DIR}/include) -I`pwd`/include-compat -I`pwd`/${ARCH_DIR} \
         -include include/linux/autoconf.h \
         -include `pwd`/$(ARCH_DIR)/external-module-compat.h $(module_defines) \
         $$@
[COMMIT master] ifdef out change_pte assignment
From: Marcelo Tosatti mtosa...@redhat.com

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/sync b/sync
index 539a3f0..b09f629 100755
--- a/sync
+++ b/sync
@@ -95,6 +95,8 @@ def __hack(data):
             line = ''
         if match(r'#include <linux\/types.h>'):
             line = '#include <asm/types.h>'
+        if match(r'\t\.change_pte.*kvm_mmu_notifier_change_pte,'):
+            line = '#ifdef MMU_NOTIFIER_HAS_CHANGE_PTE\n' + line + '\n#endif'
         line = sub(r'\bhrtimer_init\b', 'hrtimer_init_p', line)
         line = sub(r'\bhrtimer_start\b', 'hrtimer_start_p', line)
         line = sub(r'\bhrtimer_cancel\b', 'hrtimer_cancel_p', line)
[COMMIT master] x86: Remove zombie kvm_trace from build
From: Jan Kiszka jan.kis...@siemens.com

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/x86/Kbuild b/x86/Kbuild
index 0ccbeec..3499593 100644
--- a/x86/Kbuild
+++ b/x86/Kbuild
@@ -7,9 +7,6 @@ kvm-objs := kvm_main.o x86.o mmu.o emulate.o ../anon_inodes.o irq.o i8259.o \
     lapic.o ioapic.o preempt.o i8254.o coalesced_mmio.o irq_comm.o \
     timer.o eventfd.o assigned-dev.o \
     ../external-module-compat.o ../request-irq-compat.o
-ifeq ($(EXT_CONFIG_KVM_TRACE),y)
-kvm-objs += kvm_trace.o
-endif
 ifeq ($(CONFIG_IOMMU_API),y)
 kvm-objs += iommu.o
 endif
Re: [COMMIT master] ksm support
On 10/06/2009 09:21 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
>> From: Izik Eidus iei...@redhat.com
>>
>> Call madvise(MADV_MERGEABLE) on the memory allocations to allow the
>> kernel to merge them.
>>
>> Signed-off-by: Izik Eidus iei...@redhat.com
>> Signed-off-by: Avi Kivity a...@redhat.com
>
> Any reason to not send this to upstream qemu?

Should apply fine, I think.

--
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.
Re: [COMMIT master] ksm support
Avi Kivity wrote:
> On 10/06/2009 09:21 PM, Anthony Liguori wrote:
>> Avi Kivity wrote:
>>> From: Izik Eidus iei...@redhat.com
>>>
>>> Call madvise(MADV_MERGEABLE) on the memory allocations to allow the
>>> kernel to merge them.
>>>
>>> Signed-off-by: Izik Eidus iei...@redhat.com
>>> Signed-off-by: Avi Kivity a...@redhat.com
>>
>> Any reason to not send this to upstream qemu?
>
> Should apply fine, I think.

Yup, I'm happy to merge it if someone sends it to qemu-devel...
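For context, a minimal user-space sketch of what "call
madvise(MADV_MERGEABLE) on the memory allocations" can look like. This is
not the qemu patch itself: alloc_mergeable is a hypothetical helper, the
anonymous mmap() stands in for whatever allocator qemu uses for guest
RAM, and the hint is treated as best effort because madvise() fails with
EINVAL on kernels built without KSM.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map len bytes of anonymous memory and hint to
 * the kernel that identical pages in it may be merged by KSM.
 * Returns the mapping, or NULL if mmap() failed. The madvise() result
 * is reported through *merge_rc so callers can treat it as advisory. */
static void *alloc_mergeable(size_t len, int *merge_rc)
{
    void *p;

    assert(merge_rc != NULL);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

#ifdef MADV_MERGEABLE
    /* Best effort: fails with EINVAL when the kernel lacks KSM. */
    *merge_rc = madvise(p, len, MADV_MERGEABLE);
#else
    /* Headers older than the KSM merge do not define the flag. */
    *merge_rc = -1;
    errno = ENOSYS;
#endif
    return p;
}
```

The memory stays usable whether or not the hint is accepted, which is why
the sketch reports the madvise() result separately instead of failing the
allocation.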
Re: [Alacrityvm-devel] [PATCH v2 2/4] KVM: introduce xinterface API for external interaction with guests
On 10/06/2009 08:18 PM, Ira W. Snyder wrote:
> The limitation I have is that memory made available from the host
> system (PCI card) as PCI BAR1 must not be migrated around in memory. I
> can only change the address decoding to hit a specific physical
> address. AFAIK, this means it cannot be userspace memory (since the
> underlying physical page could change, or it could be in swap), and
> must be allocated with something like __get_free_pages() or
> dma_alloc_coherent().

Expose it as /dev/something (/dev/mem, /sys/.../pci/...) and mmap() it,
and it becomes non-pageable user memory.

Not sure about dma_alloc_coherent(), that is meaningless on x86.

--
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/4] KVM: introduce xinterface API for external interaction with guests
On 10/06/2009 09:40 PM, Gregory Haskins wrote:
> Thinking about this some more over lunch, I think we (Avi and I) might
> both be wrong (and David is right). Avi is right that we don't need
> rmb() or barrier() for the reasons already stated, but I think David
> is right that we need an smp_mb() to ensure the cpu doesn't do the
> reordering. Otherwise a different cpu could invalidate the memory if
> it reuses the freed memory in the meantime, iiuc. IOW: it's not a
> compiler issue but a cpu issue.
>
> Or am I still confused?

The sequence of operations is:

    v = p->v;
    f();
    // rmb() ?
    g(v);

You are worried that the compiler or cpu will fetch p->v after f() has
executed? The compiler may not, since it can't tell whether f() might
change p->v. If f() can cause another agent to write to p (by freeing it
to a global list, for example), then it is its responsibility to issue
the smp_rmb(), otherwise no calculation that took place before f() and
accessed p is safe.

--
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.
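The argument above, that the ordering belongs inside f() when f() hands
p to another agent, can be sketched in portable C11 as follows. This is
an illustration under assumptions, not kernel code: it swaps the kernel's
smp_*() barriers for a C11 release operation, and sketch_node,
sketch_free_list, sketch_release and sketch_read_then_release are
hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object that is read and then handed back to a pool. */
struct sketch_node {
    int v;
    struct sketch_node *next;
};

/* Global free list that other agents may pop from and reuse. */
static _Atomic(struct sketch_node *) sketch_free_list;

/* Plays the role of f(): publishes p to other agents. The release
 * ordering on the successful compare-exchange keeps every earlier
 * load and store of *p from being reordered after the publication,
 * so callers need no barrier of their own. */
static void sketch_release(struct sketch_node *p)
{
    struct sketch_node *head;

    assert(p != NULL);
    head = atomic_load_explicit(&sketch_free_list, memory_order_relaxed);
    do {
        p->next = head;
    } while (!atomic_compare_exchange_weak_explicit(
                 &sketch_free_list, &head, p,
                 memory_order_release, memory_order_relaxed));
}

/* v = p->v; f(); g(v): the value is captured before the hand-off and
 * only the local copy is used afterwards. */
static int sketch_read_then_release(struct sketch_node *p)
{
    int v = p->v;

    sketch_release(p);
    return v;
}
```

A consumer that pops from the list would pair this with an acquire load;
that side is omitted here because the thread only discusses the
releasing side.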
Re: [Qemu-commits] [COMMIT c0b1905] qemu/pci: reset device registers on bus reset
On 10/05/2009 04:53 PM, Anthony Liguori wrote:
> From: Michael S. Tsirkin m...@redhat.com
>
> Reset BARs and a couple of other registers on bus reset, as per PCI
> spec.

This commit breaks Windows XP restart. After a restart Windows switches
from 800x600 cirrus logic vga to 640x480 standard vga.

My guess is that this is due to two mutually-cancelling bugs:

- the bios fails to initialize one of the registers touched below
- qemu sets up that register in the state that Windows expects the bios
  to leave it in, instead of the on-reset state

Once we perform the reset the register reverts to its correct reset
state, the bios fails to initialize it, and Windows ignores the device.

I reverted this commit from qemu-kvm.git.

> diff --git a/hw/pci.c b/hw/pci.c
> index 2dd7213..e2f88ff 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -92,7 +92,20 @@ static inline int pci_bar(int reg)
>
>  static void pci_device_reset(PCIDevice *dev)
>  {
> +    int r;
> +
>      memset(dev->irq_state, 0, sizeof dev->irq_state);
> +    dev->config[PCI_COMMAND] &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY |
> +                                  PCI_COMMAND_MASTER);
> +    dev->config[PCI_CACHE_LINE_SIZE] = 0x0;
> +    dev->config[PCI_INTERRUPT_LINE] = 0x0;
> +    for (r = 0; r < PCI_NUM_REGIONS; ++r) {
> +        if (!dev->io_regions[r].size) {
> +            continue;
> +        }
> +        pci_set_long(dev->config + pci_bar(r), dev->io_regions[r].type);
> +    }
> +    pci_update_mappings(dev);
>  }
>
>  static void pci_bus_reset(void *opaque)

--
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.
Re: [Qemu-commits] [COMMIT c0b1905] qemu/pci: reset device registers on bus reset
On Wed, Oct 07, 2009 at 10:20:19AM +0200, Avi Kivity wrote:
> On 10/05/2009 04:53 PM, Anthony Liguori wrote:
>> From: Michael S. Tsirkin m...@redhat.com
>>
>> Reset BARs and a couple of other registers on bus reset, as per PCI
>> spec.
>
> This commit breaks Windows XP restart.

I'll look into this.

> After a restart Windows switches from 800x600 cirrus logic vga to
> 640x480 standard vga.
>
> My guess is that this is due to two mutually-cancelling bugs:
>
> - the bios fails to initialize one of the registers touched below
> - qemu sets up that register in the state that Windows expects the
>   bios to leave it in, instead of the on-reset state
>
> Once we perform the reset the register reverts to its correct reset
> state, the bios fails to initialize it, and Windows ignores the device.
>
> I reverted this commit from qemu-kvm.git.
>
>> diff --git a/hw/pci.c b/hw/pci.c
>> index 2dd7213..e2f88ff 100644
>> --- a/hw/pci.c
>> +++ b/hw/pci.c
>> @@ -92,7 +92,20 @@ static inline int pci_bar(int reg)
>>
>>  static void pci_device_reset(PCIDevice *dev)
>>  {
>> +    int r;
>> +
>>      memset(dev->irq_state, 0, sizeof dev->irq_state);
>> +    dev->config[PCI_COMMAND] &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY |
>> +                                  PCI_COMMAND_MASTER);
>> +    dev->config[PCI_CACHE_LINE_SIZE] = 0x0;
>> +    dev->config[PCI_INTERRUPT_LINE] = 0x0;
>> +    for (r = 0; r < PCI_NUM_REGIONS; ++r) {
>> +        if (!dev->io_regions[r].size) {
>> +            continue;
>> +        }
>> +        pci_set_long(dev->config + pci_bar(r), dev->io_regions[r].type);
>> +    }
>> +    pci_update_mappings(dev);
>>  }
>>
>>  static void pci_bus_reset(void *opaque)
>
> --
> Do not meddle in the internals of kernels, for they are subtle and
> quick to panic.
Network packet size over MTU in guests
Hi,

We use KVM 75 on Gentoo hosts with kernel 2.6.27.8 with a variety of
guests (Windows XP, 2000, 2003, several Linux distributions) in a
production setup. The hosts are Intel Xeon based. We use bridged
networking with the E1000 model inside the guests.

Recently, we began to have network problems (slowdowns, lags, timeouts)
on several guests. Inside those guests, we saw with tcpdump that there
were packets with a size well over the MTU. On the host, tcpdump on the
guest tap showed only packets of 1500 bytes.

For some of the guests (Debian based), changing the NIC model to rtl8139
instead of e1000 fixed the problem, but for others (Gentoo kernel
2.6.27.19) it didn't.

Would it be a glitch in the KVM code or a driver problem? Does this ring
a bell to anyone?

Thanks.

FT
Re: [Qemu-commits] [COMMIT c0b1905] qemu/pci: reset device registers on bus reset
On Wed, Oct 07, 2009 at 10:20:19AM +0200, Avi Kivity wrote:
> On 10/05/2009 04:53 PM, Anthony Liguori wrote:
>> From: Michael S. Tsirkin m...@redhat.com
>>
>> Reset BARs and a couple of other registers on bus reset, as per PCI
>> spec.
>
> This commit breaks Windows XP restart. After a restart Windows
> switches from 800x600 cirrus logic vga to 640x480 standard vga.
>
> My guess is that this is due to two mutually-cancelling bugs:
>
> - the bios fails to initialize one of the registers touched below
> - qemu sets up that register in the state that Windows expects the
>   bios to leave it in, instead of the on-reset state

Yes, cirrus does this on init:

    pci_conf[0x04] = PCI_COMMAND_IOACCESS | PCI_COMMAND_MEMACCESS;

Trying to understand what the right thing to do is.

> Once we perform the reset the register reverts to its correct reset
> state, the bios fails to initialize it, and Windows ignores the device.
>
> I reverted this commit from qemu-kvm.git.
>
>> diff --git a/hw/pci.c b/hw/pci.c
>> index 2dd7213..e2f88ff 100644
>> --- a/hw/pci.c
>> +++ b/hw/pci.c
>> @@ -92,7 +92,20 @@ static inline int pci_bar(int reg)
>>
>>  static void pci_device_reset(PCIDevice *dev)
>>  {
>> +    int r;
>> +
>>      memset(dev->irq_state, 0, sizeof dev->irq_state);
>> +    dev->config[PCI_COMMAND] &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY |
>> +                                  PCI_COMMAND_MASTER);
>> +    dev->config[PCI_CACHE_LINE_SIZE] = 0x0;
>> +    dev->config[PCI_INTERRUPT_LINE] = 0x0;
>> +    for (r = 0; r < PCI_NUM_REGIONS; ++r) {
>> +        if (!dev->io_regions[r].size) {
>> +            continue;
>> +        }
>> +        pci_set_long(dev->config + pci_bar(r), dev->io_regions[r].type);
>> +    }
>> +    pci_update_mappings(dev);
>>  }
>>
>>  static void pci_bus_reset(void *opaque)
>
> --
> Do not meddle in the internals of kernels, for they are subtle and
> quick to panic.
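The interaction discussed in this thread can be modelled in a few lines
of plain C: a device model enables I/O and memory decoding at init time,
a spec-conforming bus reset clears those bits, and unless firmware sets
them again the device stays disabled. This is only a toy sketch of that
hypothesis, not qemu code. The toy_* names are hypothetical; the register
offset and bit values are the standard PCI command register definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard PCI configuration-space command register and its bits. */
#define PCI_COMMAND        0x04
#define PCI_COMMAND_IO     0x1
#define PCI_COMMAND_MEMORY 0x2
#define PCI_COMMAND_MASTER 0x4

/* Hypothetical, minimal device: just a config space. */
struct toy_pci_dev {
    uint8_t config[256];
};

/* Models a device that enables decoding itself at init time,
 * as the cirrus line quoted above does. */
static void toy_init(struct toy_pci_dev *d)
{
    assert(d != NULL);
    d->config[PCI_COMMAND] = PCI_COMMAND_IO | PCI_COMMAND_MEMORY;
}

/* Models the command-register part of the bus reset: clear I/O,
 * memory and bus-master enables, leave the other bits alone. */
static void toy_bus_reset(struct toy_pci_dev *d)
{
    assert(d != NULL);
    d->config[PCI_COMMAND] &= (uint8_t)~(PCI_COMMAND_IO |
                                         PCI_COMMAND_MEMORY |
                                         PCI_COMMAND_MASTER);
}

/* Nonzero while the device responds to I/O or memory accesses. */
static int toy_decoding_enabled(const struct toy_pci_dev *d)
{
    assert(d != NULL);
    return (d->config[PCI_COMMAND] &
            (PCI_COMMAND_IO | PCI_COMMAND_MEMORY)) != 0;
}
```

After toy_init() decoding is enabled; after toy_bus_reset() it stays off
until something writes the command register again, which in the scenario
described above would have to be the BIOS.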
buildbot failure in qemu-kvm on default_i386_debian_5_0
The Buildbot has detected a new failure of default_i386_debian_5_0 on qemu-kvm.
Full details are available at:
 http://buildbot.b1-systems.de/qemu-kvm/builders/default_i386_debian_5_0/builds/102

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_2

Build Reason:
Build Source Stamp: [branch master] HEAD
Blamelist: Reimar Döffinger reimar.doeffin...@gmx.de, Amit Shah amit.s...@redhat.com, Andre Przywara andre.przyw...@amd.com, Anthony Liguori aligu...@us.ibm.com, Aurelien Jarno aurel...@aurel32.net, Avi Kivity a...@redhat.com, Blue Swirl blauwir...@gmail.com, Edgar E. Iglesias edgar.igles...@gmail.com, Gerd Hoffmann kra...@redhat.com, Glauber Costa glom...@mothafucka.localdomain, Glauber Costa glom...@redhat.com, Gleb Natapov g...@redhat.com, Jan Kiszka jan.kis...@siemens.com, Jan Kiszka jan.kis...@web.de, Juan Quintela quint...@redhat.com, Kevin Wolf kw...@redhat.com, Kevin Wolf m...@kevin-wolf.de, Laurent Desnogues laurent.desnog...@gmail.com, Luiz Capitulino lcapitul...@redhat.com, Marcelo Tosatti mtosa...@redhat.com, Markus Armbruster arm...@redhat.com, Michael S. Tsirkin m...@redhat.com, Paul Bolle pebo...@tiscali.nl, Stefan Weil w...@mail.berlios.de, Thomas Monjalon thomas...@monjalon.net, malc av1...@comtv.ru

BUILD FAILED: failed compile

sincerely,
 -The Buildbot
buildbot failure in qemu-kvm on disable_kvm_i386_out_of_tree
The Buildbot has detected a new failure of disable_kvm_i386_out_of_tree on qemu-kvm.
Full details are available at:
 http://buildbot.b1-systems.de/qemu-kvm/builders/disable_kvm_i386_out_of_tree/builds/39

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_2

Build Reason:
Build Source Stamp: [branch master] HEAD
Blamelist: Reimar Döffinger reimar.doeffin...@gmx.de, Amit Shah amit.s...@redhat.com, Andre Przywara andre.przyw...@amd.com, Anthony Liguori aligu...@us.ibm.com, Aurelien Jarno aurel...@aurel32.net, Avi Kivity a...@redhat.com, Blue Swirl blauwir...@gmail.com, Edgar E. Iglesias edgar.igles...@gmail.com, Gerd Hoffmann kra...@redhat.com, Glauber Costa glom...@mothafucka.localdomain, Glauber Costa glom...@redhat.com, Gleb Natapov g...@redhat.com, Jan Kiszka jan.kis...@siemens.com, Jan Kiszka jan.kis...@web.de, Juan Quintela quint...@redhat.com, Kevin Wolf kw...@redhat.com, Kevin Wolf m...@kevin-wolf.de, Laurent Desnogues laurent.desnog...@gmail.com, Luiz Capitulino lcapitul...@redhat.com, Marcelo Tosatti mtosa...@redhat.com, Markus Armbruster arm...@redhat.com, Michael S. Tsirkin m...@redhat.com, Paul Bolle pebo...@tiscali.nl, Stefan Weil w...@mail.berlios.de, Thomas Monjalon thomas...@monjalon.net, malc av1...@comtv.ru

BUILD FAILED: failed compile

sincerely,
 -The Buildbot
buildbot failure in qemu-kvm on disable_kvm_x86_64_debian_5_0
The Buildbot has detected a new failure of disable_kvm_x86_64_debian_5_0 on qemu-kvm.
Full details are available at:
 http://buildbot.b1-systems.de/qemu-kvm/builders/disable_kvm_x86_64_debian_5_0/builds/90

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_1

Build Reason:
Build Source Stamp: [branch master] HEAD
Blamelist: Reimar Döffinger reimar.doeffin...@gmx.de, Amit Shah amit.s...@redhat.com, Andre Przywara andre.przyw...@amd.com, Anthony Liguori aligu...@us.ibm.com, Aurelien Jarno aurel...@aurel32.net, Avi Kivity a...@redhat.com, Blue Swirl blauwir...@gmail.com, Edgar E. Iglesias edgar.igles...@gmail.com, Gerd Hoffmann kra...@redhat.com, Glauber Costa glom...@mothafucka.localdomain, Glauber Costa glom...@redhat.com, Gleb Natapov g...@redhat.com, Jan Kiszka jan.kis...@siemens.com, Jan Kiszka jan.kis...@web.de, Juan Quintela quint...@redhat.com, Kevin Wolf kw...@redhat.com, Kevin Wolf m...@kevin-wolf.de, Laurent Desnogues laurent.desnog...@gmail.com, Luiz Capitulino lcapitul...@redhat.com, Marcelo Tosatti mtosa...@redhat.com, Markus Armbruster arm...@redhat.com, Michael S. Tsirkin m...@redhat.com, Paul Bolle pebo...@tiscali.nl, Stefan Weil w...@mail.berlios.de, Thomas Monjalon thomas...@monjalon.net, malc av1...@comtv.ru

BUILD FAILED: failed compile

sincerely,
 -The Buildbot
buildbot failure in qemu-kvm on default_i386_out_of_tree
The Buildbot has detected a new failure of default_i386_out_of_tree on qemu-kvm.
Full details are available at:
 http://buildbot.b1-systems.de/qemu-kvm/builders/default_i386_out_of_tree/builds/39

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_2

Build Reason:
Build Source Stamp: [branch master] HEAD
Blamelist: Reimar Döffinger reimar.doeffin...@gmx.de, Amit Shah amit.s...@redhat.com, Andre Przywara andre.przyw...@amd.com, Anthony Liguori aligu...@us.ibm.com, Aurelien Jarno aurel...@aurel32.net, Avi Kivity a...@redhat.com, Blue Swirl blauwir...@gmail.com, Edgar E. Iglesias edgar.igles...@gmail.com, Gerd Hoffmann kra...@redhat.com, Glauber Costa glom...@mothafucka.localdomain, Glauber Costa glom...@redhat.com, Gleb Natapov g...@redhat.com, Jan Kiszka jan.kis...@siemens.com, Jan Kiszka jan.kis...@web.de, Juan Quintela quint...@redhat.com, Kevin Wolf kw...@redhat.com, Kevin Wolf m...@kevin-wolf.de, Laurent Desnogues laurent.desnog...@gmail.com, Luiz Capitulino lcapitul...@redhat.com, Marcelo Tosatti mtosa...@redhat.com, Markus Armbruster arm...@redhat.com, Michael S. Tsirkin m...@redhat.com, Paul Bolle pebo...@tiscali.nl, Stefan Weil w...@mail.berlios.de, Thomas Monjalon thomas...@monjalon.net, malc av1...@comtv.ru

BUILD FAILED: failed git

sincerely,
 -The Buildbot
buildbot failure in qemu-kvm on disable_kvm_x86_64_out_of_tree
The Buildbot has detected a new failure of disable_kvm_x86_64_out_of_tree on qemu-kvm.
Full details are available at:
 http://buildbot.b1-systems.de/qemu-kvm/builders/disable_kvm_x86_64_out_of_tree/builds/39

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_1

Build Reason:
Build Source Stamp: [branch master] HEAD
Blamelist: Reimar Döffinger reimar.doeffin...@gmx.de, Amit Shah amit.s...@redhat.com, Andre Przywara andre.przyw...@amd.com, Anthony Liguori aligu...@us.ibm.com, Aurelien Jarno aurel...@aurel32.net, Avi Kivity a...@redhat.com, Blue Swirl blauwir...@gmail.com, Edgar E. Iglesias edgar.igles...@gmail.com, Gerd Hoffmann kra...@redhat.com, Glauber Costa glom...@mothafucka.localdomain, Glauber Costa glom...@redhat.com, Gleb Natapov g...@redhat.com, Jan Kiszka jan.kis...@siemens.com, Jan Kiszka jan.kis...@web.de, Juan Quintela quint...@redhat.com, Kevin Wolf kw...@redhat.com, Kevin Wolf m...@kevin-wolf.de, Laurent Desnogues laurent.desnog...@gmail.com, Luiz Capitulino lcapitul...@redhat.com, Marcelo Tosatti mtosa...@redhat.com, Markus Armbruster arm...@redhat.com, Michael S. Tsirkin m...@redhat.com, Paul Bolle pebo...@tiscali.nl, Stefan Weil w...@mail.berlios.de, Thomas Monjalon thomas...@monjalon.net, malc av1...@comtv.ru

BUILD FAILED: failed compile

sincerely,
 -The Buildbot
Re: Problem booting guest with Linux 2.6.3x
Hi, Alex.

On Tuesday, 06 October 2009 22:40:11 -0600, Alex Williamson wrote:

>> root (hd0,1)
>> Filesystem type is ext2fs, partition type 0x83
>> kernel /boot/vmlinuz-2.6.31.2-dgb root=/dev/hda2 ro quiet console=tty0 console=ttyS0,38400n8
>> [Linux-bzImage, setup=0x3600, size=0x203480]
>> initrd /boot/initrd.img-2.6.31.2-dgb
>> [Linux-initrd @ 0x1f983000, 0x65c455 bytes]
>>
>> Loading, please wait...
>> WARNING bootdevice may be renamed. Try root=/dev/sda2
>
> I think if you boot without the quiet option you'll see that your
> guest IDE disk did in fact get installed as /dev/sda and following the
> advice of the error message above will allow you to boot the guest.

I'm using the quiet option with both the stock kernel and the kernel I
compiled myself.

> You could boot using the uuid of the partition or label the filesystem
> to avoid device naming issues between your original lenny kernel and
> the newer kernel.

I tried changing the non-swap devices to the UUID form. Although the swap
device was not detected in this case, the guest boots without major
problems. I think that using the QEMU_HARDDISK names provided by udevinfo
would also have solved this problem.

However, as far as I could verify, the disks passed with -hdX in KVM-88
are mapped as SATA/SCSI devices in 2.6.31.2 guests. With the stock Linux
2.6.26 they are mapped as IDE disks. Could this be due to some change in
the kernel code related to KVM?

Thanks for your reply.

Regards,
Daniel
--
Fingerprint: BFB3 08D6 B4D1 31B2 72B9 29CE 6696 BF1B 14E6 1D37
Powered by Debian GNU/Linux Squeeze - Linux user #188.598
Re: Problem booting guest with Linux 2.6.3x
Daniel Bareiro wrote:
> Hi, Alex.
>
> On Tuesday, 06 October 2009 22:40:11 -0600, Alex Williamson wrote:
>
> > > root (hd0,1)
> > > Filesystem type is ext2fs, partition type 0x83
> > > kernel /boot/vmlinuz-2.6.31.2-dgb root=/dev/hda2 ro quiet console=tty0 console =ttyS0,38400n8
> > > [Linux-bzImage, setup=0x3600, size=0x203480]
> > > initrd /boot/initrd.img-2.6.31.2-dgb
> > > [Linux-initrd @ 0x1f983000, 0x65c455 bytes]
> > >
> > > Loading, please wait...
> > > WARNING bootdevice may be renamed. Try root=/dev/sda2
> >
> > I think if you boot without the quiet option you'll see that your guest IDE disk did in fact get installed as /dev/sda and following the advice of the error message above will allow you to boot the guest.
>
> I'm using the quiet option with both the stock kernel and the kernel I compiled myself.

It's irrelevant. By using quiet you're hiding the details, that's what it is about -- that's what Alex is saying.

> > You could boot using the uuid of the partition or label the filesystem to avoid device naming issues between your original lenny kernel and the newer kernel.
>
> I tried changing the non-swap devices to the UUID form. Although the swap device was not detected in this case, the guest boots without major problems. I think that using the QEMU_HARDDISK names provided by udevinfo would have solved this problem.
>
> But as far as I can tell, the disks that are passed with -hdX in KVM-88 are mapped in 2.6.31.2 guests as SATA/SCSI devices. With stock Linux 2.6.26 they are mapped as IDE disks. Could this be due to some change in the kernel code related to KVM?

It has nothing to do with kvm. It's different kernel options: all kernels since very early 2.6.x are able to see IDE disks as hdX or sdX, depending on the kernel options and modules loaded. There are 2 drivers for each IDE controller: the IDE/ATA one, which creates hdX, and the PATA one, which creates sdX.
/mjt
[PATCH 1/3] qemu-kvm: fix build with KVM_CAP_SET_GUEST_DEBUG
Fix build with KVM_CAP_SET_GUEST_DEBUG:
use QLIST macro to declare list head.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 qemu-kvm.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/qemu-kvm.h b/qemu-kvm.h
index 4523e25..d6748c7 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -1229,7 +1229,7 @@ typedef struct KVMState {
     int broken_set_mem_region;
     int migration_log;
 #ifdef KVM_CAP_SET_GUEST_DEBUG
-    struct kvm_sw_breakpoint_head kvm_sw_breakpoints;
+    QTAILQ_HEAD(, kvm_sw_breakpoint) kvm_sw_breakpoints;
 #endif
     struct kvm_context kvm_context;
 } KVMState;
--
1.6.5.rc2
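For readers unfamiliar with the idiom in the hunk above: the head of a tail queue can be declared with an empty tag, so no separate `struct ..._head` type has to be visible where the containing struct is defined. The following is only a sketch using the BSD `TAILQ_*` macros from `<sys/queue.h>` (QEMU's `QTAILQ_*` macros follow the same pattern); the names `sw_breakpoint`, `state`, `state_init`, `state_add` and `state_count` are hypothetical and not taken from qemu-kvm.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>   /* BSD-style TAILQ macros */

/* Hypothetical element type standing in for struct kvm_sw_breakpoint. */
struct sw_breakpoint {
    unsigned long pc;
    TAILQ_ENTRY(sw_breakpoint) entry;   /* linkage for the tail queue */
};

/* Hypothetical container standing in for KVMState. */
struct state {
    /* Empty tag: the head type is anonymous and declared in place. */
    TAILQ_HEAD(, sw_breakpoint) breakpoints;
};

static void state_init(struct state *s)
{
    TAILQ_INIT(&s->breakpoints);
}

static void state_add(struct state *s, struct sw_breakpoint *bp)
{
    assert(bp != NULL);
    TAILQ_INSERT_TAIL(&s->breakpoints, bp, entry);
}

/* Walk the queue and count its elements. */
static size_t state_count(struct state *s)
{
    struct sw_breakpoint *bp;
    size_t n = 0;

    TAILQ_FOREACH(bp, &s->breakpoints, entry)
        n++;
    return n;
}
```

The trade-off of an anonymous head is that the head type cannot be named elsewhere (for example in a function prototype), which is acceptable when the list is only reached through the containing struct.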
[PATCH 2/3] qemu-kvm: fix build on 32 bit
Fix build on 32 bit system: cast 64 bit integer to pointer through pointer-sized integer. Without this, I get:

qemu-kvm.c:1557: error: cast to pointer from integer of different size

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 qemu-kvm.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/qemu-kvm.c b/qemu-kvm.c
index a4a90ed..62ca050 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -588,7 +588,7 @@ int kvm_register_phys_mem(kvm_context_t kvm,
     struct kvm_userspace_memory_region memory = {
         .memory_size = len,
         .guest_phys_addr = phys_start,
-        .userspace_addr = (unsigned long) (intptr_t) userspace_addr,
+        .userspace_addr = (unsigned long) (uintptr_t) userspace_addr,
         .flags = log ? KVM_MEM_LOG_DIRTY_PAGES : 0,
     };
     int r;
@@ -1554,7 +1554,8 @@ static void sigbus_handler(int n, struct qemu_signalfd_siginfo *siginfo,
     CPUState *cenv;

     /* Hope we are lucky for AO MCE */
-    if (do_qemu_ram_addr_from_host((void *)siginfo->ssi_addr, &paddr)) {
+    if (do_qemu_ram_addr_from_host((void *)(intptr_t)siginfo->ssi_addr,
+                                   &paddr)) {
         fprintf(stderr, "Hardware memory error for memory used by "
                 "QEMU itself instead of guest system!: %llx\n",
                 (unsigned long long)siginfo->ssi_addr);
--
1.6.5.rc2
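The cast described above can be illustrated in isolation. This is a minimal sketch, not code from qemu-kvm: the helper names `u64_to_ptr` and `ptr_to_u64` are made up for the example. Going through `uintptr_t` first makes the narrowing explicit, which is what silences the "cast to pointer from integer of different size" error on targets where pointers are 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a 64-bit address value to a pointer via a pointer-sized integer. */
static void *u64_to_ptr(uint64_t addr)
{
    /* The value must fit in a pointer, otherwise the narrowing loses bits. */
    assert(addr <= UINTPTR_MAX);
    return (void *)(uintptr_t)addr;
}

/* The reverse direction widens, so it never loses information. */
static uint64_t ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}
```

On a 64-bit host the intermediate cast is a no-op; on a 32-bit host it truncates, which is only safe when the upper bits are known to be zero.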
[PATCH 3/3] qemu-kvm: convert kvm_types to ISO
Convert kvm-types to use ISO C types so that it can be included independently of other headers. This is on top of header patch set I sent previously. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- kvm/include/linux/kvm_types.h | 50 1 files changed, 25 insertions(+), 25 deletions(-) diff --git a/kvm/include/linux/kvm_types.h b/kvm/include/linux/kvm_types.h index c65f89e..5c0b739 100644 --- a/kvm/include/linux/kvm_types.h +++ b/kvm/include/linux/kvm_types.h @@ -70,41 +70,41 @@ * hfn - host frame number */ -typedef unsigned long gva_t; -typedef u64gpa_t; -typedef unsigned long gfn_t; +typedef unsigned long gva_t; +typedef unsigned long long gpa_t; +typedef unsigned long gfn_t; -typedef unsigned long hva_t; -typedef u64hpa_t; -typedef unsigned long hfn_t; +typedef unsigned long hva_t; +typedef unsigned long long hpa_t; +typedef unsigned long hfn_t; typedef hfn_t pfn_t; union kvm_ioapic_redirect_entry { - u64 bits; + unsigned long long bits; struct { - u8 vector; - u8 delivery_mode:3; - u8 dest_mode:1; - u8 delivery_status:1; - u8 polarity:1; - u8 remote_irr:1; - u8 trig_mode:1; - u8 mask:1; - u8 reserve:7; - u8 reserved[4]; - u8 dest_id; + unsigned char vector; + unsigned char delivery_mode:3; + unsigned char dest_mode:1; + unsigned char delivery_status:1; + unsigned char polarity:1; + unsigned char remote_irr:1; + unsigned char trig_mode:1; + unsigned char mask:1; + unsigned char reserve:7; + unsigned char reserved[4]; + unsigned char dest_id; } fields; }; struct kvm_lapic_irq { - u32 vector; - u32 delivery_mode; - u32 dest_mode; - u32 level; - u32 trig_mode; - u32 shorthand; - u32 dest_id; + unsigned vector; + unsigned delivery_mode; + unsigned dest_mode; + unsigned level; + unsigned trig_mode; + unsigned shorthand; + unsigned dest_id; }; #endif /* __KVM_TYPES_H__ */ -- 1.6.5.rc2 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/3] qemu-kvm: build fixes
Misc fixes for build on a 32 bit system Signed-off-by: Michael S. Tsirkin m...@redhat.com Michael S. Tsirkin (3): qemu-kvm: fix build with KVM_CAP_SET_GUEST_DEBUG qemu-kvm: fix build on 32 bit qemu-kvm: convert kvm_types to ISO kvm/include/linux/kvm_types.h | 50 qemu-kvm.c|5 ++- qemu-kvm.h|2 +- 3 files changed, 29 insertions(+), 28 deletions(-) -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Network packet size over MTU in guests
Fabrice Toppi wrote: Hi, We use KVM 75 on Gentoo hosts with kernel 2.6.27.8 with a variety of guests (Windows XP, 2000, 2003 several Linux distributions) in a production setup. The hosts are Intel Xeon based. We use bridged networking with E1000 model inside of guests. Recently, we began to have network problem (slowdowns, lags, timeouts) on several guests. Inside those guests, we saw with tcpdump that there were packets with a size well over MTU. On the host, tcpdump on the guest tap did show packets of 1500 bytes and that's all. For some of the guests (Debian based), changing the NIC model to rtl8139 instead of e1000 fixed the problem, but for others (Gentoo kernel 2.6.27.19) it didn't. Would it be a glitch in the KVM code or a driver problem ? Does this ring a bell to anyone ? I'm not sure anymore but I think it's very similar to what I had here with pre-78 (approx again, since it was quite some time ago) versions of kvm compiled with older kernel headers (without appropriate if_tun.h definitions). I'd say try to reproduce it with 0.11.0 kvm userspace compiled against 2.6.27+ kernel headers. But yet again: kvm-75 is quite old by now, so it's difficult to remember all the details. /mjt -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] qemu-kvm: fix build with KVM_CAP_SET_GUEST_DEBUG
Michael S. Tsirkin wrote: Fix build with KVM_CAP_SET_GUEST_DEBUG: use QLIST macro to declare list head. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- qemu-kvm.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/qemu-kvm.h b/qemu-kvm.h index 4523e25..d6748c7 100644 --- a/qemu-kvm.h +++ b/qemu-kvm.h @@ -1229,7 +1229,7 @@ typedef struct KVMState { int broken_set_mem_region; int migration_log; #ifdef KVM_CAP_SET_GUEST_DEBUG -struct kvm_sw_breakpoint_head kvm_sw_breakpoints; +QTAILQ_HEAD(, kvm_sw_breakpoint) kvm_sw_breakpoints; #endif struct kvm_context kvm_context; } KVMState; If it's required here I bet we need this upstream too, right? Then please also file a corresponding patch for qemu. Thanks, Jan -- Siemens AG, Corporate Technology, CT SE 2 Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] qemu-kvm: fix build with KVM_CAP_SET_GUEST_DEBUG
On Wed, Oct 07, 2009 at 02:16:32PM +0200, Jan Kiszka wrote: Michael S. Tsirkin wrote: Fix build with KVM_CAP_SET_GUEST_DEBUG: use QLIST macro to declare list head. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- qemu-kvm.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/qemu-kvm.h b/qemu-kvm.h index 4523e25..d6748c7 100644 --- a/qemu-kvm.h +++ b/qemu-kvm.h @@ -1229,7 +1229,7 @@ typedef struct KVMState { int broken_set_mem_region; int migration_log; #ifdef KVM_CAP_SET_GUEST_DEBUG -struct kvm_sw_breakpoint_head kvm_sw_breakpoints; +QTAILQ_HEAD(, kvm_sw_breakpoint) kvm_sw_breakpoints; #endif struct kvm_context kvm_context; } KVMState; If it's required here I bet we need this upstream too, right? No, upstream does not have qemu-kvm.h All these files are qemu-kvm only. upstream builds fine as is. Then please also file a corresponding patch for qemu. Thanks, Jan -- Siemens AG, Corporate Technology, CT SE 2 Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] qemu-kvm: fix build with KVM_CAP_SET_GUEST_DEBUG
Michael S. Tsirkin wrote: On Wed, Oct 07, 2009 at 02:16:32PM +0200, Jan Kiszka wrote: Michael S. Tsirkin wrote: Fix build with KVM_CAP_SET_GUEST_DEBUG: use QLIST macro to declare list head. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- qemu-kvm.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/qemu-kvm.h b/qemu-kvm.h index 4523e25..d6748c7 100644 --- a/qemu-kvm.h +++ b/qemu-kvm.h @@ -1229,7 +1229,7 @@ typedef struct KVMState { int broken_set_mem_region; int migration_log; #ifdef KVM_CAP_SET_GUEST_DEBUG -struct kvm_sw_breakpoint_head kvm_sw_breakpoints; +QTAILQ_HEAD(, kvm_sw_breakpoint) kvm_sw_breakpoints; #endif struct kvm_context kvm_context; } KVMState; If it's required here I bet we need this upstream too, right? No, upstream does not have qemu-kvm.h All these files are qemu-kvm only. upstream builds fine as is. Upstream has KVMState, too, but it also has a cleaner header structuring than qemu-kvm (as the latter is morphing towards to former). So yes, this is a qemu-kvm-only workaround. Jan -- Siemens AG, Corporate Technology, CT SE 2 Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: migrate_set_downtime bug
On Wed, Oct 07, 2009 at 06:42:48AM +0200, Dietmar Maurer wrote:

The default downtime is set to 30ms. This value triggers the convergence problem quite often. Maybe a longer default is more reasonable. What do you feel about 100 ms?

What is the reasoning behind such short downtimes? Are there any applications that will fail with longer downtimes (let's say 1s)?

Note: on a 1Gbit/s net you can transfer only 10MB within 100ms

which accounts for more than 2 thousand pages, which sounds like enough for a first pass to me. For the default case, it is hard to imagine an application dirtying more than 2k pages per iteration.
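The arithmetic behind those figures can be checked with a few lines of C. This is only a back-of-the-envelope sketch: it assumes the raw line rate with no protocol overhead and 4 KiB pages, and the function name `pages_per_downtime` is made up for the example. At raw line rate 1 Gbit/s moves 12.5 MB in 100 ms (about 3051 pages) and 3.75 MB in 30 ms (about 915 pages); the 10 MB figure quoted above is a more conservative estimate and gives roughly 2500 pages.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of whole pages that fit through a link of `bits_per_sec`
 * during `downtime_ms` milliseconds, ignoring protocol overhead.
 */
static uint64_t pages_per_downtime(uint64_t bits_per_sec,
                                   uint64_t downtime_ms,
                                   uint64_t page_size)
{
    assert(page_size > 0);

    uint64_t bytes_per_sec = bits_per_sec / 8;
    uint64_t bytes = bytes_per_sec * downtime_ms / 1000;

    return bytes / page_size;
}
```

For example, `pages_per_downtime(1000000000, 100, 4096)` evaluates to 3051 and `pages_per_downtime(1000000000, 30, 4096)` to 915.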
Re: [PATCH v2 2/4] KVM: introduce xinterface API for external interaction with guests
Avi Kivity wrote:
> On 10/06/2009 09:40 PM, Gregory Haskins wrote:
> > Thinking about this some more over lunch, I think we (Avi and I) might both be wrong (and David is right). Avi is right that we don't need rmb() or barrier() for the reasons already stated, but I think David is right that we need an smp_mb() to ensure the cpu doesn't do the reordering. Otherwise a different cpu could invalidate the memory if it reuses the freed memory in the meantime, iiuc. IOW: it's not a compiler issue but a cpu issue.
> >
> > Or am I still confused?
>
> The sequence of operations is:
>
>     v = p->v;
>     f();
>     // rmb() ?
>     g(v);
>
> You are worried that the compiler

No

> or cpu will fetch p->v after f() has executed?

Yes.

> The compiler may not, since it can't tell whether f() might change p->v.

Right, you were correct to say my barrier() suggestion was wrong.

> If f() can cause another agent to write to p (by freeing it to a global list, for example), then it is its responsibility to issue the smp_rmb(), otherwise no calculation that took place before f() and accessed p is safe.

IOW: David is right. You need a cpu-barrier one way or the other. We can either allow ->release() to imply one (and probably document it that way, like we did for slow-work), or we can be explicit. I chose to be explicit since it is kind of self-documenting, and there is no need to be worried about performance since the release is slow-path.

OTOH: If you feel strongly about it, we can take it out, knowing that almost anything that properly invalidates the memory will likely include an implicit barrier of some kind.

Kind Regards,
-Greg
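The ordering being discussed can be sketched in userspace with C11 atomics. This is an analogue only, not the kernel code under review: `atomic_thread_fence(memory_order_seq_cst)` stands in for smp_mb(), the reference-count decrement stands in for the release call f(), and the names `obj` and `read_then_put` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object that may be reused by another thread once released. */
struct obj {
    int v;
    atomic_int refs;
};

/* Read a field, then drop our reference; return the value read. */
static int read_then_put(struct obj *p)
{
    assert(p != NULL);

    int v = p->v;                               /* v = p->v; */

    /*
     * Full fence: the load above is ordered before the decrement below,
     * so it cannot be satisfied after another thread has seen the
     * release and started reusing *p.
     */
    atomic_thread_fence(memory_order_seq_cst);

    /* f(): after this point another thread may reuse the object. */
    atomic_fetch_sub_explicit(&p->refs, 1, memory_order_relaxed);

    return v;                                   /* g(v) uses the local copy */
}
```

The more common C11 spelling folds the ordering into the decrement itself with `memory_order_release`; the explicit fence is used here only because it mirrors the explicit barrier argued for above.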
Re: sync guest calls made async on host - SQLite performance
I now have more information.

Dustin,

The version used was 0.11.0-rc2, from the 2009-09-11 karmic daily build. The VM identifies itself as AMD QEMU Virtual CPU version 0.10.92 stepping 03.

When you indicated that you had attempted to reproduce the problem, what mechanism did you use? Was it Karmic + KVM as the host and Karmic as the guest? What test did you use?

I will re-open the launchpad bug if you believe it makes sense to continue the discussions there.

Anthony,

Please suspend your disbelief for a short while and ask questions to clarify the details. My only interest here is to understand the results presented by the benchmark and determine if there are data integrity risks.

Fundamentally, if there are modes of operation in which applications get a considerable performance boost by running the same OS under KVM, then lots of people will be happy. But realistically, if it is an indication of something wrong, misconfigured or just broken, it bears at least some discussion.

Bear in mind that upstream is relevant for KVM, but distributions shipping KVM may have secondary concerns about patchsets and upstream changes that are relevant for how they support their customers.

Regards,
Matthew

-------- Original Message --------
Subject: Re: sync guest calls made async on host - SQLite performance
From: Anthony Liguori anth...@codemonkey.ws
To: Matthew Tippett tippe...@gmail.com
Cc: Avi Kivity a...@redhat.com, RW k...@tauceti.net, kvm@vger.kernel.org
Date: 09/29/2009 04:51 PM

> Matthew Tippett wrote:
>
> Your confidence is misplaced apparently.
>
> > and I have pieced together the following information. I should be able to get the actual daily build number but broadly it looks like it was
> >
> > Ubuntu 9.10 daily snapshot (~ 9th - 21st September)
> > Linux 2.6.31 (packaged as 2.6.31-10.30 to 2.6.31-10.32)
> > qemu-kvm 0.11 (packaged as 0.11.0~rc2-0ubuntu to 0.11.0~rc2-0ubuntu5)
>
> That's extremely unlikely.
>
> > But, if it turned out to be Ubuntu 9.10, linux 2.6.31, qemu-kvm 0.11 would there be any concerns?
>
> It's not relevant because it's not qemu-kvm-0.11.
[PATCH 02/10] KVM: X86: Add KVM_REQ_VMEXIT to trigger a nested #vmexit
In the vcpu_run path the kernel may notice that a #vmexit is necessary when preemption is already disabled. In the SVM code an emulation of #vmexit may sleep and can't be executed with preemption disabled. This patch begins to solve this problem by defining a KVM_REQ_VMEXIT bit. When this bit is set the vcpu_run loop is restarted and a #vmexit is emulated.

Signed-off-by: Joerg Roedel joerg.roe...@amd.com
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/x86.c              |   17 +++++++++++++++++
 include/linux/kvm_host.h        |    1 +
 3 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 179a919..50e5aa4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -524,6 +524,7 @@ struct kvm_x86_ops {
 	int (*get_tdp_level)(void);
 	u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
 	bool (*gb_page_enable)(void);
+	void (*emulate_vmexit)(struct kvm_vcpu *vcpu);

 	const struct trace_print_flags *exit_reasons_str;
 };
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 11a6f2f..97e1d9d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3610,6 +3610,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			r = 0;
 			goto out;
 		}
+		if (test_and_clear_bit(KVM_REQ_VMEXIT, &vcpu->requests))
+			kvm_x86_ops->emulate_vmexit(vcpu);
 	}

 	preempt_disable();
@@ -3638,6 +3640,21 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	else if (kvm_cpu_has_interrupt(vcpu) || req_int_win)
 		kvm_x86_ops->enable_irq_window(vcpu);

+	/*
+	 * With nested KVM the enable_irq_window() function may cause an
+	 * #vmexit if the vcpu is running in guest mode. A #vmexit may sleep
+	 * and can't be executed at this stage. So we use the request field to
+	 * tell KVM that a #vmexit has to be done before we can enter the guest
+	 * again. The code below checks for this request.
+	 */
+	if (vcpu->requests) {
+		local_irq_enable();
+		preempt_enable();
+		r = 1;
+		goto out;
+	}
+
+
 	if (kvm_lapic_enabled(vcpu)) {
 		update_cr8_intercept(vcpu);
 		kvm_lapic_sync_to_vapic(vcpu);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index b985a29..245463f 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -38,6 +38,7 @@
 #define KVM_REQ_MMU_SYNC           7
 #define KVM_REQ_KVMCLOCK_UPDATE    8
 #define KVM_REQ_KICK               9
+#define KVM_REQ_VMEXIT            10

 #define KVM_USERSPACE_IRQ_SOURCE_ID	0

--
1.6.4.3
[PATCH 08/10] KVM: SVM: Add tracepoint for invlpga instruction
This patch adds a tracepoint for the event that the guest executed the INVLPGA instruction. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |3 +++ arch/x86/kvm/trace.h | 23 +++ arch/x86/kvm/x86.c |1 + 3 files changed, 27 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index a03dba1..563ddd3 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1973,6 +1973,9 @@ static int invlpga_interception(struct vcpu_svm *svm) struct kvm_vcpu *vcpu = svm-vcpu; nsvm_printk(INVLPGA\n); + trace_kvm_invlpga(svm-vmcb-save.rip, vcpu-arch.regs[VCPU_REGS_RCX], + vcpu-arch.regs[VCPU_REGS_RAX]); + /* Let's treat INVLPGA the same as INVLPG (can be optimized!) */ kvm_mmu_invlpg(vcpu, vcpu-arch.regs[VCPU_REGS_RAX]); diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h index 20248a1..a6dcd2d 100644 --- a/arch/x86/kvm/trace.h +++ b/arch/x86/kvm/trace.h @@ -469,6 +469,29 @@ TRACE_EVENT(kvm_nested_intr_vmexit, TP_printk(rip=0x%016llx\n, __entry-rip) ); + +/* + * Tracepoint for nested #vmexit because of interrupt pending + */ +TRACE_EVENT(kvm_invlpga, + TP_PROTO(__u64 rip, int asid, u64 address), + TP_ARGS(rip, asid, address), + + TP_STRUCT__entry( + __field(__u64, rip ) + __field(int,asid) + __field(__u64, address ) + ), + + TP_fast_assign( + __entry-rip= rip; + __entry-asid = asid; + __entry-address= address; + ), + + TP_printk(rip=0x%016llx asid=%d adress=0x%016llx\n, + __entry-rip, __entry-asid, __entry-address) +); #endif /* _TRACE_KVM_H */ /* This part must be outside protection */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e0a8517..18284e1 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5001,3 +5001,4 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmrun); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmexit); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmexit_inject); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_intr_vmexit); +EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_invlpga); -- 1.6.4.3 -- To unsubscribe from this 
list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 09/10] KVM: SVM: Add tracepoint for skinit instruction
This patch adds a tracepoint for the event that the guest executed the SKINIT instruction. This information is important because SKINIT is an SVM extenstion not yet implemented by nested SVM and we may need this information for debugging hypervisors that do not yet run on nested SVM. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c | 10 +- arch/x86/kvm/trace.h | 22 ++ arch/x86/kvm/x86.c |1 + 3 files changed, 32 insertions(+), 1 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 563ddd3..5082558 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1984,6 +1984,14 @@ static int invlpga_interception(struct vcpu_svm *svm) return 1; } +static int skinit_interception(struct vcpu_svm *svm) +{ + trace_kvm_skinit(svm-vmcb-save.rip, svm-vcpu.arch.regs[VCPU_REGS_RAX]); + + kvm_queue_exception(svm-vcpu, UD_VECTOR); + return 1; +} + static int invalid_op_interception(struct vcpu_svm *svm) { kvm_queue_exception(svm-vcpu, UD_VECTOR); @@ -2347,7 +2355,7 @@ static int (*svm_exit_handlers[])(struct vcpu_svm *svm) = { [SVM_EXIT_VMSAVE] = vmsave_interception, [SVM_EXIT_STGI] = stgi_interception, [SVM_EXIT_CLGI] = clgi_interception, - [SVM_EXIT_SKINIT] = invalid_op_interception, + [SVM_EXIT_SKINIT] = skinit_interception, [SVM_EXIT_WBINVD] = emulate_on_interception, [SVM_EXIT_MONITOR] = invalid_op_interception, [SVM_EXIT_MWAIT]= invalid_op_interception, diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h index a6dcd2d..7948b49 100644 --- a/arch/x86/kvm/trace.h +++ b/arch/x86/kvm/trace.h @@ -492,6 +492,28 @@ TRACE_EVENT(kvm_invlpga, TP_printk(rip=0x%016llx asid=%d adress=0x%016llx\n, __entry-rip, __entry-asid, __entry-address) ); + +/* + * Tracepoint for nested #vmexit because of interrupt pending + */ +TRACE_EVENT(kvm_skinit, + TP_PROTO(__u64 rip, __u32 slb), + TP_ARGS(rip, slb), + + TP_STRUCT__entry( + __field(__u64, rip ) + __field(__u32, slb ) + ), + + TP_fast_assign( + __entry-rip= rip; + __entry-slb= slb; + ), + + 
TP_printk(rip=0x%016llx slb=0x%08x\n, + __entry-rip, __entry-slb) +); + #endif /* _TRACE_KVM_H */ /* This part must be outside protection */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 18284e1..8043d76 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5002,3 +5002,4 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmexit); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmexit_inject); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_intr_vmexit); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_invlpga); +EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_skinit); -- 1.6.4.3 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 05/10] KVM: SVM: Add tracepoint for nested #vmexit
This patch adds a tracepoint for every #vmexit we get from a nested guest. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |6 ++ arch/x86/kvm/trace.h | 36 arch/x86/kvm/x86.c |1 + 3 files changed, 43 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 8de84be..e759732 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -2355,6 +2355,12 @@ static int handle_exit(struct kvm_vcpu *vcpu) if (is_nested(svm)) { int vmexit; + trace_kvm_nested_vmexit(svm-vmcb-save.rip, exit_code, + svm-vmcb-control.exit_info_1, + svm-vmcb-control.exit_info_2, + svm-vmcb-control.exit_int_info, + svm-vmcb-control.exit_int_info_err); + nsvm_printk(nested handle_exit: 0x%x | 0x%lx | 0x%lx | 0x%lx\n, exit_code, svm-vmcb-control.exit_info_1, svm-vmcb-control.exit_info_2, svm-vmcb-save.rip); diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h index d63272c..a0b89c3 100644 --- a/arch/x86/kvm/trace.h +++ b/arch/x86/kvm/trace.h @@ -382,6 +382,42 @@ TRACE_EVENT(kvm_nested_vmrun, __entry-npt ? 
on : off) ); +/* + * Tracepoint for #VMEXIT while nested + */ +TRACE_EVENT(kvm_nested_vmexit, + TP_PROTO(__u64 rip, __u32 exit_code, +__u64 exit_info1, __u64 exit_info2, +__u32 exit_int_info, __u32 exit_int_info_err), + TP_ARGS(rip, exit_code, exit_info1, exit_info2, + exit_int_info, exit_int_info_err), + + TP_STRUCT__entry( + __field(__u64, rip ) + __field(__u32, exit_code ) + __field(__u64, exit_info1 ) + __field(__u64, exit_info2 ) + __field(__u32, exit_int_info ) + __field(__u32, exit_int_info_err ) + ), + + TP_fast_assign( + __entry-rip= rip; + __entry-exit_code = exit_code; + __entry-exit_info1 = exit_info1; + __entry-exit_info2 = exit_info2; + __entry-exit_int_info = exit_int_info; + __entry-exit_int_info_err = exit_int_info_err; + ), + TP_printk(rip=0x%016llx reason=%s ext_inf1=0x%016llx + ext_inf2=0x%016llx ext_int=0x%08x ext_int_err=0x%08x\n, + __entry-rip, + ftrace_print_symbols_seq(p, __entry-exit_code, + kvm_x86_ops-exit_reasons_str), + __entry-exit_info1, __entry-exit_info2, + __entry-exit_int_info, __entry-exit_int_info_err) +); + #endif /* _TRACE_KVM_H */ /* This part must be outside protection */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index b51a824..416282e 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4998,3 +4998,4 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_msr); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_cr); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmrun); +EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmexit); -- 1.6.4.3 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 01/10] KVM: SVM: Notify nested hypervisor of lost event injections
From: Alexander Graf ag...@suse.de

If event_inj is valid on a #vmexit the host CPU would write the contents to exit_int_info, so the hypervisor knows that the event wasn't injected. Nested SVM does not do this yet, which is a bug; this patch fixes it.

Signed-off-by: Alexander Graf ag...@suse.de
Signed-off-by: Joerg Roedel joerg.roe...@amd.com
---
 arch/x86/kvm/svm.c |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 279a2ae..b6ce1a9 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1615,6 +1615,22 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
 	nested_vmcb->control.exit_info_2 = vmcb->control.exit_info_2;
 	nested_vmcb->control.exit_int_info = vmcb->control.exit_int_info;
 	nested_vmcb->control.exit_int_info_err = vmcb->control.exit_int_info_err;
+
+	/*
+	 * If we emulate a VMRUN/#VMEXIT in the same host #vmexit cycle we have
+	 * to make sure that we do not lose injected events. So check event_inj
+	 * here and copy it to exit_int_info if it is valid.
+	 * Exit_int_info and event_inj can't be both valid because the case
+	 * below only happens on a VMRUN instruction intercept which has
+	 * no valid exit_int_info set.
+	 */
+	if (vmcb->control.event_inj & SVM_EVTINJ_VALID) {
+		struct vmcb_control_area *nc = &nested_vmcb->control;
+
+		nc->exit_int_info = vmcb->control.event_inj;
+		nc->exit_int_info_err = vmcb->control.event_inj_err;
+	}
+
 	nested_vmcb->control.tlb_ctl = 0;
 	nested_vmcb->control.event_inj = 0;
 	nested_vmcb->control.event_inj_err = 0;
--
1.6.4.3
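The rule this patch implements can be stated as a small standalone function. The sketch below is illustrative only: `struct control_area` is a cut-down, hypothetical stand-in for the VMCB control area, `EVTINJ_VALID` is assumed to mirror SVM_EVTINJ_VALID (bit 31), and `report_lost_injection` is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

#define EVTINJ_VALID (1u << 31)   /* assumed to mirror SVM_EVTINJ_VALID */

/* Hypothetical, reduced stand-in for struct vmcb_control_area. */
struct control_area {
    uint32_t event_inj;
    uint32_t event_inj_err;
    uint32_t exit_int_info;
    uint32_t exit_int_info_err;
};

/*
 * On an emulated #vmexit, hand a still-pending injection back to the
 * nested hypervisor through exit_int_info, then clear the injection
 * fields so the event is not delivered a second time.
 */
static void report_lost_injection(const struct control_area *host,
                                  struct control_area *nested)
{
    assert(host && nested);

    if (host->event_inj & EVTINJ_VALID) {
        nested->exit_int_info     = host->event_inj;
        nested->exit_int_info_err = host->event_inj_err;
    }

    nested->event_inj     = 0;
    nested->event_inj_err = 0;
}
```

When the valid bit is clear, exit_int_info is left untouched, matching the patch, which only overrides the values copied earlier when an injection was actually pending.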
[PATCH 0/10] KVM: Nested SVM fixes and tracepoint conversion
Hi Avi, Marcelo,

this series of patches contains bugfixes for the Nested SVM code and the conversion of Nested SVM debugging to tracepoints. The fixes are:

1) A patch Alex already sent (1/10) but which was not yet applied. It fixes a lost event_inj problem when we emulate a vmrun and a vmexit without entering the guest in the meantime.

2) The patches 2/10 and 3/10 fixing a schedule() while atomic bug in the Nested SVM code. The KVM interrupt injection code runs with preemption and interrupts disabled. But the enable_irq_window() function from SVM may emulate a #vmexit. This emulation might sleep, which causes the schedule() while atomic bug.

These fixes (patches 1 to 3) should also be considered for -stable backporting. The patches 4 to 9 convert the old printk based debugging for Nested SVM to tracepoints. Patch 10 removes the nsvm_printk code. Please review and/or consider applying these changes.

Thanks,
Joerg

diffstat:

 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/svm.c              |   98 +++-
 arch/x86/kvm/trace.h            |  165 +++
 arch/x86/kvm/x86.c              |   23 ++
 include/linux/kvm_host.h        |    1 +
 5 files changed, 252 insertions(+), 36 deletions(-)

shortlog:

Alexander Graf (1):
      KVM: SVM: Notify nested hypervisor of lost event injections

Joerg Roedel (9):
      KVM: X86: Add KVM_REQ_VMEXIT to trigger a nested #vmexit
      KVM: SVM: Move nested INTR #vmexit into preemtible code
      KVM: SVM: Add tracepoint for nested vmrun
      KVM: SVM: Add tracepoint for nested #vmexit
      KVM: SVM: Add tracepoint for injected #vmexit
      KVM: SVM: Add tracepoint for #vmexit because intr pending
      KVM: SVM: Add tracepoint for invlpga instruction
      KVM: SVM: Add tracepoint for skinit instruction
      KVM: SVM: Remove nsvm_printk debugging code
[PATCH 04/10] KVM: SVM: Add tracepoint for nested vmrun
This patch adds a dedicated kvm tracepoint for a nested vmrun. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |6 ++ arch/x86/kvm/trace.h | 33 + arch/x86/kvm/x86.c |1 + 3 files changed, 40 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 7015680..8de84be 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1722,6 +1722,12 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm) /* nested_vmcb is our indicator if nested SVM is activated */ svm-nested.vmcb = svm-vmcb-save.rax; + trace_kvm_nested_vmrun(svm-vmcb-save.rip - 3, svm-nested.vmcb, + nested_vmcb-save.rip, + nested_vmcb-control.int_ctl, + nested_vmcb-control.event_inj, + nested_vmcb-control.nested_ctl); + /* Clear internal status */ kvm_clear_exception_queue(svm-vcpu); kvm_clear_interrupt_queue(svm-vcpu); diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h index 0d480e7..d63272c 100644 --- a/arch/x86/kvm/trace.h +++ b/arch/x86/kvm/trace.h @@ -349,6 +349,39 @@ TRACE_EVENT(kvm_apic_accept_irq, __entry-coalesced ? (coalesced) : ) ); +/* + * Tracepoint for nested VMRUN + */ +TRACE_EVENT(kvm_nested_vmrun, + TP_PROTO(__u64 rip, __u64 vmcb, __u64 nested_rip, __u32 int_ctl, +__u32 event_inj, bool npt), + TP_ARGS(rip, vmcb, nested_rip, int_ctl, event_inj, npt), + + TP_STRUCT__entry( + __field(__u64, rip ) + __field(__u64, vmcb) + __field(__u64, nested_rip ) + __field(__u32, int_ctl ) + __field(__u32, event_inj ) + __field(bool, npt ) + ), + + TP_fast_assign( + __entry-rip= rip; + __entry-vmcb = vmcb; + __entry-nested_rip = nested_rip; + __entry-int_ctl= int_ctl; + __entry-event_inj = event_inj; + __entry-npt= npt; + ), + + TP_printk(rip=0x%016llx vmcb=0x%016llx nrip=0x%016llx int_ctl=0x%08x + event_inj=0x%08x npt=%s\n, + __entry-rip, __entry-vmcb, __entry-nested_rip, + __entry-int_ctl, __entry-event_inj, + __entry-npt ? 
on : off) +); + #endif /* _TRACE_KVM_H */ /* This part must be outside protection */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 97e1d9d..b51a824 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4997,3 +4997,4 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_msr); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_cr); +EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmrun); -- 1.6.4.3
[PATCH 06/10] KVM: SVM: Add tracepoint for injected #vmexit
This patch adds a tracepoint for a nested #vmexit that gets re-injected to the guest. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |6 ++ arch/x86/kvm/trace.h | 33 + arch/x86/kvm/x86.c |1 + 3 files changed, 40 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index e759732..4de75e3 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1588,6 +1588,12 @@ static int nested_svm_vmexit(struct vcpu_svm *svm) struct vmcb *hsave = svm-nested.hsave; struct vmcb *vmcb = svm-vmcb; + trace_kvm_nested_vmexit_inject(vmcb-control.exit_code, + vmcb-control.exit_info_1, + vmcb-control.exit_info_2, + vmcb-control.exit_int_info, + vmcb-control.exit_int_info_err); + nested_vmcb = nested_svm_map(svm, svm-nested.vmcb, KM_USER0); if (!nested_vmcb) return 1; diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h index a0b89c3..a00b235 100644 --- a/arch/x86/kvm/trace.h +++ b/arch/x86/kvm/trace.h @@ -418,6 +418,39 @@ TRACE_EVENT(kvm_nested_vmexit, __entry-exit_int_info, __entry-exit_int_info_err) ); +/* + * Tracepoint for #VMEXIT reinjected to the guest + */ +TRACE_EVENT(kvm_nested_vmexit_inject, + TP_PROTO(__u32 exit_code, +__u64 exit_info1, __u64 exit_info2, +__u32 exit_int_info, __u32 exit_int_info_err), + TP_ARGS(exit_code, exit_info1, exit_info2, + exit_int_info, exit_int_info_err), + + TP_STRUCT__entry( + __field(__u32, exit_code ) + __field(__u64, exit_info1 ) + __field(__u64, exit_info2 ) + __field(__u32, exit_int_info ) + __field(__u32, exit_int_info_err ) + ), + + TP_fast_assign( + __entry-exit_code = exit_code; + __entry-exit_info1 = exit_info1; + __entry-exit_info2 = exit_info2; + __entry-exit_int_info = exit_int_info; + __entry-exit_int_info_err = exit_int_info_err; + ), + + TP_printk(reason=%s ext_inf1=0x%016llx + ext_inf2=0x%016llx ext_int=0x%08x ext_int_err=0x%08x\n, + ftrace_print_symbols_seq(p, __entry-exit_code, + kvm_x86_ops-exit_reasons_str), + __entry-exit_info1, __entry-exit_info2, + 
__entry-exit_int_info, __entry-exit_int_info_err) +); #endif /* _TRACE_KVM_H */ /* This part must be outside protection */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 416282e..c7ea2d8 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4999,3 +4999,4 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_msr); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_cr); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmrun); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmexit); +EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmexit_inject); -- 1.6.4.3
[PATCH 07/10] KVM: SVM: Add tracepoint for #vmexit because intr pending
This patch adds a special tracepoint for the event that a nested #vmexit is injected because kvm wants to inject an interrupt into the guest. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c |1 + arch/x86/kvm/trace.h | 18 ++ arch/x86/kvm/x86.c |1 + 3 files changed, 20 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 4de75e3..a03dba1 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1387,6 +1387,7 @@ static inline int nested_svm_intr(struct vcpu_svm *svm) * the #vmexit here. */ set_bit(KVM_REQ_VMEXIT, svm-vcpu.requests); + trace_kvm_nested_intr_vmexit(svm-vmcb-save.rip); return 1; } diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h index a00b235..20248a1 100644 --- a/arch/x86/kvm/trace.h +++ b/arch/x86/kvm/trace.h @@ -451,6 +451,24 @@ TRACE_EVENT(kvm_nested_vmexit_inject, __entry-exit_info1, __entry-exit_info2, __entry-exit_int_info, __entry-exit_int_info_err) ); + +/* + * Tracepoint for nested #vmexit because of interrupt pending + */ +TRACE_EVENT(kvm_nested_intr_vmexit, + TP_PROTO(__u64 rip), + TP_ARGS(rip), + + TP_STRUCT__entry( + __field(__u64, rip ) + ), + + TP_fast_assign( + __entry-rip= rip + ), + + TP_printk(rip=0x%016llx\n, __entry-rip) +); #endif /* _TRACE_KVM_H */ /* This part must be outside protection */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c7ea2d8..e0a8517 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5000,3 +5000,4 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_cr); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmrun); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmexit); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmexit_inject); +EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_intr_vmexit); -- 1.6.4.3 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 10/10] KVM: SVM: Remove nsvm_printk debugging code
With all important information now delivered through tracepoints we can safely remove the nsvm_printk debugging code for nested svm. Signed-off-by: Joerg Roedel joerg.roe...@amd.com --- arch/x86/kvm/svm.c | 34 -- 1 files changed, 0 insertions(+), 34 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 5082558..acdb8d8 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -53,15 +53,6 @@ MODULE_LICENSE(GPL); #define DEBUGCTL_RESERVED_BITS (~(0x3fULL)) -/* Turn on to get debugging output*/ -/* #define NESTED_DEBUG */ - -#ifdef NESTED_DEBUG -#define nsvm_printk(fmt, args...) printk(KERN_INFO fmt, ## args) -#else -#define nsvm_printk(fmt, args...) do {} while(0) -#endif - static const u32 host_save_user_msrs[] = { #ifdef CONFIG_X86_64 MSR_STAR, MSR_LSTAR, MSR_CSTAR, MSR_SYSCALL_MASK, MSR_KERNEL_GS_BASE, @@ -1537,14 +1528,12 @@ static int nested_svm_exit_handled(struct vcpu_svm *svm) } default: { u64 exit_bits = 1ULL (exit_code - SVM_EXIT_INTR); - nsvm_printk(exit code: 0x%x\n, exit_code); if (svm-nested.intercept exit_bits) vmexit = NESTED_EXIT_DONE; } } if (vmexit == NESTED_EXIT_DONE) { - nsvm_printk(#VMEXIT reason=%04x\n, exit_code); nested_svm_vmexit(svm); } @@ -1655,10 +1644,6 @@ static int nested_svm_vmexit(struct vcpu_svm *svm) /* Restore the original control entries */ copy_vmcb_control_area(vmcb, hsave); - /* Kill any pending exceptions */ - if (svm-vcpu.arch.exception.pending == true) - nsvm_printk(WARNING: Pending Exception\n); - kvm_clear_exception_queue(svm-vcpu); kvm_clear_interrupt_queue(svm-vcpu); @@ -1823,25 +1808,14 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm) force_new_asid(svm-vcpu); svm-vmcb-control.int_ctl = nested_vmcb-control.int_ctl | V_INTR_MASKING_MASK; - if (nested_vmcb-control.int_ctl V_IRQ_MASK) { - nsvm_printk(nSVM Injecting Interrupt: 0x%x\n, - nested_vmcb-control.int_ctl); - } if (nested_vmcb-control.int_ctl V_INTR_MASKING_MASK) svm-vcpu.arch.hflags |= HF_VINTR_MASK; else svm-vcpu.arch.hflags = 
~HF_VINTR_MASK; - nsvm_printk(nSVM exit_int_info: 0x%x | int_state: 0x%x\n, - nested_vmcb-control.exit_int_info, - nested_vmcb-control.int_state); - svm-vmcb-control.int_vector = nested_vmcb-control.int_vector; svm-vmcb-control.int_state = nested_vmcb-control.int_state; svm-vmcb-control.tsc_offset += nested_vmcb-control.tsc_offset; - if (nested_vmcb-control.event_inj SVM_EVTINJ_VALID) - nsvm_printk(Injecting Event: 0x%x\n, - nested_vmcb-control.event_inj); svm-vmcb-control.event_inj = nested_vmcb-control.event_inj; svm-vmcb-control.event_inj_err = nested_vmcb-control.event_inj_err; @@ -1910,8 +1884,6 @@ static int vmsave_interception(struct vcpu_svm *svm) static int vmrun_interception(struct vcpu_svm *svm) { - nsvm_printk(VMrun\n); - if (nested_svm_check_permissions(svm)) return 1; @@ -1971,7 +1943,6 @@ static int clgi_interception(struct vcpu_svm *svm) static int invlpga_interception(struct vcpu_svm *svm) { struct kvm_vcpu *vcpu = svm-vcpu; - nsvm_printk(INVLPGA\n); trace_kvm_invlpga(svm-vmcb-save.rip, vcpu-arch.regs[VCPU_REGS_RCX], vcpu-arch.regs[VCPU_REGS_RAX]); @@ -2379,10 +2350,6 @@ static int handle_exit(struct kvm_vcpu *vcpu) svm-vmcb-control.exit_int_info, svm-vmcb-control.exit_int_info_err); - nsvm_printk(nested handle_exit: 0x%x | 0x%lx | 0x%lx | 0x%lx\n, - exit_code, svm-vmcb-control.exit_info_1, - svm-vmcb-control.exit_info_2, svm-vmcb-save.rip); - vmexit = nested_svm_exit_special(svm); if (vmexit == NESTED_EXIT_CONTINUE) @@ -2529,7 +2496,6 @@ static int svm_interrupt_allowed(struct kvm_vcpu *vcpu) static void enable_irq_window(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); - nsvm_printk(Trying to open IRQ window\n); nested_svm_intr(svm); -- 1.6.4.3 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: sync guest calls made async on host - SQLite performance
(Resending to the list without multi-part).

I now have more information.

Dustin,

The version used was 0.11.0-rc2, from the 2009-09-11 karmic daily build. The VM identifies itself as AMD QEMU Virtual CPU version 0.10.92 stepping 03. When you indicated that you had attempted to reproduce the problem, what mechanism did you use? Was it Karmic + KVM as the host and Karmic as the guest? What test did you use? I will re-open the launchpad bug if you believe it makes sense to continue the discussions there.

Anthony,

Please suspend your disbelief for a short while and ask questions to clarify the details. My only interest here is to understand the results presented by the benchmark and determine if there are data integrity risks.

Fundamentally, if there are modes of operation in which applications get a considerable performance boost by running the same OS under KVM, then lots of people will be happy. But realistically, if it is an indication of something wrong, misconfigured or just broken, it bears at least some discussion.

Bear in mind that upstream is relevant for KVM, but distributions shipping KVM may have secondary concerns about patchsets and upstream changes that are relevant for how they support their customers.

Regards,

Matthew

Original Message
Subject: Re: sync guest calls made async on host - SQLite performance
From: Anthony Liguori anth...@codemonkey.ws
To: Matthew Tippett tippe...@gmail.com
Cc: Avi Kivity a...@redhat.com, RW k...@tauceti.net, kvm@vger.kernel.org
Date: 09/29/2009 04:51 PM

Matthew Tippett wrote: Your confidence is misplaced apparently. and I have pieced together the following information. I should be able to get the actual daily build number but broadly it looks like it was Ubuntu 9.10 daily snapshot (~ 9th - 21st September) Linux 2.6.31 (packaged as 2.6.31-10.30 to 2.6.31-10.32) qemu-kvm 0.11 (packaged as 0.11.0~rc2-0ubuntu to 0.11.0~rc2-0ubuntu5 That's extremely unlikely.
But, if it turned out to be Ubuntu 9.10, linux 2.6.31, qemu-kvm 0.11 would there be any concerns? It's not relevant because it's not qemu-kvm-0.11.
[KVM-AUTOTEST PATCH 1/7] KVM test: migration test: move the bulk of the code to a utility function
Move most of the code to a utility function in kvm_test_utils.py, in order to make it reusable. Signed-off-by: Michael Goldish mgold...@redhat.com --- client/tests/kvm/kvm_test_utils.py | 72 +++ client/tests/kvm/tests/migration.py | 62 +- 2 files changed, 74 insertions(+), 60 deletions(-) diff --git a/client/tests/kvm/kvm_test_utils.py b/client/tests/kvm/kvm_test_utils.py index 601b350..096a056 100644 --- a/client/tests/kvm/kvm_test_utils.py +++ b/client/tests/kvm/kvm_test_utils.py @@ -59,3 +59,75 @@ def wait_for_login(vm, nic_index=0, timeout=240): raise error.TestFail(Could not log into guest '%s' % vm.name) logging.info(Logged in) return session + + +def migrate(vm, env=None): + +Migrate a VM locally and re-register it in the environment. + +@param vm: The VM to migrate. +@param env: The environment dictionary. If omitted, the migrated VM will +not be registered. +@return: The post-migration VM. + +# Helper functions +def mig_finished(): +s, o = vm.send_monitor_cmd(info migrate) +return s == 0 and not Migration status: active in o + +def mig_succeeded(): +s, o = vm.send_monitor_cmd(info migrate) +return s == 0 and Migration status: completed in o + +def mig_failed(): +s, o = vm.send_monitor_cmd(info migrate) +return s == 0 and Migration status: failed in o + +# See if migration is supported +s, o = vm.send_monitor_cmd(help info) +if not info migrate in o: +raise error.TestError(Migration is not supported) + +# Clone the source VM and ask the clone to wait for incoming migration +dest_vm = vm.clone() +dest_vm.create(for_migration=True) + +try: +# Define the migration command +cmd = migrate -d tcp:localhost:%d % dest_vm.migration_port +logging.debug(Migrating with command: %s % cmd) + +# Migrate +s, o = vm.send_monitor_cmd(cmd) +if s: +logging.error(Migration command failed (command: %r, output: %r) + % (cmd, o)) +raise error.TestFail(Migration command failed) + +# Wait for migration to finish +if not kvm_utils.wait_for(mig_finished, 90, 2, 2, + Waiting for 
migration to finish...): +raise error.TestFail(Timeout elapsed while waiting for migration + to finish) + +# Report migration status +if mig_succeeded(): +logging.info(Migration finished successfully) +elif mig_failed(): +raise error.TestFail(Migration failed) +else: +raise error.TestFail(Migration ended with unknown status) + +# Kill the source VM +vm.destroy(gracefully=False) + +# Replace the source VM with the new cloned VM +if env is not None: +kvm_utils.env_register_vm(env, vm.name, dest_vm) + +# Return the new cloned VM +return dest_vm + +except: +dest_vm.destroy() +raise diff --git a/client/tests/kvm/tests/migration.py b/client/tests/kvm/tests/migration.py index 2bbf17b..4b13b5d 100644 --- a/client/tests/kvm/tests/migration.py +++ b/client/tests/kvm/tests/migration.py @@ -21,79 +21,21 @@ def run_migration(test, params, env): vm = kvm_test_utils.get_living_vm(env, params.get(main_vm)) -# See if migration is supported -s, o = vm.send_monitor_cmd(help info) -if not info migrate in o: -raise error.TestError(Migration is not supported) - # Log into guest and get the output of migration_test_command session = kvm_test_utils.wait_for_login(vm) migration_test_command = params.get(migration_test_command) reference_output = session.get_command_output(migration_test_command) session.close() -# Clone the main VM and ask it to wait for incoming migration -dest_vm = vm.clone() -dest_vm.create(for_migration=True) - -try: -# Define the migration command -cmd = migrate -d tcp:localhost:%d % dest_vm.migration_port -logging.debug(Migration command: %s % cmd) - -# Migrate -s, o = vm.send_monitor_cmd(cmd) -if s: -logging.error(Migration command failed (command: %r, output: %r) - % (cmd, o)) -raise error.TestFail(Migration command failed) - -# Define some helper functions -def mig_finished(): -s, o = vm.send_monitor_cmd(info migrate) -return s == 0 and not Migration status: active in o - -def mig_succeeded(): -s, o = vm.send_monitor_cmd(info migrate) -return s == 0 and Migration 
status: completed in o - -def mig_failed(): -s, o = vm.send_monitor_cmd(info migrate) -return s == 0 and Migration status: failed in o - -# Wait for
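The three helper functions in this patch (mig_finished, mig_succeeded, mig_failed) all inspect the same "info migrate" monitor output. A minimal sketch of that classification follows, assuming the output contains a "Migration status: <state>" string as the patch text suggests. The name classify_migration is hypothetical and is not part of kvm_test_utils.py.

```python
def classify_migration(status, output):
    """Classify the result of an 'info migrate' monitor command.

    status: exit status of the monitor command (0 means it succeeded).
    output: text returned by the monitor.
    Returns 'active', 'completed', 'failed' or 'unknown'.
    """
    if status != 0:
        # The monitor command itself failed, so nothing can be concluded.
        return "unknown"
    for state in ("active", "completed", "failed"):
        if "Migration status: %s" % state in output:
            return state
    return "unknown"
```

With a single classifier, "finished" is any result other than 'active', which is roughly what mig_finished() checks in the patch.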
[KVM-AUTOTEST PATCH 2/7] KVM test: timedrift test: move the get_time() helper function to kvm_test_utils.py
Move get_time() to kvm_test_utils.py to make it reusable. Signed-off-by: Michael Goldish mgold...@redhat.com --- client/tests/kvm/kvm_test_utils.py | 27 ++ client/tests/kvm/tests/timedrift.py | 164 -- 2 files changed, 104 insertions(+), 87 deletions(-) diff --git a/client/tests/kvm/kvm_test_utils.py b/client/tests/kvm/kvm_test_utils.py index 096a056..db9f666 100644 --- a/client/tests/kvm/kvm_test_utils.py +++ b/client/tests/kvm/kvm_test_utils.py @@ -131,3 +131,30 @@ def migrate(vm, env=None): except: dest_vm.destroy() raise + + +def get_time(session, time_command, time_filter_re, time_format): + +Return the host time and guest time. If the guest time cannot be fetched +a TestError exception is raised. + +Note that the shell session should be ready to receive commands +(i.e. should display a command prompt and should be done with all +previous commands). + +@param session: A shell session. +@param time_command: Command to issue to get the current guest time. +@param time_filter_re: Regex filter to apply on the output of +time_command in order to get the current time. +@param time_format: Format string to pass to time.strptime() with the +result of the regex filter. +@return: A tuple containing the host time and guest time. + +host_time = time.time() +session.sendline(time_command) +(match, s) = session.read_up_to_prompt() +if not match: +raise error.TestError(Could not get guest time) +s = re.findall(time_filter_re, s)[0] +guest_time = time.mktime(time.strptime(s, time_format)) +return (host_time, guest_time) diff --git a/client/tests/kvm/tests/timedrift.py b/client/tests/kvm/tests/timedrift.py index fe0653e..146fa12 100644 --- a/client/tests/kvm/tests/timedrift.py +++ b/client/tests/kvm/tests/timedrift.py @@ -52,25 +52,6 @@ def run_timedrift(test, params, env): for tid, mask in prev_masks.items(): commands.getoutput(taskset -p %s %s % (mask, tid)) -def get_time(session, time_command, time_filter_re, time_format): - -Returns the host time and guest time. 
- -@param session: A shell session. -@param time_command: Command to issue to get the current guest time. -@param time_filter_re: Regex filter to apply on the output of -time_command in order to get the current time. -@param time_format: Format string to pass to time.strptime() with the -result of the regex filter. -@return: A tuple containing the host time and guest time. - -host_time = time.time() -session.sendline(time_command) -(match, s) = session.read_up_to_prompt() -s = re.findall(time_filter_re, s)[0] -guest_time = time.mktime(time.strptime(s, time_format)) -return (host_time, guest_time) - vm = kvm_test_utils.get_living_vm(env, params.get(main_vm)) session = kvm_test_utils.wait_for_login(vm) @@ -97,84 +78,93 @@ def run_timedrift(test, params, env): guest_load_sessions = [] host_load_sessions = [] -# Set the VM's CPU affinity -prev_affinity = set_cpu_affinity(vm.get_pid(), cpu_mask) - try: -# Get time before load -(host_time_0, guest_time_0) = get_time(session, time_command, - time_filter_re, time_format) - -# Run some load on the guest -logging.info(Starting load on guest...) -for i in range(guest_load_instances): -load_session = vm.remote_login() -if not load_session: -raise error.TestFail(Could not log into guest) -load_session.set_output_prefix((guest load %d) % i) -load_session.set_output_func(logging.debug) -load_session.sendline(guest_load_command) -guest_load_sessions.append(load_session) - -# Run some load on the host -logging.info(Starting load on host...) -for i in range(host_load_instances): -host_load_sessions.append( -kvm_subprocess.run_bg(host_load_command, - output_func=logging.debug, - output_prefix=(host load %d) % i, - timeout=0.5)) -# Set the CPU affinity of the load process -pid = host_load_sessions[-1].get_pid() -set_cpu_affinity(pid, cpu_mask) - -# Sleep for a while (during load) -logging.info(Sleeping for %s seconds... 
% load_duration) -time.sleep(load_duration) - -# Get time delta after load -(host_time_1, guest_time_1) = get_time(session, time_command, - time_filter_re, time_format) - -# Report results -host_delta = host_time_1 - host_time_0 -guest_delta =
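The guest-time half of get_time() boils down to three steps: filter the command output with a regex, parse the match with time.strptime(), and convert it with time.mktime(). A minimal sketch of those steps, without the shell session, is given here. The function name parse_guest_time and the example regex and format in the usage note are assumptions for illustration, not values taken from the test configuration.

```python
import re
import time


def parse_guest_time(output, time_filter_re, time_format):
    """Extract a timestamp from command output and return seconds since epoch.

    output: raw text printed by the guest's time command.
    time_filter_re: regex whose first match is the time string.
    time_format: format string understood by time.strptime().
    """
    matches = re.findall(time_filter_re, output)
    if not matches:
        # The archived helper indexes [0] directly; a guard avoids IndexError.
        raise ValueError("Could not find a time string in guest output")
    return time.mktime(time.strptime(matches[0], time_format))
```

For example, with the (assumed) pattern r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}" and the format "%Y-%m-%d %H:%M:%S", two readings taken one minute apart differ by 60 seconds.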
[KVM-AUTOTEST PATCH 4/7] KVM test: move the reboot code to kvm_test_utils.py
Move the reboot code from the boot test (tests/boot.py) to kvm_test_utils.py in order to make it reusable. Signed-off-by: Michael Goldish mgold...@redhat.com --- client/tests/kvm/kvm_test_utils.py | 44 client/tests/kvm/tests/boot.py | 33 +- 2 files changed, 51 insertions(+), 26 deletions(-) diff --git a/client/tests/kvm/kvm_test_utils.py b/client/tests/kvm/kvm_test_utils.py index db9f666..4d7b1c3 100644 --- a/client/tests/kvm/kvm_test_utils.py +++ b/client/tests/kvm/kvm_test_utils.py @@ -61,6 +61,50 @@ def wait_for_login(vm, nic_index=0, timeout=240): return session +def reboot(vm, session, method=shell, sleep_before_reset=10, nic_index=0, + timeout=240): + +Reboot the VM and wait for it to come back up by trying to log in until +timeout expires. + +@param vm: VM object. +@param session: A shell session object. +@param method: Reboot method. Can be shell (send a shell reboot +command) or system_reset (send a system_reset monitor command). +@param nic_index: Index of NIC to access in the VM, when logging in after +rebooting. +@param timeout: Time to wait before giving up (after rebooting). +@return: A new shell session object. + +if method == shell: +# Send a reboot command to the guest's shell +session.sendline(vm.get_params().get(reboot_command)) +logging.info(Reboot command sent; waiting for guest to go down...) +elif method == system_reset: +# Sleep for a while before sending the command +time.sleep(sleep_before_reset) +# Send a system_reset monitor command +vm.send_monitor_cmd(system_reset) +logging.info(system_reset monitor command sent; waiting for guest to + go down...) +else: +logging.error(Unknown reboot method: %s % method) + +# Wait for the session to become unresponsive and close it +if not kvm_utils.wait_for(lambda: not session.is_responsive(), 120, 0, 1): +raise error.TestFail(Guest refuses to go down) +session.close() + +# Try logging into the guest until timeout expires +logging.info(Guest is down; waiting for it to go up again...) 
+session = kvm_utils.wait_for(lambda: vm.remote_login(nic_index=nic_index), + timeout, 0, 2) +if not session: +raise error.TestFail(Could not log into guest after reboot) +logging.info(Guest is up again) +return session + + def migrate(vm, env=None): Migrate a VM locally and re-register it in the environment. diff --git a/client/tests/kvm/tests/boot.py b/client/tests/kvm/tests/boot.py index 282efda..cd1f1d4 100644 --- a/client/tests/kvm/tests/boot.py +++ b/client/tests/kvm/tests/boot.py @@ -19,33 +19,14 @@ def run_boot(test, params, env): session = kvm_test_utils.wait_for_login(vm) try: -if params.get(reboot_method) == shell: -# Send a reboot command to the guest's shell -session.sendline(vm.get_params().get(reboot_command)) -logging.info(Reboot command sent; waiting for guest to go - down...) -elif params.get(reboot_method) == system_reset: -# Sleep for a while -- give the guest a chance to finish booting -time.sleep(float(params.get(sleep_before_reset, 10))) -# Send a system_reset monitor command -vm.send_monitor_cmd(system_reset) -logging.info(system_reset monitor command sent; waiting for - guest to go down...) -else: return +if not params.get(reboot_method): +return -# Wait for the session to become unresponsive -if not kvm_utils.wait_for(lambda: not session.is_responsive(), - 120, 0, 1): -raise error.TestFail(Guest refuses to go down) +# Reboot the VM +session = kvm_test_utils.reboot(vm, session, +params.get(reboot_method), +float(params.get(sleep_before_reset, + 10))) finally: session.close() - -logging.info(Guest is down; waiting for it to go up again...) - -session = kvm_utils.wait_for(vm.remote_login, 240, 0, 2) -if not session: -raise error.TestFail(Could not log into guest after reboot) -session.close() - -logging.info(Guest is up again) -- 1.5.4.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
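Both waiting steps in reboot() rely on a polling helper that is called as wait_for(func, timeout, first, step). The sketch below shows one way such a helper can work. It is modelled only on how the patch calls kvm_utils.wait_for; the real implementation is not shown in this thread and may differ.

```python
import time


def wait_for(func, timeout, first=0.0, step=1.0):
    """Poll func() until it returns a true value or timeout seconds elapse.

    first: seconds to sleep before the first call.
    step: seconds to sleep between calls.
    Returns the first true value from func(), or None on timeout.
    """
    end_time = time.monotonic() + timeout
    time.sleep(first)
    while time.monotonic() < end_time:
        result = func()
        if result:
            return result
        time.sleep(step)
    return None
```

Returning the value of func() rather than a boolean is what lets the caller write session = wait_for(lambda: vm.remote_login(...), ...) and then test "if not session".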
[KVM-AUTOTEST PATCH 6/7] KVM test: add option to kill all unresponsive VMs at the end of each test
This is useful for tests that may leave VMs in a bad state but can't afford to use kill_vm_on_error = yes. For example, timedrift.with_reboot can fail because the reboot failed or because the time drift was too large. In the latter case there's no reason to kill the VM. Signed-off-by: Michael Goldish mgold...@redhat.com --- client/tests/kvm/kvm_preprocessing.py | 12 client/tests/kvm/kvm_tests.cfg.sample |1 + 2 files changed, 13 insertions(+), 0 deletions(-) diff --git a/client/tests/kvm/kvm_preprocessing.py b/client/tests/kvm/kvm_preprocessing.py index 26f7f8e..e624a42 100644 --- a/client/tests/kvm/kvm_preprocessing.py +++ b/client/tests/kvm/kvm_preprocessing.py @@ -293,6 +293,18 @@ def postprocess(test, params, env): int(params.get(post_command_timeout, 600)), params.get(post_command_noncritical) == yes) +# Kill all unresponsive VMs +if params.get(kill_unresponsive_vms) == yes: +logging.debug('kill_unresponsive_vms' specified; killing all VMs + that fail to respond to a remote login request...) +for vm in kvm_utils.env_get_all_vms(env): +if vm.is_alive(): +session = vm.remote_login() +if session: +session.close() +else: +vm.destroy(gracefully=False) + # Kill the tailing threads of all VMs for vm in kvm_utils.env_get_all_vms(env): vm.kill_tail_thread() diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample index e80b645..c4d8a60 100644 --- a/client/tests/kvm/kvm_tests.cfg.sample +++ b/client/tests/kvm/kvm_tests.cfg.sample @@ -13,6 +13,7 @@ convert_ppm_files_to_png_on_error = yes #keep_ppm_files_on_error = yes kill_vm = no kill_vm_gracefully = yes +kill_unresponsive_vms = yes # Some default VM params qemu_binary = qemu -- 1.5.4.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
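The postprocessing step added by this patch can be tried out without a real VM. The sketch below repeats the same loop against stand-in objects. FakeVM and FakeSession are hypothetical test doubles, and kill_unresponsive_vms is an illustrative name rather than a function from kvm_preprocessing.py.

```python
class FakeSession:
    """Stand-in for a remote shell session (hypothetical)."""

    def close(self):
        pass


class FakeVM:
    """Stand-in for the test framework's VM object (hypothetical)."""

    def __init__(self, name, alive=True, responsive=True):
        self.name = name
        self.alive = alive
        self.responsive = responsive

    def is_alive(self):
        return self.alive

    def remote_login(self):
        # A responsive guest yields a session, an unresponsive one yields None.
        return FakeSession() if self.responsive else None

    def destroy(self, gracefully=True):
        self.alive = False


def kill_unresponsive_vms(vms):
    """Destroy every living VM that does not answer a login request."""
    killed = []
    for vm in vms:
        if not vm.is_alive():
            continue
        session = vm.remote_login()
        if session:
            session.close()
        else:
            vm.destroy(gracefully=False)
            killed.append(vm.name)
    return killed
```

A VM that is alive and answers the login is left untouched, which matches the commit message: a test that failed only because the drift was too large does not lose its VM.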
[KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration
This patch adds a new test that checks the timedrift introduced by migrations. It uses the same parameters used by the timedrift test to get the guest time. In addition, the number of migrations the test performs is controlled by the parameter 'migration_iterations'. Signed-off-by: Michael Goldish mgold...@redhat.com --- client/tests/kvm/kvm_tests.cfg.sample | 33 --- client/tests/kvm/tests/timedrift_with_migration.py | 95 2 files changed, 115 insertions(+), 13 deletions(-) create mode 100644 client/tests/kvm/tests/timedrift_with_migration.py diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample index 540d0a2..618c21e 100644 --- a/client/tests/kvm/kvm_tests.cfg.sample +++ b/client/tests/kvm/kvm_tests.cfg.sample @@ -100,19 +100,26 @@ variants: type = linux_s3 - timedrift:install setup -type = timedrift extra_params += -rtc-td-hack -# Pin the VM and host load to CPU #0 -cpu_mask = 0x1 -# Set the load and rest durations -load_duration = 20 -rest_duration = 20 -# Fail if the drift after load is higher than 50% -drift_threshold = 50 -# Fail if the drift after the rest period is higher than 10% -drift_threshold_after_rest = 10 -# For now, make sure this test is executed alone -used_cpus = 100 +variants: +- with_load: +type = timedrift +# Pin the VM and host load to CPU #0 +cpu_mask = 0x1 +# Set the load and rest durations +load_duration = 20 +rest_duration = 20 +# Fail if the drift after load is higher than 50% +drift_threshold = 50 +# Fail if the drift after the rest period is higher than 10% +drift_threshold_after_rest = 10 +# For now, make sure this test is executed alone +used_cpus = 100 +- with_migration: +type = timedrift_with_migration +migration_iterations = 3 +drift_threshold = 10 +drift_threshold_single = 3 - stress_boot: install setup type = stress_boot @@ -581,7 +588,7 @@ variants: extra_params += -smp 2 used_cpus = 2 stress_boot: used_cpus = 10 -timedrift: used_cpus = 100 +timedrift.with_load: used_cpus = 100 variants: 
diff --git a/client/tests/kvm/tests/timedrift_with_migration.py b/client/tests/kvm/tests/timedrift_with_migration.py new file mode 100644 index 000..139b663 --- /dev/null +++ b/client/tests/kvm/tests/timedrift_with_migration.py @@ -0,0 +1,95 @@ +import logging, time, commands, re +from autotest_lib.client.common_lib import error +import kvm_subprocess, kvm_test_utils, kvm_utils + + +def run_timedrift_with_migration(test, params, env): + +Time drift test with migration: + +1) Log into a guest. +2) Take a time reading from the guest and host. +3) Migrate the guest. +4) Take a second time reading. +5) If the drift (in seconds) is higher than a user specified value, fail. + +@param test: KVM test object. +@param params: Dictionary with test parameters. +@param env: Dictionary with the test environment. + +vm = kvm_test_utils.get_living_vm(env, params.get(main_vm)) +session = kvm_test_utils.wait_for_login(vm) + +# Collect test parameters: +# Command to run to get the current time +time_command = params.get(time_command) +# Filter which should match a string to be passed to time.strptime() +time_filter_re = params.get(time_filter_re) +# Time format for time.strptime() +time_format = params.get(time_format) +drift_threshold = float(params.get(drift_threshold, 10)) +drift_threshold_single = float(params.get(drift_threshold_single, 3)) +migration_iterations = int(params.get(migration_iterations, 1)) + +try: +# Get initial time +# (ht stands for host time, gt stands for guest time) +(ht0, gt0) = kvm_test_utils.get_time(session, time_command, + time_filter_re, time_format) + +# Migrate +for i in range(migration_iterations): +# Get time before current iteration +(ht0_, gt0_) = kvm_test_utils.get_time(session, time_command, + time_filter_re, time_format) +session.close() +# Run current iteration +logging.info(Migrating: iteration %d of %d... % + (i + 1, migration_iterations)) +vm = kvm_test_utils.migrate(vm, env) +# Log in +logging.info(Logging in after migration...) 
+session = vm.remote_login() +if not session: +raise error.TestFail(Could not log in
[KVM-AUTOTEST PATCH 7/7] KVM test: kvm_preprocessing.py: fix indentation and logging messages in postprocess_vm
Signed-off-by: Michael Goldish mgold...@redhat.com --- client/tests/kvm/kvm_preprocessing.py |9 ++--- 1 files changed, 6 insertions(+), 3 deletions(-) diff --git a/client/tests/kvm/kvm_preprocessing.py b/client/tests/kvm/kvm_preprocessing.py index e624a42..5bae2bd 100644 --- a/client/tests/kvm/kvm_preprocessing.py +++ b/client/tests/kvm/kvm_preprocessing.py @@ -116,9 +116,12 @@ def postprocess_vm(test, params, env, name): vm.send_monitor_cmd(screendump %s % scrdump_filename) if params.get(kill_vm) == yes: -if not kvm_utils.wait_for(vm.is_dead, -float(params.get(kill_vm_timeout, 0)), 0.0, 1.0, -Waiting for VM to kill itself...): +kill_vm_timeout = float(params.get(kill_vm_timeout, 0)) +if kill_vm_timeout: +logging.debug('kill_vm' specified; waiting for VM to shut down + before killing it...) +kvm_utils.wait_for(vm.is_dead, kill_vm_timeout, 0, 1) +else: logging.debug('kill_vm' specified; killing VM...) vm.destroy(gracefully = params.get(kill_vm_gracefully) == yes) -- 1.5.4.1
[KVM-AUTOTEST PATCH 5/7] KVM test: new test timedrift_with_reboot
Checks the time drift introduced by several reboots (default 1).

Signed-off-by: Michael Goldish mgold...@redhat.com
---
 client/tests/kvm/kvm_tests.cfg.sample           |    5 ++
 client/tests/kvm/tests/timedrift_with_reboot.py |   88 +++
 2 files changed, 93 insertions(+), 0 deletions(-)
 create mode 100644 client/tests/kvm/tests/timedrift_with_reboot.py

diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample
index 618c21e..e80b645 100644
--- a/client/tests/kvm/kvm_tests.cfg.sample
+++ b/client/tests/kvm/kvm_tests.cfg.sample
@@ -120,6 +120,11 @@ variants:
                 migration_iterations = 3
                 drift_threshold = 10
                 drift_threshold_single = 3
+            - with_reboot:
+                type = timedrift_with_reboot
+                reboot_iterations = 1
+                drift_threshold = 10
+                drift_threshold_single = 3

     - stress_boot:  install setup
         type = stress_boot
diff --git a/client/tests/kvm/tests/timedrift_with_reboot.py b/client/tests/kvm/tests/timedrift_with_reboot.py
new file mode 100644
index 0000000..642daaf
--- /dev/null
+++ b/client/tests/kvm/tests/timedrift_with_reboot.py
@@ -0,0 +1,88 @@
+import logging, time, commands, re
+from autotest_lib.client.common_lib import error
+import kvm_subprocess, kvm_test_utils, kvm_utils
+
+
+def run_timedrift_with_reboot(test, params, env):
+    """
+    Time drift test with reboot:
+
+    1) Log into a guest.
+    2) Take a time reading from the guest and host.
+    3) Reboot the guest.
+    4) Take a second time reading.
+    5) If the drift (in seconds) is higher than a user specified value, fail.
+
+    @param test: KVM test object.
+    @param params: Dictionary with test parameters.
+    @param env: Dictionary with the test environment.
+    """
+    vm = kvm_test_utils.get_living_vm(env, params.get("main_vm"))
+    session = kvm_test_utils.wait_for_login(vm)
+
+    # Collect test parameters:
+    # Command to run to get the current time
+    time_command = params.get("time_command")
+    # Filter which should match a string to be passed to time.strptime()
+    time_filter_re = params.get("time_filter_re")
+    # Time format for time.strptime()
+    time_format = params.get("time_format")
+    drift_threshold = float(params.get("drift_threshold", 10))
+    drift_threshold_single = float(params.get("drift_threshold_single", 3))
+    reboot_iterations = int(params.get("reboot_iterations", 1))
+
+    try:
+        # Get initial time
+        # (ht stands for host time, gt stands for guest time)
+        (ht0, gt0) = kvm_test_utils.get_time(session, time_command,
+                                             time_filter_re, time_format)
+
+        # Reboot
+        for i in range(reboot_iterations):
+            # Get time before current iteration
+            (ht0_, gt0_) = kvm_test_utils.get_time(session, time_command,
+                                                   time_filter_re, time_format)
+            # Run current iteration
+            logging.info("Rebooting: iteration %d of %d..." %
+                         (i + 1, reboot_iterations))
+            session = kvm_test_utils.reboot(vm, session)
+            # Get time after current iteration
+            (ht1_, gt1_) = kvm_test_utils.get_time(session, time_command,
+                                                   time_filter_re, time_format)
+            # Report iteration results
+            host_delta = ht1_ - ht0_
+            guest_delta = gt1_ - gt0_
+            drift = abs(host_delta - guest_delta)
+            logging.info("Host duration (iteration %d): %.2f" %
+                         (i + 1, host_delta))
+            logging.info("Guest duration (iteration %d): %.2f" %
+                         (i + 1, guest_delta))
+            logging.info("Drift at iteration %d: %.2f seconds" %
+                         (i + 1, drift))
+            # Fail if necessary
+            if drift > drift_threshold_single:
+                raise error.TestFail("Time drift too large at iteration %d: "
+                                     "%.2f seconds" % (i + 1, drift))
+
+        # Get final time
+        (ht1, gt1) = kvm_test_utils.get_time(session, time_command,
+                                             time_filter_re, time_format)
+
+    finally:
+        session.close()
+
+    # Report results
+    host_delta = ht1 - ht0
+    guest_delta = gt1 - gt0
+    drift = abs(host_delta - guest_delta)
+    logging.info("Host duration (%d reboots): %.2f" %
+                 (reboot_iterations, host_delta))
+    logging.info("Guest duration (%d reboots): %.2f" %
+                 (reboot_iterations, guest_delta))
+    logging.info("Drift after %d reboots: %.2f seconds" %
+                 (reboot_iterations, drift))
+
+    # Fail if necessary
+    if drift > drift_threshold:
+        raise error.TestFail("Time drift too large after %d reboots: "
+                             "%.2f seconds" % (reboot_iterations, drift))
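The pass/fail rule in this test is small enough to try outside autotest. The following is a minimal sketch, not part of the patch: the function names and the list-of-readings layout are assumptions made for illustration, and only the arithmetic (drift is the absolute difference between host and guest elapsed time, checked against drift_threshold_single per iteration and drift_threshold overall) is taken from the code above. As a simplification it reuses one reading as both the end of one iteration and the start of the next, whereas the test takes separate readings.

```python
def drift(before, after):
    """Drift in seconds between two (host_time, guest_time) readings."""
    host_delta = after[0] - before[0]
    guest_delta = after[1] - before[1]
    return abs(host_delta - guest_delta)


def check_readings(readings, threshold_single=3.0, threshold=10.0):
    """Return failure messages for a list of (host_time, guest_time) readings.

    readings[0] is taken before the first reboot; each later entry is taken
    after one more reboot. An empty list means every check passed.
    """
    failures = []
    for i in range(1, len(readings)):
        d = drift(readings[i - 1], readings[i])
        if d > threshold_single:
            failures.append("iteration %d: %.2f seconds" % (i, d))
    total = drift(readings[0], readings[-1])
    if total > threshold:
        failures.append("total: %.2f seconds" % total)
    return failures
```

Because the overall drift is computed from the first and last readings only, a guest that runs fast across one reboot and slow across the next can show a small total, which is presumably why the per-iteration threshold exists alongside it.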
KVM crash on kvm-88 + kernel 2.6.30.6
Hi

I had a KVM crash today with WinXP Pro SP3 32bit on kvm-88 + kernel 2.6.30.6 (modules from kernel).

[ cut here ]
WARNING: at arch/x86/kvm/x86.c:204 kvm_queue_exception_e+0x6b/0x80 [kvm]()
Hardware name: Latitude D520
Modules linked in: ipv6 i915 drm i2c_algo_bit rfcomm sco bridge stp llc bnep l2cap btusb bluetooth fan usbhid hid fuse snd_seq_dummy tun snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device kvm_intel kvm snd_pcm_oss joydev snd_hda_codec_idt snd_mixer_oss ohci1394 ieee1394 snd_hda_intel arc4 snd_hda_codec snd_hwdep ecb snd_pcm yenta_socket rsrc_nonstatic dell_laptop snd_timer snd soundcore snd_page_alloc uhci_hcd psmouse video thermal output ac iTCO_wdt iTCO_vendor_support wmi processor button iwl3945 ehci_hcd battery usbcore dcdbas i2c_i801 i2c_core intel_agp sg iwlcore rfkill mac80211 led_class agpgart serio_raw evdev cfg80211 slhc b44 ssb pcmcia pcmcia_core mii rtc_cmos rtc_core rtc_lib ext3 jbd mbcache aes_i586 aes_generic xts gf128mul dm_crypt dm_mod sd_mod sr_mod cdrom ata_piix ata_generic pata_acpi libata scsi_mod
Pid: 4238, comm: qemu-kvm Not tainted 2.6.30-ARCH #1
Call Trace:
 [c013b26a] ? warn_slowpath_common+0x7a/0xc0
 [f8a3451b] ? kvm_queue_exception_e+0x6b/0x80 [kvm]
 [c013b2d0] ? warn_slowpath_null+0x20/0x40
 [f8a3451b] ? kvm_queue_exception_e+0x6b/0x80 [kvm]
 [f8a35927] ? kvm_task_switch+0x277/0xc60 [kvm]
 [f8a3d842] ? mmu_sync_children+0x182/0x340 [kvm]
 [f8a42015] ? x86_emulate_insn+0x1895/0x3cb0 [kvm]
 [f8a3ff96] ? x86_decode_insn+0x686/0xd50 [kvm]
 [f8a654be] ? vmx_fpu_deactivate+0x4e/0x70 [kvm_intel]
 [f8a66b61] ? handle_task_switch+0x31/0xc0 [kvm_intel]
 [f8a67161] ? kvm_handle_exit+0xf1/0x200 [kvm_intel]
 [c012a5a4] ? kunmap_atomic+0x84/0xb0
 [c012a55c] ? kunmap_atomic+0x3c/0xb0
 [f8a34dcd] ? kvm_arch_vcpu_ioctl_run+0x34d/0xaf0 [kvm]
 [f8a2ab5e] ? kvm_vcpu_ioctl+0x45e/0x7f0 [kvm]
 [c0166dba] ? futex_wake+0xfa/0x110
 [f8a2a700] ? kvm_vcpu_ioctl+0x0/0x7f0 [kvm]
 [c01dfcb2] ? vfs_ioctl+0x22/0xa0
 [c01dfdb9] ? do_vfs_ioctl+0x89/0x5a0
 [c0141bca] ? irq_exit+0x3a/0x90
 [c01688fa] ? sys_futex+0xca/0x160
 [c01e035e] ? sys_ioctl+0x8e/0xb0
 [c0103cb3] ? sysenter_do_call+0x12/0x28
---[ end trace 82bf054de8a6387e ]---

Misc info:
--
The Blue Screen Of Death gave the following error:
*** STOP: 0x008E (0xC01D, 0xF7A48D3E, 0xF7C5ECC0, 0x)
*** processr.sys - Address F7A48D3E base at F7A47000, DateStamp 48025181
--
Startup command:
qemu-kvm -usbdevice tablet -net nic,macaddr=52:55:00:00:00:01,model=rtl8139 -net tap,ifname=tap0 -vnc :1 -smp 2 -m 1024 -cdrom /home/kenni/images/NETKVM-20081229.iso -hda /data/virtualization/winxp.img -boot c -localtime -daemonize -monitor unix:/var/run/kvm/01_winxp.socket,server,nowait
--
$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz
stepping        : 6
cpu MHz         : 1995.143
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm tpr_shadow
bogomips        : 3991.54
clflush size    : 64
power management:
snip
--
$ uname -a
Linux D520 2.6.30-ARCH #1 SMP PREEMPT Wed Sep 9 12:37:32 UTC 2009 i686 Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz GenuineIntel GNU/Linux
--

I don't suppose that this is enough information for you to identify the issue? As I haven't had this crash before, I don't know if I'll see it again... can I do something to collect better debugging information in the future? Perhaps enable kernel memory dump in Windows or something else?

Best Regards
Kenni Lund
Re: sync guest calls made async on host - SQLite performance
On Wed, Oct 7, 2009 at 11:53 AM, Matthew Tippett tippe...@gmail.com wrote:
When you indicated that you had attempted to reproduce the problem, what mechanism did you use? Was it Karmic + KVM as the host and Karmic as the guest? What test did you use?

I ran the following in several places:

a) on the system running on real hardware,
   time dd if=/dev/zero of=$HOME/foo bs=1M count=500
   524288000 bytes (524 MB) copied, 9.72614 s, 53.9 MB/s

b) in a vm running on qemu-kvm-0.11 on Karmic
   time dd if=/dev/zero of=$HOME/foo bs=1M count=500 oflag=direct
   524288000 bytes (524 MB) copied, 31.6961 s, 16.5 MB/s

c) in a vm running on kvm-84 on Jaunty
   time dd if=/dev/zero of=$HOME/foo bs=1M count=500 oflag=direct
   524288000 bytes (524 MB) copied, 22.2169 s, 23.6 MB/s

I was looking at the time it takes to write a 500MB file to a real hard disk, and then inside of the VM. If I were to experience the problem on Karmic, I would have seen this dd of a 500MB file take far, far less time than it takes to write that file to disk on the real hardware. This was not the case in my testing.

I will re-open the launchpad bug if you believe it makes sense to continue the discussions there.

Please re-open the bug if you can describe a real test case that you used to demonstrate the problem. Without being rude, it's hard for me to work from a bug that says "a magazine article says that there's a bug in the Ubuntu distribution of qemu-kvm-0.11". If you can provide clear steps that you have used to experience the problem, then I will be able to take this issue seriously, reproduce it myself, and develop a fix.

:-Dustin
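The MB/s figures dd printed follow directly from the byte count and the elapsed time, which makes the three runs easy to compare. A minimal sketch, assuming decimal megabytes (10^6 bytes, which is what dd's "MB" means); the helper name and the dictionary labels are made up for this example, while the numbers are the ones quoted above.

```python
def throughput_mb_s(num_bytes, seconds):
    """Throughput in decimal megabytes per second, as dd reports it."""
    return num_bytes / seconds / 1e6


# (bytes copied, elapsed seconds) for the three runs quoted in the mail.
runs = {
    "real hardware": (524288000, 9.72614),
    "qemu-kvm-0.11 on Karmic": (524288000, 31.6961),
    "kvm-84 on Jaunty": (524288000, 22.2169),
}

for name, (num_bytes, seconds) in runs.items():
    print("%-26s %5.1f MB/s" % (name, throughput_mb_s(num_bytes, seconds)))
```

Both guests are slower than bare metal here, which is the point being made: a write that was silently cached on the host would be expected to finish faster than the physical disk allows, not slower.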
kvm: test: timer testcase
Test timer interrupts (HPET, LAPIC, PIT) in correlation with ACPI/TSC counters. New tests/variations are easy to add.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: qemu-kvm/kvm/user/config-x86-common.mak
===================================================================
--- qemu-kvm.orig/kvm/user/config-x86-common.mak
+++ qemu-kvm/kvm/user/config-x86-common.mak
@@ -58,6 +58,9 @@ $(TEST_DIR)/tsc.flat: $(cstart.o) $(TEST
 $(TEST_DIR)/apic.flat: $(cstart.o) $(TEST_DIR)/apic.o $(TEST_DIR)/vm.o \
                        $(TEST_DIR)/print.o

+$(TEST_DIR)/time.flat: $(cstart.o) $(TEST_DIR)/time.o $(TEST_DIR)/vm.o \
+                       $(TEST_DIR)/print.o
+
 $(TEST_DIR)/realmode.flat: $(TEST_DIR)/realmode.o
 	$(CC) -m32 -nostdlib -o $@ -Wl,-T,$(TEST_DIR)/realmode.lds $^

Index: qemu-kvm/kvm/user/config-x86_64.mak
===================================================================
--- qemu-kvm.orig/kvm/user/config-x86_64.mak
+++ qemu-kvm/kvm/user/config-x86_64.mak
@@ -7,6 +7,7 @@ CFLAGS += -D__x86_64__
 tests = $(TEST_DIR)/access.flat $(TEST_DIR)/sieve.flat \
       $(TEST_DIR)/simple.flat $(TEST_DIR)/stringio.flat \
       $(TEST_DIR)/memtest1.flat $(TEST_DIR)/emulator.flat \
-      $(TEST_DIR)/hypercall.flat $(TEST_DIR)/apic.flat
+      $(TEST_DIR)/hypercall.flat $(TEST_DIR)/apic.flat \
+      $(TEST_DIR)/time.flat

 include config-x86-common.mak
Index: qemu-kvm/kvm/user/test/x86/io.h
===================================================================
--- /dev/null
+++ qemu-kvm/kvm/user/test/x86/io.h
@@ -0,0 +1,35 @@
+static inline void outb(unsigned char val, unsigned short port)
+{
+    asm volatile("outb %0, %w1": : "a"(val), "Nd" (port));
+}
+
+static inline void outw(unsigned short val, unsigned short port)
+{
+    asm volatile("outw %0, %w1": : "a"(val), "Nd" (port));
+}
+
+static inline void outl(unsigned long val, unsigned short port)
+{
+    asm volatile("outl %0, %w1": : "a"(val), "Nd" (port));
+}
+
+static inline unsigned char inb(unsigned short port)
+{
+    unsigned char val;
+    asm volatile("inb %w1, %0": "=a"(val) : "Nd" (port));
+    return val;
+}
+
+static inline short inw(unsigned short port)
+{
+    short val;
+    asm volatile("inw %w1, %0": "=a"(val) : "Nd" (port));
+    return val;
+}
+
+static inline unsigned int inl(unsigned short port)
+{
+    unsigned int val;
+    asm volatile("inl %w1, %0": "=a"(val) : "Nd" (port));
+    return val;
+}
Index: qemu-kvm/kvm/user/test/x86/time.c
===================================================================
--- /dev/null
+++ qemu-kvm/kvm/user/test/x86/time.c
@@ -0,0 +1,1010 @@
+#include "libcflat.h"
+#include "apic.h"
+#include "vm.h"
+#include "io.h"
+
+typedef unsigned char u8;
+typedef unsigned short u16;
+typedef unsigned u32;
+typedef unsigned long ulong;
+typedef unsigned long long u64;
+
+#ifndef ARRAY_SIZE
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+#endif
+
+typedef struct {
+    unsigned short offset0;
+    unsigned short selector;
+    unsigned short ist : 3;
+    unsigned short : 5;
+    unsigned short type : 4;
+    unsigned short : 1;
+    unsigned short dpl : 2;
+    unsigned short p : 1;
+    unsigned short offset1;
+#ifdef __x86_64__
+    unsigned offset2;
+    unsigned reserved;
+#endif
+} idt_entry_t;
+
+typedef struct {
+    ulong rflags;
+    ulong cs;
+    ulong rip;
+    ulong func;
+    ulong regs[sizeof(ulong)*2];
+} isr_regs_t;
+
+#ifdef __x86_64__
+#  define R "r"
+#else
+#  define R "e"
+#endif
+
+extern char isr_entry_point[];
+
+asm (
+    "isr_entry_point: \n"
+#ifdef __x86_64__
+    "push %r15 \n\t"
+    "push %r14 \n\t"
+    "push %r13 \n\t"
+    "push %r12 \n\t"
+    "push %r11 \n\t"
+    "push %r10 \n\t"
+    "push %r9 \n\t"
+    "push %r8 \n\t"
+#endif
+    "push %"R "di \n\t"
+    "push %"R "si \n\t"
+    "push %"R "bp \n\t"
+    "push %"R "sp \n\t"
+    "push %"R "bx \n\t"
+    "push %"R "dx \n\t"
+    "push %"R "cx \n\t"
+    "push %"R "ax \n\t"
+#ifdef __x86_64__
+    "mov %rsp, %rdi \n\t"
+    "callq *8*16(%rsp) \n\t"
+#else
+    "push %esp \n\t"
+    "calll *4+4*8(%esp) \n\t"
+    "add $4, %esp \n\t"
+#endif
+    "pop %"R "ax \n\t"
+    "pop %"R "cx \n\t"
+    "pop %"R "dx \n\t"
+    "pop %"R "bx \n\t"
+    "pop %"R "bp \n\t"
+    "pop %"R "bp \n\t"
+    "pop %"R "si \n\t"
+    "pop %"R "di \n\t"
+#ifdef __x86_64__
+    "pop %r8 \n\t"
+    "pop %r9 \n\t"
+    "pop %r10 \n\t"
+    "pop %r11 \n\t"
+    "pop %r12 \n\t"
+    "pop %r13 \n\t"
+    "pop %r14 \n\t"
+    "pop %r15 \n\t"
+#endif
+#ifdef __x86_64__
+    "add $8, %rsp \n\t"
+    "iretq \n\t"
+#else
+    "add $4, %esp \n\t"
+    "iretl \n\t"
+#endif
+    );
+
+static idt_entry_t idt[256];
+
+static int g_fail;
+static int g_tests;
+
+static void report(const char *msg, int pass)
+{
+    ++g_tests;
+    printf("%s: %s\n", msg, (pass ? "PASS" : "FAIL"));
+    if (!pass)
+        ++g_fail;
+}
+
+static u16 read_cs(void)
+{
+    u16 v;
+
+    asm("mov %%cs, %0" : "=rm"(v));
+    return v;
+}
+
+static void init_idt(void)
+{
+    struct {
+        u16 limit;
+        ulong idt;
+    } __attribute__((packed)) idt_ptr = {
+        sizeof(idt_entry_t)
Re: sync guest calls made async on host - SQLite performance
The benchmark used was the sqlite subtest in the phoronix test suite. My awareness and involvement go beyond reading a magazine article; I can elaborate if needed, but I don't believe it is necessary.

Process for reproduction, assuming Karmic:

# apt-get install phoronix-test-suite
$ phoronix-test-suite benchmark sqlite

Answer the questions (test names, etc.); it will download sqlite, build it and execute the test. By default the test runs three times and averages the results. The results experienced should be similar to the values identified at

http://www.phoronix.com/scan.php?page=article&item=linux_2631_kvm&num=3

which are approximately 12 minutes for the native run and about 60 seconds for the guest. Given that the performance under the guest is expected to be around 60 seconds, I would suggest confirming performance there first.

Regards,
Matthew

Original Message
Subject: Re: sync guest calls made async on host - SQLite performance
From: Dustin Kirkland dustin.kirkl...@gmail.com
To: Matthew Tippett tippe...@gmail.com
Cc: Anthony Liguori anth...@codemonkey.ws, Avi Kivity a...@redhat.com, RW k...@tauceti.net, kvm@vger.kernel.org
Date: 10/07/2009 02:59 PM

On Wed, Oct 7, 2009 at 11:53 AM, Matthew Tippett tippe...@gmail.com wrote:
When you indicated that you had attempted to reproduce the problem, what mechanism did you use? Was it Karmic + KVM as the host and Karmic as the guest? What test did you use?

I ran the following in several places:

a) on the system running on real hardware,
   time dd if=/dev/zero of=$HOME/foo bs=1M count=500
   524288000 bytes (524 MB) copied, 9.72614 s, 53.9 MB/s

b) in a vm running on qemu-kvm-0.11 on Karmic
   time dd if=/dev/zero of=$HOME/foo bs=1M count=500 oflag=direct
   524288000 bytes (524 MB) copied, 31.6961 s, 16.5 MB/s

c) in a vm running on kvm-84 on Jaunty
   time dd if=/dev/zero of=$HOME/foo bs=1M count=500 oflag=direct
   524288000 bytes (524 MB) copied, 22.2169 s, 23.6 MB/s

I was looking at the time it takes to write a 500MB file to a real hard disk, and then inside of the VM. If I were to experience the problem on Karmic, I would have seen this dd of a 500MB file take far, far less time than it takes to write that file to disk on the real hardware. This was not the case in my testing.

I will re-open the launchpad bug if you believe it makes sense to continue the discussions there.

Please re-open the bug if you can describe a real test case that you used to demonstrate the problem. Without being rude, it's hard for me to work from a bug that says "a magazine article says that there's a bug in the Ubuntu distribution of qemu-kvm-0.11". If you can provide clear steps that you have used to experience the problem, then I will be able to take this issue seriously, reproduce it myself, and develop a fix.

:-Dustin
RE: migrate_set_downtime bug
What is the reasoning behind such short downtimes? Are there any applications that will fail with longer downtimes (let's say 1s)?

Note: on a 1Gbit/s net you can transfer only 10MB within 100ms, which accounts for more than 2 thousand pages, which sounds like enough for a first pass to me.

For the default case, it is hard to imagine an application dirtying more than 2k pages per iteration.

Simply encode or decode an MPEG video (or play a video).
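The page count in this exchange is simple arithmetic on link speed, allowed downtime and page size. A rough sketch, assuming 4096-byte pages and ignoring protocol overhead; the function name is made up for this example. At raw line rate 1 Gbit/s moves 12.5 MB in 100 ms, so the 10MB figure quoted above presumably already allows for overhead.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (typical for x86)


def pages_per_downtime(bandwidth_bits_per_s, downtime_ms, page_size=PAGE_SIZE):
    """Whole pages that fit through the link during the allowed downtime."""
    # bits/s * ms / 8000 = bytes sent during the downtime window
    bytes_sent = bandwidth_bits_per_s * downtime_ms // 8000
    return bytes_sent // page_size


print(pages_per_downtime(10**9, 100))  # raw 1 Gbit/s link rate, 100 ms
print(10 * 10**6 // PAGE_SIZE)         # the 10MB figure from the mail
```

Either way the budget is a few thousand pages, so whether a given downtime setting converges depends on whether the guest dirties pages faster than that during the final pass.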
Re: [Qemu-commits] [COMMIT c0b1905] qemu/pci: reset device registers on bus reset
On Wed, Oct 07, 2009 at 10:20:19AM +0200, Avi Kivity wrote:
On 10/05/2009 04:53 PM, Anthony Liguori wrote:
From: Michael S. Tsirkin m...@redhat.com

Reset BARs and a couple of other registers on bus reset, as per PCI spec.

This commit breaks Windows XP restart. After a restart Windows switches from 800x600 cirrus logic vga to 640x480 standard vga.

My guess is that this is due to two mutually-cancelling bugs:

Could you please tell me how to reproduce the problem exactly? I tried several configurations but in all of them windows seems to keep 800x600 across restarts, and shows display adapter as cirrus vga.

--
MST
Re: [Qemu-commits] [COMMIT c0b1905] qemu/pci: reset device registers on bus reset
On 10/07/2009 09:53 PM, Michael S. Tsirkin wrote:
On Wed, Oct 07, 2009 at 10:20:19AM +0200, Avi Kivity wrote:
On 10/05/2009 04:53 PM, Anthony Liguori wrote:
From: Michael S. Tsirkin m...@redhat.com

Reset BARs and a couple of other registers on bus reset, as per PCI spec.

This commit breaks Windows XP restart. After a restart Windows switches from 800x600 cirrus logic vga to 640x480 standard vga.

My guess is that this is due to two mutually-cancelling bugs:

Could you please tell me how to reproduce the problem exactly? I tried several configurations but in all of them windows seems to keep 800x600 across restarts, and shows display adapter as cirrus vga.

Clean Windows XP install (-smp 2), boot, reboot. After restart it also loses S3 (so Standby is gr[ae]yed out in the shutdown menu).

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [Qemu-commits] [COMMIT c0b1905] qemu/pci: reset device registers on bus reset
On Wed, Oct 07, 2009 at 10:01:19PM +0200, Avi Kivity wrote:
On 10/07/2009 09:53 PM, Michael S. Tsirkin wrote:
On Wed, Oct 07, 2009 at 10:20:19AM +0200, Avi Kivity wrote:
On 10/05/2009 04:53 PM, Anthony Liguori wrote:
From: Michael S. Tsirkin m...@redhat.com

Reset BARs and a couple of other registers on bus reset, as per PCI spec.

This commit breaks Windows XP restart. After a restart Windows switches from 800x600 cirrus logic vga to 640x480 standard vga.

My guess is that this is due to two mutually-cancelling bugs:

Could you please tell me how to reproduce the problem exactly? I tried several configurations but in all of them windows seems to keep 800x600 across restarts, and shows display adapter as cirrus vga.

Clean Windows XP install (-smp 2), boot, reboot. After restart it also loses S3 (so Standby is gr[ae]yed out in the shutdown menu).

OK, see it now, finally. I'll work on it.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [Qemu-commits] [COMMIT c0b1905] qemu/pci: reset device registers on bus reset
On 10/07/2009 10:07 PM, Michael S. Tsirkin wrote:
Clean Windows XP install (-smp 2), boot, reboot. After restart it also loses S3 (so Standby is gr[ae]yed out in the shutdown menu).

OK, see it now, finally. I'll work on it.

What changed? -smp 2?

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [Qemu-commits] [COMMIT c0b1905] qemu/pci: reset device registers on bus reset
On Wed, Oct 07, 2009 at 10:12:59PM +0200, Avi Kivity wrote:
On 10/07/2009 10:07 PM, Michael S. Tsirkin wrote:
Clean Windows XP install (-smp 2), boot, reboot. After restart it also loses S3 (so Standby is gr[ae]yed out in the shutdown menu).

OK, see it now, finally. I'll work on it.

What changed? -smp 2?

/me stopped being silly.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [PATCH 03/10] KVM: SVM: Move nested INTR #vmexit into preemtible code
On Wed, Oct 07, 2009 at 04:31:21PM +0200, Joerg Roedel wrote:
This patch makes use of the KVM_REQ_VMEXIT to move the emulation of #vmexit(INTR) out of non-preemptible code.

Signed-off-by: Joerg Roedel joerg.roe...@amd.com
---
 arch/x86/kvm/svm.c |   18 --
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index b6ce1a9..7015680 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1379,8 +1379,14 @@ static inline int nested_svm_intr(struct vcpu_svm *svm)

 	svm->vmcb->control.exit_code = SVM_EXIT_INTR;

-	if (nested_svm_exit_handled(svm)) {
-		nsvm_printk("VMexit -> INTR\n");
+	if (svm->nested.intercept & 1ULL) {
+		/*
+		 * The #vmexit can't be emulated here directly because this
+		 * code path runs with irqs and preemtion disabled and a
+		 * #vmexit emulation might sleep. Only set the request bit for
+		 * the #vmexit here.
+		 */
+		set_bit(KVM_REQ_VMEXIT, &svm->vcpu.requests);
 		return 1;
 	}

What if you keep this internal to SVM? Proceed to svm_vcpu_run and return, do the emulation on the exit handler. Then there's no need for the request bit (VMX does that, see vmx_vcpu_run).
Re: sync guest calls made async on host - SQLite performance
Original Message
Subject: Re: sync guest calls made async on host - SQLite performance
From: Avi Kivity a...@redhat.com
To: Matthew Tippett tippe...@gmail.com
Cc: Dustin Kirkland dustin.kirkl...@gmail.com, Anthony Liguori anth...@codemonkey.ws, RW k...@tauceti.net, kvm@vger.kernel.org
Date: 10/07/2009 04:12 PM

What is the data set for this benchmark? If it's much larger than guest RAM, but smaller than host RAM, you could be seeing the effects of read caching.

2GB for the host, 1.7GB accessible for the guest (although it is highly unlikely that the memory usage went very high at all).

Another possibility is barriers and flushing.

That is what I am expecting. Remember that the host and the guest were the same OS, same config, nothing special. So the variable in the mix is how Ubuntu Karmic interacts with the bare metal vs the qemu-kvm virtual metal.

The test itself is simply 12500 sequential inserts, designed to model a simple high-transactional load single-tier system. I still have some investigations pending on how sqlite responds at the syscall level, but I believe it is requesting synchronous writes and then doing many writes.

The consequence of the structure of the benchmark is that if there is any caching occurring at all from the sqlite library down, then it tends to show. And I believe that it is unexpectedly showing here (since the writes are expected to be synchronous to a physical disk).

If there is a clear rationale that the KVM community is comfortable with, then it becomes a distribution or deployment issue relative to data integrity where a synchronous write within a guest may not be synchronous to a physical disk. I assume this would concern commercial and server users of virtual machines.

Regards,
Matthew
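The workload described above (many small inserts, each committed separately with synchronous writes requested) can be approximated without the full test suite. This is a minimal sketch and not the Phoronix sqlite test: the table layout, function name and row contents are assumptions for illustration, and the only parts taken from the mail are the pattern of sequential inserts and the expectation that each commit reaches the disk. It uses Python's standard sqlite3 module and SQLite's PRAGMA synchronous setting.

```python
import os
import sqlite3
import tempfile
import time


def timed_inserts(db_path, count, synchronous="FULL"):
    """Insert count rows, one transaction per row, and return (seconds, rows).

    With synchronous=FULL, SQLite asks the OS to flush to stable storage on
    each commit, so elapsed time is dominated by how long a flush takes.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("PRAGMA synchronous = %s" % synchronous)
        conn.execute("CREATE TABLE IF NOT EXISTS t (id INTEGER PRIMARY KEY, v TEXT)")
        conn.commit()
        start = time.time()
        for i in range(count):
            conn.execute("INSERT INTO t (v) VALUES (?)", ("row %d" % i,))
            conn.commit()  # one commit, and so one flush request, per insert
        elapsed = time.time() - start
        rows = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    finally:
        conn.close()
    return elapsed, rows


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "bench.db")
    seconds, rows = timed_inserts(path, 12500)
    print("%d inserts in %.2f s" % (rows, seconds))
```

Running the same script on bare metal and inside a guest gives a direct comparison: if the guest finishes far faster than a disk could plausibly flush that many commits, some layer between SQLite and the platter is probably not honouring the flush.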
[PATCH qemu-kvm] Enable UFO on virtio-net and tap devices
Linux 2.6.32 includes UDP fragmentation offload support in software. So we can enable UFO on the host tap device if supported and allow setting UFO on virtio-net in the guest. This improves UDP stream performance significantly between guest to host and inter-guest.

TUN_F_UFO is a new #define added to the 2.6.32 kernel header file include/linux/if_tun.h. Until this updated header file gets into distro releases, I think we need to have this defined in qemu.

Signed-off-by: Sridhar Samudrala s...@us.ibm.com

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index ce8e6cb..c73487d 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -150,7 +150,8 @@ static uint32_t virtio_net_get_features(VirtIODevice *vdev)
         features |= (1 << VIRTIO_NET_F_HOST_TSO6);
         features |= (1 << VIRTIO_NET_F_HOST_ECN);
         features |= (1 << VIRTIO_NET_F_MRG_RXBUF);
-        /* Kernel can't actually handle UFO in software currently. */
+        features |= (1 << VIRTIO_NET_F_GUEST_UFO);
+        features |= (1 << VIRTIO_NET_F_HOST_UFO);
     }
 #endif

@@ -189,7 +190,8 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint32_t features)
                       (features >> VIRTIO_NET_F_GUEST_CSUM) & 1,
                       (features >> VIRTIO_NET_F_GUEST_TSO4) & 1,
                       (features >> VIRTIO_NET_F_GUEST_TSO6) & 1,
-                      (features >> VIRTIO_NET_F_GUEST_ECN) & 1);
+                      (features >> VIRTIO_NET_F_GUEST_ECN) & 1,
+                      (features >> VIRTIO_NET_F_GUEST_UFO) & 1);
 #endif
 }

diff --git a/net.c b/net.c
index 8032ff8..1942e25 100644
--- a/net.c
+++ b/net.c
@@ -1528,8 +1528,13 @@ static int tap_probe_vnet_hdr(int fd)
 }

 #ifdef TUNSETOFFLOAD
+
+#ifndef TUN_F_UFO
+#define TUN_F_UFO 0x10
+#endif
+
 static void tap_set_offload(VLANClientState *vc, int csum, int tso4, int tso6,
-                            int ecn)
+                            int ecn, int ufo)
 {
     TAPState *s = vc->opaque;
     unsigned int offload = 0;
@@ -1542,11 +1547,18 @@ static void tap_set_offload(VLANClientState *vc, int csum, int tso4, int tso6,
             offload |= TUN_F_TSO6;
         if ((tso4 || tso6) && ecn)
             offload |= TUN_F_TSO_ECN;
+        if (ufo)
+            offload |= TUN_F_UFO;
     }

-    if (ioctl(s->fd, TUNSETOFFLOAD, offload) != 0)
-        fprintf(stderr, "TUNSETOFFLOAD ioctl() failed: %s\n",
-                strerror(errno));
+    if (ioctl(s->fd, TUNSETOFFLOAD, offload) != 0) {
+        /* Try without UFO */
+        offload &= ~TUN_F_UFO;
+        if (ioctl(s->fd, TUNSETOFFLOAD, offload) != 0) {
+            fprintf(stderr, "TUNSETOFFLOAD ioctl() failed: %s\n",
+                    strerror(errno));
+        }
+    }
 }
 #endif /* TUNSETOFFLOAD */

@@ -1583,7 +1595,7 @@ static TAPState *net_tap_fd_init(VLANState *vlan,
     s->vc->receive_raw = tap_receive_raw;
 #ifdef TUNSETOFFLOAD
     s->vc->set_offload = tap_set_offload;
-    tap_set_offload(s->vc, 0, 0, 0, 0);
+    tap_set_offload(s->vc, 0, 0, 0, 0, 0);
 #endif
     tap_read_poll(s, 1);
     snprintf(s->vc->info_str, sizeof(s->vc->info_str), "fd=%d", fd);
diff --git a/net.h b/net.h
index 925c67c..ac3701c 100644
--- a/net.h
+++ b/net.h
@@ -14,7 +14,7 @@ typedef ssize_t (NetReceive)(VLANClientState *, const uint8_t *, size_t);
 typedef ssize_t (NetReceiveIOV)(VLANClientState *, const struct iovec *, int);
 typedef void (NetCleanup) (VLANClientState *);
 typedef void (LinkStatusChanged)(VLANClientState *);
-typedef void (SetOffload)(VLANClientState *, int, int, int, int);
+typedef void (SetOffload)(VLANClientState *, int, int, int, int, int);

 struct VLANClientState {
     NetReceive *receive;
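The retry logic in tap_set_offload() is a feature-probe pattern: request everything, and if the kernel refuses, drop the newest flag and ask again. A minimal sketch of that pattern, assuming the flag values from linux/if_tun.h; the apply callable is a hypothetical stand-in for ioctl(fd, TUNSETOFFLOAD, offload) that returns True on success, and both function names are made up for this example.

```python
# Offload flag values as defined in linux/if_tun.h.
TUN_F_CSUM = 0x01
TUN_F_TSO4 = 0x02
TUN_F_TSO6 = 0x04
TUN_F_TSO_ECN = 0x08
TUN_F_UFO = 0x10


def build_offload(csum, tso4, tso6, ecn, ufo):
    """Combine the requested features the way the patched function does."""
    offload = 0
    if csum:  # every other offload depends on checksum offload
        offload |= TUN_F_CSUM
        if tso4:
            offload |= TUN_F_TSO4
        if tso6:
            offload |= TUN_F_TSO6
        if (tso4 or tso6) and ecn:
            offload |= TUN_F_TSO_ECN
        if ufo:
            offload |= TUN_F_UFO
    return offload


def set_offload(apply, offload):
    """Try the full flag set, then retry without UFO for older kernels.

    apply is a hypothetical callable standing in for the TUNSETOFFLOAD ioctl.
    Returns the flags that were accepted, or None if both attempts failed.
    """
    if apply(offload):
        return offload
    offload &= ~TUN_F_UFO  # try without UFO
    if apply(offload):
        return offload
    return None
```

For instance, a pre-2.6.32 host could be modelled as `lambda flags: not (flags & TUN_F_UFO)`; set_offload then settles on the flag set without UFO instead of losing the other offloads.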
Re: kvm guest: hrtimer: interrupt too slow
(Adding Thomas in Cc)

On Sat, Oct 03, 2009 at 08:12:05PM -0300, Marcelo Tosatti wrote:
Michael, can you please give the patch below a try? (without acpi_pm timer or priority adjustments for the guest).

On Tue, Sep 29, 2009 at 05:12:17PM +0400, Michael Tokarev wrote:
Hello. I'm having quite an.. unusable system here. It's not really a regression with 0.11.0, it was something similar before, but with 0.11.0 and/or 2.6.31 it became much worse. The thing is that after some uptime, kvm guest prints something like this:

hrtimer: interrupt too slow, forcing clock min delta to 461487495 ns

after which system (guest) speed becomes very slow. The above message is from 2.6.31 guest running with 0.11.0 & 2.6.31 host. Before I tried it with 0.10.6 and 2.6.30 or 2.6.27, and the deltas were a bit less than that:

hrtimer: interrupt too slow, forcing clock min delta to 15415 ns
hrtimer: interrupt too slow, forcing clock min delta to 93629025 ns

It seems the way hrtimer_interrupt_hanging calculates min_delta is wrong (especially for virtual machines). The guest vcpu can be scheduled out during the execution of the hrtimer callbacks (and the callbacks themselves can do operations that translate to blocking operations in the hypervisor). So high min_delta values can be calculated if, for example, a single hrtimer_interrupt run takes two host time slices to execute, while some other higher priority task runs for N slices in between.

Using the hrtimer_interrupt execution time (which can be the worst case at any given time) as the min_delta is problematic. So simply increase min_delta_ns by 50% once every detected failure, which will eventually lead to an acceptable threshold (the algorithm should scale back down to a lower min_delta, to adjust back to wealthier times, too).

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 49da79a..8997978 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1234,28 +1234,20 @@ static void __run_hrtimer(struct hrtimer *timer)

 #ifdef CONFIG_HIGH_RES_TIMERS

-static int force_clock_reprogram;
-
 /*
  * After 5 iteration's attempts, we consider that hrtimer_interrupt()
  * is hanging, which could happen with something that slows the interrupt
- * such as the tracing. Then we force the clock reprogramming for each future
- * hrtimer interrupts to avoid infinite loops and use the min_delta_ns
- * threshold that we will overwrite.
- * The next tick event will be scheduled to 3 times we currently spend on
- * hrtimer_interrupt(). This gives a good compromise, the cpus will spend
- * 1/4 of their time to process the hrtimer interrupts. This is enough to
- * let it running without serious starvation.
+ * such as the tracing, so we increase min_delta_ns.
  */
 static inline void
-hrtimer_interrupt_hanging(struct clock_event_device *dev,
-			ktime_t try_time)
+hrtimer_interrupt_hanging(struct clock_event_device *dev)
 {
-	force_clock_reprogram = 1;
-	dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
-	printk(KERN_WARNING "hrtimer: interrupt too slow, "
-		"forcing clock min delta to %lu ns\n", dev->min_delta_ns);
+	dev->min_delta_ns += dev->min_delta_ns >> 1;

I hadn't thought about the guest being scheduled out in the middle of the timers servicing, which makes this check, based on the time spent in hrtimer_interrupt(), wrong.

I guess there is no easy/generic/cheap way to rebase this check on the _virtual_ time spent in the timers servicing. By virtual, I mean the time spent in the guest only.

In a non-guest kernel, the old check forces an adaptive rate sharing:

- we spent n nanosecs to service the batch of timers.
- we are hanging
- we want at least 3/4 of time reserved for non-timer servicing in the kernel, this is a minimum prerequisite for the system to not starve
- adapt the min_clock_delta to fit the above constraint

All that does not make sense anymore in a guest. The hang detection and warnings, the recalibrations of the min_clock_deltas are completely wrong in this context. Not only does it spuriously warn, but the minimum timer is increasing slowly and the guest progressively suffers from higher and higher latencies. That's really bad.

Your patch lowers the immediate impact and makes this illness evolve more smoothly by scaling down the recalibration of the min_clock_delta. This appeases the bug but doesn't solve it. I fear it could be even worse because it makes it more discreet.

Maybe we can instead increase the minimum threshold of loops in the hrtimer interrupt before considering it as a hang? Hmm, but too high a number could make this check useless, depending on the number of pending timers, which is a finite number.

Well, actually I'm not confident anymore in this check. Or actually we should change it. Maybe we can rebase it on the time spent on the hrtimer interrupt (and check it every 10
buildbot failure in qemu-kvm on default_x86_64_out_of_tree
The Buildbot has detected a new failure of default_x86_64_out_of_tree on qemu-kvm. Full details are available at: http://buildbot.b1-systems.de/qemu-kvm/builders/default_x86_64_out_of_tree/builds/43 Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/ Buildslave for this Build: b1_qemu_kvm_1 Build Reason: The Nightly scheduler named 'nightly_default' triggered this build Build Source Stamp: [branch master] HEAD Blamelist: BUILD FAILED: failed git sincerely, -The Buildbot
Re: kvm guest: hrtimer: interrupt too slow
On Thu, Oct 08, 2009 at 01:17:35AM +0200, Frederic Weisbecker wrote:

(Adding Thomas in Cc)

On Sat, Oct 03, 2009 at 08:12:05PM -0300, Marcelo Tosatti wrote:

Michael,

Can you please give the patch below a try? (Without acpi_pm timer or priority adjustments for the guest.)

On Tue, Sep 29, 2009 at 05:12:17PM +0400, Michael Tokarev wrote:

Hello. I'm having quite an.. unusable system here. It's not really a regression with 0.11.0, it was something similar before, but with 0.11.0 and/or 2.6.31 it became much worse. The thing is that after some uptime, the kvm guest prints something like this:

hrtimer: interrupt too slow, forcing clock min delta to 461487495 ns

after which the system (guest) speed becomes very slow. The above message is from a 2.6.31 guest running with 0.11.0 and a 2.6.31 host. Before, I tried it with 0.10.6 and 2.6.30 or 2.6.27, and the deltas were a bit less than that:

hrtimer: interrupt too slow, forcing clock min delta to 15415 ns
hrtimer: interrupt too slow, forcing clock min delta to 93629025 ns

It seems the way hrtimer_interrupt_hanging calculates min_delta is wrong (especially for virtual machines). The guest vcpu can be scheduled out during the execution of the hrtimer callbacks (and the callbacks themselves can do operations that translate to blocking operations in the hypervisor). So high min_delta values can be calculated if, for example, a single hrtimer_interrupt run takes two host time slices to execute, while some other higher priority task runs for N slices in between.

Using the hrtimer_interrupt execution time (which can be the worst case at any given time) as the min_delta is problematic. So simply increase min_delta_ns by 50% on every detected failure, which will eventually lead to an acceptable threshold (the algorithm should also scale back down to a lower min_delta, to adjust back to wealthier times).
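To make the difference between the two recalibration rules concrete, here is a minimal user-space sketch in C. It is not kernel code: the function names are hypothetical, and it only mirrors the arithmetic of the removed line (three times the measured handler time) and the added line (current value plus half of itself) from the patch in this thread.

```c
#include <stdint.h>

/*
 * Old rule (removed by the patch): the new minimum delta is three times
 * the wall-clock time just measured in hrtimer_interrupt(). In a guest
 * that measurement also includes time the vcpu spent scheduled out.
 */
uint64_t old_policy_min_delta_ns(uint64_t try_time_ns)
{
    return try_time_ns * 3;
}

/*
 * Proposed rule: ignore the measurement and grow the current minimum
 * delta by 50% (x + x/2) each time a hang is detected.
 */
uint64_t new_policy_min_delta_ns(uint64_t min_delta_ns)
{
    return min_delta_ns + (min_delta_ns >> 1);
}

/* Apply the proposed rule once per detected hang (hypothetical helper). */
uint64_t min_delta_after_hangs(uint64_t start_ns, unsigned int hangs)
{
    uint64_t delta = start_ns;

    while (hangs--)
        delta = new_policy_min_delta_ns(delta);
    return delta;
}
```

Under the old rule, a single measurement of 153829165 ns (about 154 ms, plausible if the vcpu was descheduled mid-handler) is enough to yield the 461487495 ns reported above, assuming that value came straight from the old formula. Under the proposed rule, a hypothetical starting value of 1000 ns becomes 2250 ns after two detected hangs, so growth is gradual but, as noted in the discussion, never reverses.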
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 49da79a..8997978 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1234,28 +1234,20 @@ static void __run_hrtimer(struct hrtimer *timer)
 #ifdef CONFIG_HIGH_RES_TIMERS
-static int force_clock_reprogram;
-
 /*
  * After 5 iteration's attempts, we consider that hrtimer_interrupt()
  * is hanging, which could happen with something that slows the interrupt
- * such as the tracing. Then we force the clock reprogramming for each future
- * hrtimer interrupts to avoid infinite loops and use the min_delta_ns
- * threshold that we will overwrite.
- * The next tick event will be scheduled to 3 times we currently spend on
- * hrtimer_interrupt(). This gives a good compromise, the cpus will spend
- * 1/4 of their time to process the hrtimer interrupts. This is enough to
- * let it running without serious starvation.
+ * such as the tracing, so we increase min_delta_ns.
  */
 static inline void
-hrtimer_interrupt_hanging(struct clock_event_device *dev,
-			ktime_t try_time)
+hrtimer_interrupt_hanging(struct clock_event_device *dev)
 {
-	force_clock_reprogram = 1;
-	dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
-	printk(KERN_WARNING "hrtimer: interrupt too slow, "
-		"forcing clock min delta to %lu ns\n", dev->min_delta_ns);
+	dev->min_delta_ns += dev->min_delta_ns >> 1;

I hadn't thought about a guest that could be scheduled out in the middle of timer servicing, which makes this check, based on the time spent in hrtimer_interrupt(), wrong. I guess there is no easy/generic/cheap way to rebase this check on the _virtual_ time spent in timer servicing. By virtual, I mean the time spent in the guest only.

In a non-guest kernel, the old check forces an adaptive rate sharing:
- we spent n nanosecs to service the batch of timers.
- we are hanging
- we want at least 3/4 of the time reserved for non-timer servicing in the kernel; this is a minimum prerequisite for the system to not starve
- adapt the min_clock_delta to fit the above constraint

None of that makes sense anymore in a guest. The hang detection and warnings and the recalibrations of the min_clock_deltas are completely wrong in this context. Not only does it spuriously warn, but the minimum timer is increasing slowly and the guest progressively suffers from higher and higher latencies. That's really bad.

Your patch lowers the immediate impact and makes this illness evolve more smoothly by scaling down the recalibration of the min_clock_delta. This eases the bug but doesn't solve it. I fear it could be even worse because it makes it more discreet.

True.

Maybe we can instead increase the minimum number of loop iterations in the hrtimer interrupt before considering it a hang? Hmm, but too high a number could make this check useless, depending on the number of pending timers, which is a finite