[COMMIT master] Fix request_irq() for 2.6.19

2009-04-30 Thread Avi Kivity
From: Chris Wright chr...@sous-sol.org

The irq handler changes (introduced in 2.6.19, not 2.6.20) dropped
struct pt_regs from the handler prototype; the registers are found globally now.
This introduces the back compat for older kernels.  The handler is just
a thin layer which calls the real registered handler (all this to work
around a minor little compiler warning ;-)  Needed for device assignment
on older kernels.

Signed-off-by: Chris Wright chr...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com
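
For readers unfamiliar with the pattern, the thunk described above can be sketched in plain userspace C. This is a minimal illustration under stated assumptions, not the compat code itself: demo_regs, demo_handlers, demo_thunk, demo_register, demo_unregister and demo_count are hypothetical names, and struct demo_regs merely stands in for struct pt_regs. As in the patch, the table holds one handler per IRQ number, so a second registration on the same line is refused.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_IRQS 16            /* hypothetical table size, stands in for NR_IRQS */

struct demo_regs { int unused; };  /* placeholder for struct pt_regs */

/* New-style handler: no register pointer. */
typedef int (*new_handler_t)(int irq, void *dev_id);
/* Old-style handler, the shape a pre-2.6.19 style API would expect. */
typedef int (*old_handler_t)(int irq, void *dev_id, struct demo_regs *regs);

static new_handler_t demo_handlers[DEMO_NR_IRQS];

/* Thunk: has the old prototype, drops regs, forwards to the real handler. */
static int demo_thunk(int irq, void *dev_id, struct demo_regs *regs)
{
    (void)regs;
    return demo_handlers[irq](irq, dev_id);
}

/* Record a new-style handler; returns the old-style function to hand to
 * the legacy API, or NULL if the slot is out of range or already taken. */
static old_handler_t demo_register(unsigned int irq, new_handler_t handler)
{
    if (irq >= DEMO_NR_IRQS || demo_handlers[irq])
        return NULL;
    demo_handlers[irq] = handler;
    return demo_thunk;
}

/* Clear the slot so the IRQ number can be registered again. */
static void demo_unregister(unsigned int irq)
{
    if (irq < DEMO_NR_IRQS)
        demo_handlers[irq] = NULL;
}

/* Example new-style handler: adds the irq number to a counter in dev_id. */
static int demo_count(int irq, void *dev_id)
{
    *(int *)dev_id += irq;
    return 1;
}
```

The legacy side only ever sees demo_thunk; the table lookup recovers the real handler. The price of a single slot per IRQ is that shared interrupt lines cannot be supported this way.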

diff --git a/external-module-compat-comm.h b/external-module-compat-comm.h
index c955927..8cb5440 100644
--- a/external-module-compat-comm.h
+++ b/external-module-compat-comm.h
@@ -641,19 +641,41 @@ static inline int pci_reset_function(struct pci_dev *dev)
 #endif
 
 #include <linux/interrupt.h>
-#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,20)
+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,19)
+
+typedef irqreturn_t (*kvm_irq_handler_t)(int, void *);
+static kvm_irq_handler_t kvm_irq_handlers[NR_IRQS];
+
+static irqreturn_t kvm_irq_thunk(int irq, void *dev_id, struct pt_regs *regs)
+{
+   kvm_irq_handler_t handler = kvm_irq_handlers[irq];
+   return handler(irq, dev_id);
+}
 
-typedef irqreturn_t (*kvm_irq_handler_t)(int, void *, struct pt_regs *);
 static inline int kvm_request_irq(unsigned int a, kvm_irq_handler_t handler,
  unsigned long c, const char *d, void *e)
 {
-   /* FIXME: allocate thunk, etc. */
-   return -EINVAL;
+   int rc;
+   kvm_irq_handler_t old = kvm_irq_handlers[a];
+   if (old)
+       return -EBUSY;
+   kvm_irq_handlers[a] = handler;
+   rc = request_irq(a, kvm_irq_thunk, c, d, e);
+   if (rc)
+       kvm_irq_handlers[a] = NULL;
+   return rc;
+}
+
+static inline void kvm_free_irq(unsigned int irq, void *dev_id)
+{
+   free_irq(irq, dev_id);
+   kvm_irq_handlers[irq] = NULL;
 }
 
 #else
 
 #define kvm_request_irq request_irq
+#define kvm_free_irq free_irq
 
 #endif
 
diff --git a/ia64/hack-module.awk b/ia64/hack-module.awk
index 2e4e05f..dda3347 100644
--- a/ia64/hack-module.awk
+++ b/ia64/hack-module.awk
@@ -2,7 +2,7 @@ BEGIN { split("INIT_WORK on_each_cpu smp_call_function " \
  "hrtimer_add_expires_ns hrtimer_get_expires " \
  "hrtimer_get_expires_ns hrtimer_start_expires " \
  "hrtimer_expires_remaining " \
- "request_irq", compat_apis); }
+ "request_irq free_irq", compat_apis); }
 
 /MODULE_AUTHOR/ {
 printf("MODULE_INFO(version, \"%s\");\n", version)
diff --git a/x86/hack-module.awk b/x86/hack-module.awk
index 260eeef..bdb873a 100644
--- a/x86/hack-module.awk
+++ b/x86/hack-module.awk
@@ -2,7 +2,7 @@ BEGIN { split("INIT_WORK desc_struct ldttss_desc64 desc_ptr " \
  "hrtimer_add_expires_ns hrtimer_get_expires " \
  "hrtimer_get_expires_ns hrtimer_start_expires " \
  "hrtimer_expires_remaining " \
- "on_each_cpu relay_open request_irq ", compat_apis); }
+ "on_each_cpu relay_open request_irq free_irq ", compat_apis); }
 
 /^int kvm_init\(/ { anon_inodes = 1 }
 
--
To unsubscribe from this list: send the line unsubscribe kvm-commits in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[COMMIT master] Set default configure options for ia64

2009-04-30 Thread Avi Kivity
From: Xiantao Zhang xiantao.zh...@intel.com

1. Disable xen config support for ia64.
2. Only configure ia64-softmmu for ia64.

Signed-off-by: Xiantao Zhang xiantao.zh...@intel.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/configure b/configure
index fc0fb9b..5f448a2 100755
--- a/configure
+++ b/configure
@@ -337,6 +337,8 @@ if [ $cpu = i386 -o $cpu = x86_64 ] ; then
 fi
 if [ $cpu = ia64 ] ; then
  kvm=yes
+ xen=no
+ target_list=ia64-softmmu
  cpu_emulation=no
  gdbstub=no
  slirp=no


[COMMIT master] exec.c: fix typo in comment (fluch -> flush)

2009-04-30 Thread Avi Kivity
From: Sebastian Herbszt herb...@gmx.de

Fix typo in comment in exec.c (fluch -> flush).

Signed-off-by: Sebastian Herbszt herb...@gmx.de
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/exec.c b/exec.c
index 16d3cf8..0420f29 100644
--- a/exec.c
+++ b/exec.c
@@ -3187,7 +3187,7 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
 (0xff & ~CODE_DIRTY_FLAG);
 }
/* qemu doesn't execute guest code directly, but kvm does
-  therefore fluch instruction caches */
+  therefore flush instruction caches */
if (kvm_enabled())
flush_icache_range((unsigned long)ptr,
   ((unsigned long)ptr)+l);


[COMMIT master] kvm: Set kvm_arch=powerpc for PPC builds.

2009-04-30 Thread Avi Kivity
From: Hollis Blanchard holl...@us.ibm.com

The name of the Linux arch directory is powerpc, not ppc.

Signed-off-by: Hollis Blanchard holl...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com
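
The mapping that the configure script's case statement performs can be restated as a small C function. This is only a sketch of the lookup, not part of the build system: demo_kvm_arch is a hypothetical name, and the three branches mirror the i386/x86_64, ppc and default arms visible in the hunk below.

```c
#include <assert.h>
#include <string.h>

/* Map a configure-style cpu name to the Linux arch directory name. */
static const char *demo_kvm_arch(const char *cpu)
{
    if (strcmp(cpu, "i386") == 0 || strcmp(cpu, "x86_64") == 0)
        return "x86";
    if (strcmp(cpu, "ppc") == 0)
        return "powerpc";   /* the kernel directory is arch/powerpc */
    return cpu;             /* default: directory name equals cpu name */
}
```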

diff --git a/configure b/configure
index 5f448a2..9a635ae 100755
--- a/configure
+++ b/configure
@@ -818,6 +818,9 @@ case $cpu in
 i386 | x86_64)
kvm_arch=x86
;;
+ppc)
+kvm_arch=powerpc
+;;
 *)
kvm_arch=$cpu
;;


[COMMIT master] Remove memalign call for guess_disk_lchs

2009-04-30 Thread Avi Kivity
From: Anthony Liguori aligu...@us.ibm.com

This code doesn't exist in upstream QEMU because it is not necessary to
provide an aligned buffer to bdrv_read.  The API has always worked this way
although at one point, the bouncing was broken which is what led to this
patch.

The places where qemu_memalign() is used in QEMU are only where performance is
sensitive.  guess_disk_lchs does not fall into this category.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com
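
The effect of switching to a stack buffer can be illustrated outside QEMU. The sketch below is an assumption-laden stand-in, not the block layer: demo_read_sector0 is a hypothetical replacement for bdrv_read() that copies from an in-memory image, and demo_probe keeps only the MS-DOS magic test from guess_disk_lchs. The point it shows is that with a local array every early return is safe; the heap version in the removed lines appears to return without freeing the buffer when the read fails.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SECTOR_SIZE 512

/* Returns 1 if the sector ends with the MS-DOS boot signature 0x55 0xaa. */
static int demo_has_msdos_magic(const uint8_t *sector)
{
    return sector[510] == 0x55 && sector[511] == 0xaa;
}

/* Hypothetical reader standing in for bdrv_read(): copies sector 0
 * out of an in-memory image, or fails if the image is too short. */
static int demo_read_sector0(const uint8_t *image, size_t image_len,
                             uint8_t *buf)
{
    if (image_len < DEMO_SECTOR_SIZE)
        return -1;
    memcpy(buf, image, DEMO_SECTOR_SIZE);
    return 0;
}

/* Returns 0 if sector 0 carries the boot signature, -1 otherwise. */
static int demo_probe(const uint8_t *image, size_t image_len)
{
    uint8_t buf[DEMO_SECTOR_SIZE];  /* on the stack: nothing to free */

    if (demo_read_sector0(image, image_len, buf) < 0)
        return -1;
    if (!demo_has_msdos_magic(buf))
        return -1;
    return 0;
}
```

A 512-byte local array is small enough for the stack here, and it needs no particular alignment because, as the message says, the read path bounces unaligned buffers itself.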

diff --git a/block.c b/block.c
index 8e08f32..8348cf2 100644
--- a/block.c
+++ b/block.c
@@ -30,7 +30,6 @@
 #include "qemu-common.h"
 #include "monitor.h"
 #include "block_int.h"
-#include "osdep.h"
 
 #ifdef HOST_BSD
 #include <sys/types.h>
@@ -772,26 +771,20 @@ struct partition {
 static int guess_disk_lchs(BlockDriverState *bs,
int *pcylinders, int *pheads, int *psectors)
 {
-uint8_t *buf;
+uint8_t buf[512];
 int ret, i, heads, sectors, cylinders;
 struct partition *p;
 uint32_t nr_sects;
 uint64_t nb_sectors;
 
-buf = qemu_memalign(512, 512);
-if (buf == NULL)
-return -1;
-
 bdrv_get_geometry(bs, &nb_sectors);
 
 ret = bdrv_read(bs, 0, buf, 1);
 if (ret < 0)
 return -1;
 /* test msdos magic */
-if (buf[510] != 0x55 || buf[511] != 0xaa) {
-qemu_free(buf);
+if (buf[510] != 0x55 || buf[511] != 0xaa)
 return -1;
-}
 for(i = 0; i < 4; i++) {
 p = ((struct partition *)(buf + 0x1be)) + i;
 nr_sects = le32_to_cpu(p->nr_sects);
@@ -812,11 +805,9 @@ static int guess_disk_lchs(BlockDriverState *bs,
 printf("guessed geometry: LCHS=%d %d %d\n",
cylinders, heads, sectors);
 #endif
-qemu_free(buf);
 return 0;
 }
 }
-qemu_free(buf);
 return -1;
 }
 


[COMMIT master] Remove unnecessary setting of cmos smp_cpu count

2009-04-30 Thread Avi Kivity
From: Anthony Liguori aligu...@us.ibm.com

This is duplicate code.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/hw/pc.c b/hw/pc.c
index 35f9527..db34f53 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -265,7 +265,6 @@ static void cmos_init(ram_addr_t ram_size, ram_addr_t above_4g_mem_size,
 rtc_set_memory(s, 0x5c, (unsigned int)above_4g_mem_size >> 24);
 rtc_set_memory(s, 0x5d, (uint64_t)above_4g_mem_size >> 32);
 }
-rtc_set_memory(s, 0x5f, smp_cpus - 1);
 
 if (ram_size > (16 * 1024 * 1024))
 val = (ram_size / 65536) - ((16 * 1024 * 1024) / 65536);


[COMMIT master] Remove dead macros likely/unlikely in exec.c

2009-04-30 Thread Avi Kivity
From: Anthony Liguori aligu...@us.ibm.com

More leftovers from the old migration code.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/exec.c b/exec.c
index 0420f29..0c5545e 100644
--- a/exec.c
+++ b/exec.c
@@ -3500,14 +3500,6 @@ uint32_t lduw_phys(target_phys_addr_t addr)
 return tswap16(val);
 }
 
-#ifdef __GNUC__
-#define likely(x) __builtin_expect(!!(x), 1)
-#define unlikely(x) __builtin_expect(!!(x), 0)
-#else
-#define likely(x) x
-#define unlikely(x) x
-#endif
-
 /* warning: addr must be aligned. The ram page is not masked as dirty
and the code inside is not invalidated. It is useful if the dirty
bits are used to track modified PTEs */


[COMMIT master] Remove stray whitespace

2009-04-30 Thread Avi Kivity
From: Anthony Liguori aligu...@us.ibm.com

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/block-vmdk.c b/block-vmdk.c
index ff5007c..d47d483 100644
--- a/block-vmdk.c
+++ b/block-vmdk.c
@@ -93,6 +93,7 @@ typedef struct ActiveBDRVState{
 
 static ActiveBDRVState activeBDRV;
 
+
 static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename)
 {
 uint32_t magic;
diff --git a/vl.c b/vl.c
index 9ff4a5a..15f85e2 100644
--- a/vl.c
+++ b/vl.c
@@ -21,7 +21,6 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-
 #include <unistd.h>
 #include <fcntl.h>
 #include <signal.h>


[COMMIT master] Remove devfn from BlockDriverState

2009-04-30 Thread Avi Kivity
From: Anthony Liguori aligu...@us.ibm.com

It's no longer used.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/block_int.h b/block_int.h
index 951ff02..e10b906 100644
--- a/block_int.h
+++ b/block_int.h
@@ -150,8 +150,6 @@ struct BlockDriverState {
 int cyls, heads, secs, translation;
 int type;
 char device_name[32];
-/* PCI devfn of parent */
-int devfn;
 BlockDriverState *next;
 void *private;
 };


[COMMIT master] Remove IBM copyright in unmodified file in upstream

2009-04-30 Thread Avi Kivity
From: Anthony Liguori aligu...@us.ibm.com

Presumably, it would carry an IBM copyright upstream if needed.  qemu-kvm
introduces no additional code over upstream QEMU in this file.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Acked-by: Hollis Blanchard holl...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/hw/ppc4xx.h b/hw/ppc4xx.h
index 25a91bd..7832cd9 100644
--- a/hw/ppc4xx.h
+++ b/hw/ppc4xx.h
@@ -3,9 +3,6 @@
  *
  * Copyright (c) 2007 Jocelyn Mayer
  *
- * Copyright 2008 IBM Corp.
- * Authors: Hollis Blanchard holl...@us.ibm.com
- *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
  * in the Software without restriction, including without limitation the rights


[COMMIT master] Re-add vga dirty logging bits dropped by merge 02d4417f75

2009-04-30 Thread Avi Kivity
From: Avi Kivity a...@redhat.com

The last qemu.git merge broke vga.  Revert the vga changes pending
better dirty logging support.

Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/hw/cirrus_vga.c b/hw/cirrus_vga.c
index 3e67acd..20f17a6 100644
--- a/hw/cirrus_vga.c
+++ b/hw/cirrus_vga.c
@@ -2619,6 +2619,7 @@ static CPUWriteMemoryFunc *cirrus_linear_bitblt_write[3] = {
 
 static void map_linear_vram(CirrusVGAState *s)
 {
+vga_dirty_log_stop((VGAState *)s);
 if (!s->map_addr && s->lfb_addr && s->lfb_end) {
 s->map_addr = s->lfb_addr;
 s->map_end = s->lfb_end;
@@ -2631,11 +2632,16 @@ static void map_linear_vram(CirrusVGAState *s)
 #ifndef TARGET_IA64
 s->lfb_vram_mapped = 0;
 
+cpu_register_physical_memory(isa_mem_base + 0xa0000, 0x8000,
+(s->vram_offset + s->cirrus_bank_base[0]) | IO_MEM_UNASSIGNED);
+cpu_register_physical_memory(isa_mem_base + 0xa8000, 0x8000,
+(s->vram_offset + s->cirrus_bank_base[1]) | IO_MEM_UNASSIGNED);
 if (!(s->cirrus_srcptr != s->cirrus_srcptr_end)
  && !((s->sr[0x07] & 0x01) == 0)
  && !((s->gr[0x0B] & 0x14) == 0x14)
  && !(s->gr[0x0B] & 0x02)) {
 
+vga_dirty_log_stop((VGAState *)s);
 cpu_register_physical_memory(isa_mem_base + 0xa0000, 0x8000,
 (s->vram_offset + s->cirrus_bank_base[0]) | IO_MEM_RAM);
 cpu_register_physical_memory(isa_mem_base + 0xa8000, 0x8000,
@@ -2654,11 +2660,14 @@ static void map_linear_vram(CirrusVGAState *s)
 
 static void unmap_linear_vram(CirrusVGAState *s)
 {
+vga_dirty_log_stop((VGAState *)s);
 if (s->map_addr && s->lfb_addr && s->lfb_end)
 s->map_addr = s->map_end = 0;
 
 cpu_register_physical_memory(isa_mem_base + 0xa0000, 0x20000,
  s->vga_io_memory);
+
+vga_dirty_log_start((VGAState *)s);
 }
 
 /* Compute the memory access functions */
@@ -3305,6 +3314,8 @@ static void cirrus_pci_lfb_map(PCIDevice *d, int region_num,
 {
 CirrusVGAState *s = &((PCICirrusVGAState *)d)->cirrus_vga;
 
+vga_dirty_log_stop((VGAState *)s);
+
 /* XXX: add byte swapping apertures */
 cpu_register_physical_memory(addr, s->vram_size,
 s->cirrus_linear_io_addr);
@@ -3336,10 +3347,14 @@ static void pci_cirrus_write_config(PCIDevice *d,
 PCICirrusVGAState *pvs = container_of(d, PCICirrusVGAState, dev);
 CirrusVGAState *s = &pvs->cirrus_vga;
 
+vga_dirty_log_stop((VGAState *)s);
+
 pci_default_write_config(d, address, val, len);
 if (s->map_addr && pvs->dev.io_regions[0].addr == -1)
 s->map_addr = 0;
 cirrus_update_memory_access(s);
+
+vga_dirty_log_start((VGAState *)s);
 }
 
 void pci_cirrus_vga_init(PCIBus *bus, int vga_ram_size)
diff --git a/hw/vga.c b/hw/vga.c
index 4931b69..9ab6e1a 100644
--- a/hw/vga.c
+++ b/hw/vga.c
@@ -1280,6 +1280,8 @@ static void vga_draw_text(VGAState *s, int full_update)
 vga_draw_glyph8_func *vga_draw_glyph8;
 vga_draw_glyph9_func *vga_draw_glyph9;
 
+vga_dirty_log_stop(s);
+
 /* compute font data address (in plane 2) */
 v = s->sr[3];
 offset = (((v >> 4) & 1) | ((v << 1) & 6)) * 8192 * 4 + 2;
@@ -1578,6 +1580,7 @@ static void vga_sync_dirty_bitmap(VGAState *s)
 cpu_physical_sync_dirty_bitmap(isa_mem_base + 0xa0000, 0xa8000);
 cpu_physical_sync_dirty_bitmap(isa_mem_base + 0xa8000, 0xb0000);
 }
+vga_dirty_log_start(s);
 }
 
 /*
@@ -1809,6 +1812,7 @@ static void vga_draw_blank(VGAState *s, int full_update)
 return;
 if (s->last_scr_width <= 0 || s->last_scr_height <= 0)
 return;
+vga_dirty_log_stop(s);
 
 s->rgb_to_pixel =
 rgb_to_pixel_dup_table[get_depth_index(s->ds)];
@@ -2258,6 +2262,18 @@ void vga_dirty_log_start(VGAState *s)
 }
 }
 
+void vga_dirty_log_stop(VGAState *s)
+{
+if (kvm_enabled() && s->map_addr && s1)
+kvm_log_stop(s->map_addr, s->map_end - s->map_addr);
+
+if (kvm_enabled() && s->lfb_vram_mapped && s2) {
+kvm_log_stop(isa_mem_base + 0xa0000, 0x8000);
+kvm_log_stop(isa_mem_base + 0xa8000, 0x8000);
+}
+s1 = s2 = 0;
+}
+
 static void vga_map(PCIDevice *pci_dev, int region_num,
 uint32_t addr, uint32_t size, int type)
 {
@@ -2267,10 +2283,12 @@ static void vga_map(PCIDevice *pci_dev, int region_num,
 cpu_register_physical_memory(addr, s->bios_size, s->bios_offset);
 } else {
 cpu_register_physical_memory(addr, s->vram_size, s->vram_offset);
-s->map_addr = addr;
-s->map_end = addr + s->vram_size;
-vga_dirty_log_start(s);
 }
+
+s->map_addr = addr;
+s->map_end = addr + VGA_RAM_SIZE;
+
+vga_dirty_log_start(s);
 }
 
 void vga_common_init(VGAState *s, int vga_ram_size)
@@ -2498,9 +2516,11 @@ static void pci_vga_write_config(PCIDevice *d,
 PCIVGAState *pvs = container_of(d, PCIVGAState, dev);
 VGAState *s = &pvs->vga_state;
 
+vga_dirty_log_stop(s);
 pci_default_write_config(d, 

[COMMIT master] kvm: Add header files for ia64

2009-04-30 Thread Avi Kivity
From: Xiantao Zhang xiantao.zh...@intel.com

Signed-off-by: Xiantao Zhang xiantao.zh...@intel.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/kvm/kernel/arch/ia64/include/asm/kvm.h b/kvm/kernel/arch/ia64/include/asm/kvm.h
new file mode 100644
index 0000000..73963e3
--- /dev/null
+++ b/kvm/kernel/arch/ia64/include/asm/kvm.h
@@ -0,0 +1,303 @@
+#ifndef KVM_UNIFDEF_H
+#define KVM_UNIFDEF_H
+
+#ifdef __i386__
+#ifndef CONFIG_X86_32
+#define CONFIG_X86_32 1
+#endif
+#endif
+
+#ifdef __x86_64__
+#ifndef CONFIG_X86_64
+#define CONFIG_X86_64 1
+#endif
+#endif
+
+#if defined(__i386__) || defined (__x86_64__)
+#ifndef CONFIG_X86
+#define CONFIG_X86 1
+#endif
+#endif
+
+#ifdef __ia64__
+#ifndef CONFIG_IA64
+#define CONFIG_IA64 1
+#endif
+#endif
+
+#ifdef __PPC__
+#ifndef CONFIG_PPC
+#define CONFIG_PPC 1
+#endif
+#endif
+
+#ifdef __s390__
+#ifndef CONFIG_S390
+#define CONFIG_S390 1
+#endif
+#endif
+
+#endif
+#ifndef __ASM_IA64_KVM_H
+#define __ASM_IA64_KVM_H
+
+/*
+ * kvm structure definitions  for ia64
+ *
+ * Copyright (C) 2007 Xiantao Zhang xiantao.zh...@intel.com
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ */
+
+#include <asm/types.h>
+#include <linux/ioctl.h>
+
+/* Select x86 specific features in linux/kvm.h */
+#define __KVM_HAVE_IOAPIC
+#define __KVM_HAVE_DEVICE_ASSIGNMENT
+
+/* Architectural interrupt line count. */
+#define KVM_NR_INTERRUPTS 256
+
+#define KVM_IOAPIC_NUM_PINS  48
+
+struct kvm_ioapic_state {
+   __u64 base_address;
+   __u32 ioregsel;
+   __u32 id;
+   __u32 irr;
+   __u32 pad;
+   union {
+   __u64 bits;
+   struct {
+   __u8 vector;
+   __u8 delivery_mode:3;
+   __u8 dest_mode:1;
+   __u8 delivery_status:1;
+   __u8 polarity:1;
+   __u8 remote_irr:1;
+   __u8 trig_mode:1;
+   __u8 mask:1;
+   __u8 reserve:7;
+   __u8 reserved[4];
+   __u8 dest_id;
+   } fields;
+   } redirtbl[KVM_IOAPIC_NUM_PINS];
+};
+
+#define KVM_IRQCHIP_PIC_MASTER   0
+#define KVM_IRQCHIP_PIC_SLAVE    1
+#define KVM_IRQCHIP_IOAPIC   2
+
+#define KVM_CONTEXT_SIZE   8*1024
+
+struct kvm_fpreg {
+   union {
+   unsigned long bits[2];
+   long double __dummy;    /* force 16-byte alignment */
+   } u;
+};
+
+union context {
+   /* 8K size */
+   char    dummy[KVM_CONTEXT_SIZE];
+   struct {
+   unsigned long   psr;
+   unsigned long   pr;
+   unsigned long   caller_unat;
+   unsigned long   pad;
+   unsigned long   gr[32];
+   unsigned long   ar[128];
+   unsigned long   br[8];
+   unsigned long   cr[128];
+   unsigned long   rr[8];
+   unsigned long   ibr[8];
+   unsigned long   dbr[8];
+   unsigned long   pkr[8];
+   struct kvm_fpreg   fr[128];
+   };
+};
+
+struct thash_data {
+   union {
+   struct {
+   unsigned long p    :  1; /* 0 */
+   unsigned long rv1  :  1; /* 1 */
+   unsigned long ma   :  3; /* 2-4 */
+   unsigned long a    :  1; /* 5 */
+   unsigned long d    :  1; /* 6 */
+   unsigned long pl   :  2; /* 7-8 */
+   unsigned long ar   :  3; /* 9-11 */
+   unsigned long ppn  : 38; /* 12-49 */
+   unsigned long rv2  :  2; /* 50-51 */
+   unsigned long ed   :  1; /* 52 */
+   unsigned long ig1  : 11; /* 53-63 */
+   };
+   struct {
+   unsigned long __rv1 : 53; /* 0-52 */
+   unsigned long contiguous : 1; /*53 */
+   unsigned long tc : 1; /* 54 TR or TC */
+   unsigned long cl : 1;
+   /* 55 I side or D side cache line */
+   unsigned long len  :  4;  /* 56-59 */
+   unsigned long io  : 1;  /* 60 entry is for io or not */
+

[COMMIT master] Leave upstream QEMU comments intact

2009-04-30 Thread Avi Kivity
From: Anthony Liguori aligu...@us.ibm.com

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/vl.c b/vl.c
index 1d568bd..fbc84a7 100644
--- a/vl.c
+++ b/vl.c
@@ -5488,10 +5488,10 @@ int main(int argc, char **argv, char **envp)
 if (bt_parse(bt_opts[i]))
 exit(1);
 
+/* init the memory */
 if (ram_size == 0)
 ram_size = DEFAULT_RAM_SIZE * 1024 * 1024;
 
-/* init the memory */
 if (kvm_enabled()) {
if (kvm_qemu_create_context() < 0) {
fprintf(stderr, "Could not create KVM context\n");


[COMMIT master] Remove #define __user in usb-linux.c

2009-04-30 Thread Avi Kivity
From: Anthony Liguori aligu...@us.ibm.com

The Makefile defines __user, so this is unnecessary.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/usb-linux.c b/usb-linux.c
index 26643bd..70d7a1c 100644
--- a/usb-linux.c
+++ b/usb-linux.c
@@ -34,10 +34,6 @@
 #include "qemu-timer.h"
 #include "monitor.h"
 
-#if defined(__linux__)
-#define __user
-#endif
-
 #include <dirent.h>
 #include <sys/ioctl.h>
 #include <signal.h>


[COMMIT master] KVM: Make kvm_cpu_(has|get)_interrupt() work for userspace irqchip too

2009-04-30 Thread Avi Kivity
From: Gleb Natapov g...@redhat.com

At the vector level, kernel and userspace irqchip are fairly similar.

Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com
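
The dispatch this patch adds to kvm_cpu_has_interrupt() and kvm_cpu_get_interrupt() can be sketched with a toy vcpu. Everything named demo_* is hypothetical, and the model is deliberately simplified: userspace-injected vectors are a single 32-bit mask rather than KVM's real pending bitmap, the in-kernel chip is reduced to one pending vector, and popping the lowest set bit is an arbitrary choice made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_vcpu {
    bool irqchip_in_kernel;     /* stands in for irqchip_in_kernel(v->kvm) */
    unsigned int user_pending;  /* vectors queued by userspace, one bit each */
    int kernel_vector;          /* in-kernel chip's pending vector, -1 if none */
};

/* Remove and return one queued userspace vector, or -1 if none is set. */
static int demo_pop_irq(struct demo_vcpu *v)
{
    for (int vec = 0; vec < 32; vec++) {
        if (v->user_pending & (1u << vec)) {
            v->user_pending &= ~(1u << vec);
            return vec;
        }
    }
    return -1;
}

/* One query for both irqchip models: is any vector waiting? */
static bool demo_cpu_has_interrupt(const struct demo_vcpu *v)
{
    if (!v->irqchip_in_kernel)
        return v->user_pending != 0;
    return v->kernel_vector != -1;
}

/* One accessor for both irqchip models: take the next vector. */
static int demo_cpu_get_interrupt(struct demo_vcpu *v)
{
    int vector;

    if (!v->irqchip_in_kernel)
        return demo_pop_irq(v);
    vector = v->kernel_vector;
    v->kernel_vector = -1;
    return vector;
}
```

With both paths behind the same two functions, callers such as the SVM and VMX injection code no longer need to look at the userspace summary field directly, which is what the hunks below change.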

diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index cf17ed5..11c2757 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -24,6 +24,7 @@
 
 #include "irq.h"
 #include "i8254.h"
+#include "x86.h"
 
 /*
  * check if there are pending timer events
@@ -48,6 +49,9 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
 {
struct kvm_pic *s;
 
+   if (!irqchip_in_kernel(v->kvm))
+   return v->arch.irq_summary;
+
if (kvm_apic_has_interrupt(v) == -1) {  /* LAPIC */
if (kvm_apic_accept_pic_intr(v)) {
s = pic_irqchip(v->kvm);/* PIC */
@@ -67,6 +71,9 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
struct kvm_pic *s;
int vector;
 
+   if (!irqchip_in_kernel(v->kvm))
+   return kvm_pop_irq(v);
+
vector = kvm_get_apic_interrupt(v); /* APIC */
if (vector == -1) {
if (kvm_apic_accept_pic_intr(v)) {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 053f3c5..1903c27 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2089,8 +2089,9 @@ static int interrupt_window_interception(struct vcpu_svm *svm,
 * If the user space waits to inject interrupts, exit as soon as
 * possible
 */
-   if (kvm_run->request_interrupt_window &&
-   !svm->vcpu.arch.irq_summary) {
+   if (!irqchip_in_kernel(svm->vcpu.kvm) &&
+   kvm_run->request_interrupt_window &&
+   !kvm_cpu_has_interrupt(&svm->vcpu)) {
 ++svm->vcpu.stat.irq_window_exits;
kvm_run->exit_reason = KVM_EXIT_IRQ_WINDOW_OPEN;
return 0;
@@ -2371,7 +2372,8 @@ static void do_interrupt_requests(struct kvm_vcpu *vcpu,
 (svm->vmcb->save.rflags & X86_EFLAGS_IF) &&
 (svm->vcpu.arch.hflags & HF_GIF_MASK));
 
-   if (svm->vcpu.arch.interrupt_window_open && svm->vcpu.arch.irq_summary)
+   if (svm->vcpu.arch.interrupt_window_open &&
+   kvm_cpu_has_interrupt(&svm->vcpu))
/*
 * If interrupts enabled, and not blocked by sti or mov ss. Good.
 */
@@ -2381,7 +2383,8 @@ static void do_interrupt_requests(struct kvm_vcpu *vcpu,
 * Interrupts blocked.  Wait for unblock.
 */
if (!svm->vcpu.arch.interrupt_window_open &&
-   (svm->vcpu.arch.irq_summary || kvm_run->request_interrupt_window))
+   (kvm_cpu_has_interrupt(&svm->vcpu) ||
+kvm_run->request_interrupt_window))
svm_set_vintr(svm);
else
svm_clear_vintr(svm);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c6997c0..b3292c1 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2535,21 +2535,20 @@ static void do_interrupt_requests(struct kvm_vcpu *vcpu,
vmx_inject_nmi(vcpu);
if (vcpu->arch.nmi_pending)
enable_nmi_window(vcpu);
-   else if (vcpu->arch.irq_summary
-|| kvm_run->request_interrupt_window)
+   else if (kvm_cpu_has_interrupt(vcpu) ||
+kvm_run->request_interrupt_window)
enable_irq_window(vcpu);
return;
}
 
if (vcpu->arch.interrupt_window_open) {
-   if (vcpu->arch.irq_summary && !vcpu->arch.interrupt.pending)
-   kvm_queue_interrupt(vcpu, kvm_pop_irq(vcpu));
+   if (kvm_cpu_has_interrupt(vcpu) && !vcpu->arch.interrupt.pending)
+   kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu));
 
if (vcpu->arch.interrupt.pending)
vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
-   }
-   if (!vcpu->arch.interrupt_window_open &&
-   (vcpu->arch.irq_summary || kvm_run->request_interrupt_window))
+   } else if(kvm_cpu_has_interrupt(vcpu) ||
+ kvm_run->request_interrupt_window)
enable_irq_window(vcpu);
 }
 
@@ -2976,8 +2975,9 @@ static int handle_interrupt_window(struct kvm_vcpu *vcpu,
 * If the user space waits to inject interrupts, exit as soon as
 * possible
 */
-   if (kvm_run->request_interrupt_window &&
-   !vcpu->arch.irq_summary) {
+   if (!irqchip_in_kernel(vcpu->kvm) &&
+   kvm_run->request_interrupt_window &&
+   !kvm_cpu_has_interrupt(vcpu)) {
kvm_run->exit_reason = KVM_EXIT_IRQ_WINDOW_OPEN;
return 0;
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ab1fdac..8c730ad 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3065,7 +3065,7 @@ EXPORT_SYMBOL_GPL(kvm_emulate_cpuid);
 static int dm_request_for_irq_injection(struct kvm_vcpu *vcpu,
  struct kvm_run *kvm_run)
 {
-   

[COMMIT master] KVM: VMX: Consolidate userspace and kernel interrupt injection for VMX

2009-04-30 Thread Avi Kivity
From: Gleb Natapov g...@redhat.com

Use the same callback to inject irq/nmi events no matter what irqchip is
in use. Only from VMX for now.

Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index cb306cf..5edae35 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -517,7 +517,7 @@ struct kvm_x86_ops {
void (*queue_exception)(struct kvm_vcpu *vcpu, unsigned nr,
bool has_error_code, u32 error_code);
bool (*exception_injected)(struct kvm_vcpu *vcpu);
-   void (*inject_pending_irq)(struct kvm_vcpu *vcpu);
+   void (*inject_pending_irq)(struct kvm_vcpu *vcpu, struct kvm_run *run);
void (*inject_pending_vectors)(struct kvm_vcpu *vcpu,
   struct kvm_run *run);
int (*interrupt_allowed)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 1903c27..674a249 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2296,7 +2296,7 @@ static int svm_interrupt_allowed(struct kvm_vcpu *vcpu)
(svm->vcpu.arch.hflags & HF_GIF_MASK);
 }
 
-static void svm_intr_assist(struct kvm_vcpu *vcpu)
+static void svm_intr_assist(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
struct vcpu_svm *svm = to_svm(vcpu);
struct vmcb *vmcb = svm->vmcb;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b3292c1..06252f7 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2510,48 +2510,6 @@ static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
return vcpu->arch.interrupt_window_open;
 }
 
-static void do_interrupt_requests(struct kvm_vcpu *vcpu,
-  struct kvm_run *kvm_run)
-{
-   vmx_update_window_states(vcpu);
-
-   if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
-   vmcs_clear_bits(GUEST_INTERRUPTIBILITY_INFO,
-   GUEST_INTR_STATE_STI |
-   GUEST_INTR_STATE_MOV_SS);
-
-   if (vcpu->arch.nmi_pending && !vcpu->arch.nmi_injected) {
-   if (vcpu->arch.interrupt.pending) {
-   enable_nmi_window(vcpu);
-   } else if (vcpu->arch.nmi_window_open) {
-   vcpu->arch.nmi_pending = false;
-   vcpu->arch.nmi_injected = true;
-   } else {
-   enable_nmi_window(vcpu);
-   return;
-   }
-   }
-   if (vcpu->arch.nmi_injected) {
-   vmx_inject_nmi(vcpu);
-   if (vcpu->arch.nmi_pending)
-   enable_nmi_window(vcpu);
-   else if (kvm_cpu_has_interrupt(vcpu) ||
-kvm_run->request_interrupt_window)
-   enable_irq_window(vcpu);
-   return;
-   }
-
-   if (vcpu->arch.interrupt_window_open) {
-   if (kvm_cpu_has_interrupt(vcpu) && !vcpu->arch.interrupt.pending)
-   kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu));
-
-   if (vcpu->arch.interrupt.pending)
-   vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
-   } else if(kvm_cpu_has_interrupt(vcpu) ||
- kvm_run->request_interrupt_window)
-   enable_irq_window(vcpu);
-}
-
 static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr)
 {
int ret;
@@ -3351,8 +3309,11 @@ static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
}
 }
 
-static void vmx_intr_assist(struct kvm_vcpu *vcpu)
+static void vmx_intr_assist(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
+   bool req_int_win = !irqchip_in_kernel(vcpu->kvm) &&
+   kvm_run->request_interrupt_window;
+
update_tpr_threshold(vcpu);
 
vmx_update_window_states(vcpu);
@@ -3373,25 +3334,25 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu)
return;
}
}
+
if (vcpu->arch.nmi_injected) {
vmx_inject_nmi(vcpu);
-   if (vcpu->arch.nmi_pending)
-   enable_nmi_window(vcpu);
-   else if (kvm_cpu_has_interrupt(vcpu))
-   enable_irq_window(vcpu);
-   return;
+   goto out;
}
+
if (!vcpu->arch.interrupt.pending && kvm_cpu_has_interrupt(vcpu)) {
if (vcpu->arch.interrupt_window_open)
kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu));
-   else
-   enable_irq_window(vcpu);
}
-   if (vcpu->arch.interrupt.pending) {
+
+   if (vcpu->arch.interrupt.pending)
vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
-   if (kvm_cpu_has_interrupt(vcpu))
-   enable_irq_window(vcpu);
-   }
+
+out:
+   if (vcpu->arch.nmi_pending)
+   

[COMMIT master] KVM: Remove exception_injected() callback.

2009-04-30 Thread Avi Kivity
From: Gleb Natapov g...@redhat.com

It always returns false for VMX/SVM now.

Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5edae35..ea3741e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -516,7 +516,6 @@ struct kvm_x86_ops {
void (*set_irq)(struct kvm_vcpu *vcpu, int vec);
void (*queue_exception)(struct kvm_vcpu *vcpu, unsigned nr,
bool has_error_code, u32 error_code);
-   bool (*exception_injected)(struct kvm_vcpu *vcpu);
void (*inject_pending_irq)(struct kvm_vcpu *vcpu, struct kvm_run *run);
void (*inject_pending_vectors)(struct kvm_vcpu *vcpu,
   struct kvm_run *run);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 7b6ab16..872787b 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -196,11 +196,6 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
svm->vmcb->control.event_inj_err = error_code;
 }
 
-static bool svm_exception_injected(struct kvm_vcpu *vcpu)
-{
-   return false;
-}
-
 static int is_external_interrupt(u32 info)
 {
info &= SVM_EVTINJ_TYPE_MASK | SVM_EVTINJ_VALID;
@@ -2657,7 +2652,6 @@ static struct kvm_x86_ops svm_x86_ops = {
.get_irq = svm_get_irq,
.set_irq = svm_set_irq,
.queue_exception = svm_queue_exception,
-   .exception_injected = svm_exception_injected,
.inject_pending_irq = svm_intr_assist,
.inject_pending_vectors = svm_intr_assist,
.interrupt_allowed = svm_interrupt_allowed,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9eb518f..3186fcf 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -789,11 +789,6 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);
 }
 
-static bool vmx_exception_injected(struct kvm_vcpu *vcpu)
-{
-   return false;
-}
-
 /*
  * Swap MSR entry in host/guest MSR entry array.
  */
@@ -3697,7 +3692,6 @@ static struct kvm_x86_ops vmx_x86_ops = {
.get_irq = vmx_get_irq,
.set_irq = vmx_inject_irq,
.queue_exception = vmx_queue_exception,
-   .exception_injected = vmx_exception_injected,
.inject_pending_irq = vmx_intr_assist,
.inject_pending_vectors = vmx_intr_assist,
.interrupt_allowed = vmx_interrupt_allowed,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0ecd238..f20e1e4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3233,8 +3233,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
profile_hit(KVM_PROFILING, (void *)rip);
}
 
-   if (vcpu->arch.exception.pending && kvm_x86_ops->exception_injected(vcpu))
-   vcpu->arch.exception.pending = false;
 
kvm_lapic_sync_from_vapic(vcpu);
 
--
To unsubscribe from this list: send the line unsubscribe kvm-commits in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[COMMIT master] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6

2009-04-30 Thread Avi Kivity
From: Avi Kivity a...@redhat.com

Conflicts:
arch/ia64/kvm/kvm-ia64.c
arch/x86/kvm/mmu.c
include/linux/kvm.h

Signed-off-by: Avi Kivity a...@redhat.com


[COMMIT master] KVM: Use kvm_arch_interrupt_allowed() instead of checking interrupt_window_open directly

2009-04-30 Thread Avi Kivity
From: Gleb Natapov g...@redhat.com

kvm_arch_interrupt_allowed() also checks IF so drop the check.

Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1727829..0ecd238 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3067,8 +3067,7 @@ static int dm_request_for_irq_injection(struct kvm_vcpu *vcpu,
 {
return (!irqchip_in_kernel(vcpu->kvm) && !kvm_cpu_has_interrupt(vcpu) &&
kvm_run->request_interrupt_window &&
-   vcpu->arch.interrupt_window_open &&
-   (kvm_x86_ops->get_rflags(vcpu) & X86_EFLAGS_IF));
+   kvm_arch_interrupt_allowed(vcpu));
 }
 
 static void post_kvm_run_save(struct kvm_vcpu *vcpu,
@@ -3081,7 +3080,7 @@ static void post_kvm_run_save(struct kvm_vcpu *vcpu,
kvm_run->ready_for_interrupt_injection = 1;
else
kvm_run->ready_for_interrupt_injection =
-   (vcpu->arch.interrupt_window_open &&
+   (kvm_arch_interrupt_allowed(vcpu) &&
 !kvm_cpu_has_interrupt(vcpu));
 }
 


[COMMIT master] KVM: Remove inject_pending_vectors() callback

2009-04-30 Thread Avi Kivity
From: Gleb Natapov g...@redhat.com

It is the same as inject_pending_irq() for VMX/SVM now.

Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ea3741e..aa5a54e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -517,8 +517,6 @@ struct kvm_x86_ops {
void (*queue_exception)(struct kvm_vcpu *vcpu, unsigned nr,
bool has_error_code, u32 error_code);
void (*inject_pending_irq)(struct kvm_vcpu *vcpu, struct kvm_run *run);
-   void (*inject_pending_vectors)(struct kvm_vcpu *vcpu,
-  struct kvm_run *run);
int (*interrupt_allowed)(struct kvm_vcpu *vcpu);
int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
int (*get_tdp_level)(void);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 872787b..d0e4d98 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2653,7 +2653,6 @@ static struct kvm_x86_ops svm_x86_ops = {
.set_irq = svm_set_irq,
.queue_exception = svm_queue_exception,
.inject_pending_irq = svm_intr_assist,
-   .inject_pending_vectors = svm_intr_assist,
.interrupt_allowed = svm_interrupt_allowed,
 
.set_tss_addr = svm_set_tss_addr,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 3186fcf..9162b4c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3693,7 +3693,6 @@ static struct kvm_x86_ops vmx_x86_ops = {
.set_irq = vmx_inject_irq,
.queue_exception = vmx_queue_exception,
.inject_pending_irq = vmx_intr_assist,
-   .inject_pending_vectors = vmx_intr_assist,
.interrupt_allowed = vmx_interrupt_allowed,
.set_tss_addr = vmx_set_tss_addr,
.get_tdp_level = get_ept_level,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f20e1e4..1770b02 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3167,10 +3167,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
if (vcpu->arch.exception.pending)
__queue_exception(vcpu);
-   else if (irqchip_in_kernel(vcpu->kvm))
-   kvm_x86_ops->inject_pending_irq(vcpu, kvm_run);
else
-   kvm_x86_ops->inject_pending_vectors(vcpu, kvm_run);
+   kvm_x86_ops->inject_pending_irq(vcpu, kvm_run);
 
kvm_lapic_sync_to_vapic(vcpu);
 


[COMMIT master] KVM: SVM: Add NMI injection support

2009-04-30 Thread Avi Kivity
From: Gleb Natapov g...@redhat.com

Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 53533ea..eb140aa 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -763,6 +763,7 @@ enum {
 #define HF_GIF_MASK    (1 << 0)
 #define HF_HIF_MASK    (1 << 1)
 #define HF_VINTR_MASK  (1 << 2)
+#define HF_NMI_MASK    (1 << 3)
 
 /*
  * Hardware virtualization extension instructions may fault if a
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 1b09ef5..50c1db9 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1841,6 +1841,14 @@ static int cpuid_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
return 1;
 }
 
+static int iret_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
+{
+   ++svm->vcpu.stat.nmi_window_exits;
+   svm->vmcb->control.intercept &= ~(1UL << INTERCEPT_IRET);
+   svm->vcpu.arch.hflags &= ~HF_NMI_MASK;
+   return 1;
+}
+
 static int invlpg_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 {
if (emulate_instruction(&svm->vcpu, kvm_run, 0, 0, 0) != EMULATE_DONE)
@@ -2118,6 +2126,7 @@ static int (*svm_exit_handlers[])(struct vcpu_svm *svm,
[SVM_EXIT_VINTR]= interrupt_window_interception,
/* [SVM_EXIT_CR0_SEL_WRITE] = emulate_on_interception, */
[SVM_EXIT_CPUID]= cpuid_interception,
+   [SVM_EXIT_IRET] = iret_interception,
[SVM_EXIT_INVD] = emulate_on_interception,
[SVM_EXIT_HLT]  = halt_interception,
[SVM_EXIT_INVLPG]   = invlpg_interception,
@@ -2225,6 +2234,13 @@ static void pre_svm_run(struct vcpu_svm *svm)
new_asid(svm, svm_data);
 }
 
+static void svm_inject_nmi(struct vcpu_svm *svm)
+{
+   svm->vmcb->control.event_inj = SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI;
+   vcpu->arch.hflags |= HF_NMI_MASK;
+   svm->vmcb->control.intercept |= (1UL << INTERCEPT_IRET);
+   ++vcpu->stat.nmi_injections;
+}
 
 static inline void svm_inject_irq(struct vcpu_svm *svm, int irq)
 {
@@ -2276,6 +2292,14 @@ static void update_cr8_intercept(struct kvm_vcpu *vcpu)
vmcb->control.intercept_cr_write |= INTERCEPT_CR8_MASK;
 }
 
+static int svm_nmi_allowed(struct kvm_vcpu *vcpu)
+{
+   struct vcpu_svm *svm = to_svm(vcpu);
+   struct vmcb *vmcb = svm->vmcb;
+   return !(vmcb->control.int_state & SVM_INTERRUPT_SHADOW_MASK) &&
+   !(svm->vcpu.arch.hflags & HF_NMI_MASK);
+}
+
 static int svm_interrupt_allowed(struct kvm_vcpu *vcpu)
 {
struct vcpu_svm *svm = to_svm(vcpu);
@@ -2291,16 +2315,35 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
svm_inject_irq(to_svm(vcpu), 0x0);
 }
 
+static void enable_nmi_window(struct kvm_vcpu *vcpu)
+{
+   struct vcpu_svm *svm = to_svm(vcpu);
+
+   if (svm->vmcb->control.int_state & SVM_INTERRUPT_SHADOW_MASK)
+   enable_irq_window(vcpu);
+}
+
 static void svm_intr_inject(struct kvm_vcpu *vcpu)
 {
/* try to reinject previous events if any */
+   if (vcpu->arch.nmi_injected) {
+   svm_inject_nmi(to_svm(vcpu));
+   return;
+   }
+
if (vcpu->arch.interrupt.pending) {
svm_queue_irq(to_svm(vcpu), vcpu->arch.interrupt.nr);
return;
}
 
/* try to inject new event if pending */
-   if (kvm_cpu_has_interrupt(vcpu)) {
+   if (vcpu->arch.nmi_pending) {
+   if (svm_nmi_allowed(vcpu)) {
+   vcpu->arch.nmi_pending = false;
+   vcpu->arch.nmi_injected = true;
+   svm_inject_nmi(vcpu);
+   }
+   } else if (kvm_cpu_has_interrupt(vcpu)) {
if (svm_interrupt_allowed(vcpu)) {
kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu));
svm_queue_irq(to_svm(vcpu), vcpu->arch.interrupt.nr);
@@ -2319,7 +2362,10 @@ static void svm_intr_assist(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
svm_intr_inject(vcpu);
 
-   if (kvm_cpu_has_interrupt(vcpu) || req_int_win)
+   /* enable NMI/IRQ window open exits if needed */
+   if (vcpu->arch.nmi_pending)
+   enable_nmi_window(vcpu);
+   else if (kvm_cpu_has_interrupt(vcpu) || req_int_win)
enable_irq_window(vcpu);
 
 out:


[COMMIT master] KVM: Do not report TPR write to userspace if new value bigger or equal to a previous one.

2009-04-30 Thread Avi Kivity
From: Gleb Natapov g...@redhat.com

Saves many exits to userspace when the IRQ chip is in userspace.
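
A minimal sketch of the filtering rule, separate from the patch itself: a CR8 (TPR) write is reported to userspace only when the IRQ chip lives in userspace and the new value is lower than the previous one, presumably because only a lowered TPR can make a previously masked interrupt deliverable. The helper name `tpr_write_needs_userspace_exit` and its parameters are hypothetical and do not appear in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): decide whether a guest
 * CR8/TPR write has to be reported to userspace.
 */
static bool tpr_write_needs_userspace_exit(bool irqchip_in_kernel,
                                           uint8_t cr8_prev, uint8_t cr8_new)
{
    /* With an in-kernel IRQ chip the write never goes to userspace. */
    if (irqchip_in_kernel)
        return false;
    /* New value bigger than or equal to the previous one: skip the exit. */
    if (cr8_prev <= cr8_new)
        return false;
    /* The TPR was lowered, so userspace gets a KVM_EXIT_SET_TPR-style exit. */
    return true;
}
```

Both the SVM and VMX hunks below apply the same comparison; they differ only in how the new CR8 value is obtained.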

Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 52c99a8..a85a145 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1860,9 +1860,13 @@ static int emulate_on_interception(struct vcpu_svm *svm,
 
 static int cr8_write_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 {
+   u8 cr8_prev = kvm_get_cr8(&svm->vcpu);
+   /* instruction emulation calls kvm_set_cr8() */
emulate_instruction(&svm->vcpu, NULL, 0, 0, 0);
if (irqchip_in_kernel(svm->vcpu.kvm))
return 1;
+   if (cr8_prev <= kvm_get_cr8(&svm->vcpu))
+   return 1;
kvm_run->exit_reason = KVM_EXIT_SET_TPR;
return 0;
 }
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9162b4c..51f804c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2724,13 +2724,18 @@ static int handle_cr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
kvm_set_cr4(vcpu, kvm_register_read(vcpu, reg));
skip_emulated_instruction(vcpu);
return 1;
-   case 8:
-   kvm_set_cr8(vcpu, kvm_register_read(vcpu, reg));
-   skip_emulated_instruction(vcpu);
-   if (irqchip_in_kernel(vcpu->kvm))
-   return 1;
-   kvm_run->exit_reason = KVM_EXIT_SET_TPR;
-   return 0;
+   case 8: {
+   u8 cr8_prev = kvm_get_cr8(vcpu);
+   u8 cr8 = kvm_register_read(vcpu, reg);
+   kvm_set_cr8(vcpu, cr8);
+   skip_emulated_instruction(vcpu);
+   if (irqchip_in_kernel(vcpu->kvm))
+   return 1;
+   if (cr8_prev <= cr8)
+   return 1;
+   kvm_run->exit_reason = KVM_EXIT_SET_TPR;
+   return 0;
+   }
};
break;
case 2: /* clts */


[COMMIT master] KVM: Enable snooping control for supported hardware

2009-04-30 Thread Avi Kivity
From: Sheng Yang sh...@linux.intel.com

Memory aliases with different memory types are a problem for the guest. For a
guest without an assigned device, the memory type of guest memory is always the
same as the host's (WB); but with an assigned device, some part of memory may be
used for DMA and then set to an uncacheable memory type (UC/WC), which conflicts
with the host memory type and is a potential issue.

Snooping control can guarantee the cache correctness of memory that goes through
the DMA engine of VT-d.
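
The resulting EPT memory-type policy can be sketched as a small decision function. This is only an illustration of the three cases described in the comment added to vmx_get_mt_mask() below; the enum, the struct and the name `pick_ept_memory_type` are hypothetical and not taken from the patch.

```c
#include <stdbool.h>

/* Illustrative memory types; these names are local to the sketch. */
enum mem_type { MT_UC, MT_WB, MT_GUEST };

struct ept_mt {
    enum mem_type type; /* memory type to program into the EPT entry */
    bool igmt;          /* stands in for VMX_EPT_IGMT_BIT in the patch */
};

static struct ept_mt pick_ept_memory_type(bool is_mmio,
                                          bool has_assigned_device,
                                          bool snoop_control)
{
    struct ept_mt r;

    if (is_mmio) {
        /* 1. MMIO is always mapped uncacheable. */
        r.type = MT_UC;
        r.igmt = false;
    } else if (has_assigned_device && !snoop_control) {
        /* 2a. VT-d without snooping control: trust the guest's type. */
        r.type = MT_GUEST;
        r.igmt = false;
    } else {
        /* 2b and 3. Snooping control, or no VT-d at all: force WB. */
        r.type = MT_WB;
        r.igmt = true;
    }
    return r;
}
```

In the patch itself the "has assigned device" condition is the presence of kvm->arch.iommu_domain, and snooping control is recorded in the KVM_IOMMU_CACHE_COHERENCY flag.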

Signed-off-by: Sheng Yang sh...@linux.intel.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8a6f6b6..8e680c3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -393,6 +393,8 @@ struct kvm_arch{
struct list_head active_mmu_pages;
struct list_head assigned_dev_head;
struct iommu_domain *iommu_domain;
+#define KVM_IOMMU_CACHE_COHERENCY  0x1
+   int iommu_flags;
struct kvm_pic *vpic;
struct kvm_ioapic *vioapic;
struct kvm_pit *vpit;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 59b080c..e8a5649 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3581,11 +3581,26 @@ static u64 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
 {
u64 ret;
 
+   /* For VT-d and EPT combination
+* 1. MMIO: always map as UC
+* 2. EPT with VT-d:
+*   a. VT-d without snooping control feature: can't guarantee the
+*  result, try to trust guest.
+*   b. VT-d with snooping control feature: snooping control feature of
+*  VT-d engine can guarantee the cache correctness. Just set it
+*  to WB to keep consistent with host. So the same as item 3.
+* 3. EPT without VT-d: always map as WB and set IGMT=1 to keep
+*consistent with host MTRR
+*/
if (is_mmio)
ret = MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT;
+   else if (vcpu->kvm->arch.iommu_domain &&
+   !(vcpu->kvm->arch.iommu_flags & KVM_IOMMU_CACHE_COHERENCY))
+   ret = kvm_get_guest_memory_type(vcpu, gfn) <<
+ VMX_EPT_MT_EPTE_SHIFT;
else
-   ret = (kvm_get_guest_memory_type(vcpu, gfn) <<
-   VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IGMT_BIT;
+   ret = (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT)
+   | VMX_EPT_IGMT_BIT;
 
return ret;
 }
diff --git a/virt/kvm/iommu.c b/virt/kvm/iommu.c
index 4c40375..1514758 100644
--- a/virt/kvm/iommu.c
+++ b/virt/kvm/iommu.c
@@ -39,11 +39,16 @@ int kvm_iommu_map_pages(struct kvm *kvm,
pfn_t pfn;
int i, r = 0;
struct iommu_domain *domain = kvm->arch.iommu_domain;
+   int flags;
 
/* check if iommu exists and in use */
if (!domain)
return 0;
 
+   flags = IOMMU_READ | IOMMU_WRITE;
+   if (kvm->arch.iommu_flags & KVM_IOMMU_CACHE_COHERENCY)
+   flags |= IOMMU_CACHE;
+
for (i = 0; i < npages; i++) {
/* check if already mapped */
if (iommu_iova_to_phys(domain, gfn_to_gpa(gfn)))
@@ -53,8 +58,7 @@ int kvm_iommu_map_pages(struct kvm *kvm,
r = iommu_map_range(domain,
gfn_to_gpa(gfn),
pfn_to_hpa(pfn),
-   PAGE_SIZE,
-   IOMMU_READ | IOMMU_WRITE);
+   PAGE_SIZE, flags);
if (r) {
printk(KERN_ERR "kvm_iommu_map_address:"
   "iommu failed to map pfn=%lx\n", pfn);
@@ -88,7 +92,7 @@ int kvm_assign_device(struct kvm *kvm,
 {
struct pci_dev *pdev = NULL;
struct iommu_domain *domain = kvm->arch.iommu_domain;
-   int r;
+   int r, last_flags;
 
/* check if iommu exists and in use */
if (!domain)
@@ -107,12 +111,29 @@ int kvm_assign_device(struct kvm *kvm,
return r;
}
 
+   last_flags = kvm->arch.iommu_flags;
+   if (iommu_domain_has_cap(kvm->arch.iommu_domain,
+IOMMU_CAP_CACHE_COHERENCY))
+   kvm->arch.iommu_flags |= KVM_IOMMU_CACHE_COHERENCY;
+
+   /* Check if need to update IOMMU page table for guest memory */
+   if ((last_flags ^ kvm->arch.iommu_flags) ==
+   KVM_IOMMU_CACHE_COHERENCY) {
+   kvm_iommu_unmap_memslots(kvm);
+   r = kvm_iommu_map_memslots(kvm);
+   if (r)
+   goto out_unmap;
+   }
+
printk(KERN_DEBUG "assign device: host bdf = %x:%x:%x\n",
assigned_dev->host_busnr,
PCI_SLOT(assigned_dev->host_devfn),
PCI_FUNC(assigned_dev->host_devfn));
 
return 0;
+out_unmap:
+   kvm_iommu_unmap_memslots(kvm);
+   

Implement generic double fault generation mechanism

2009-04-30 Thread Dong, Eddie


Move the double-fault generation logic out of the page fault
exception generating function to cover the more generic case.
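
As a standalone illustration of the rule the patch implements: exceptions are sorted into benign, contributory and page-fault classes, and a second exception raised while the first is still pending escalates to a double fault only for certain class pairs. The `exception_class()` logic mirrors the function added below; `escalates_to_double_fault` is a hypothetical name used only in this sketch.

```c
#include <stdbool.h>

enum excpt_class { EXCPT_BENIGN, EXCPT_CONTRIBUTORY, EXCPT_PF };

/* Classify an x86 exception vector, as exception_class() in the patch does. */
static enum excpt_class exception_class(unsigned vector)
{
    if (vector == 14)                                  /* #PF */
        return EXCPT_PF;
    if (vector == 0 || (vector >= 10 && vector <= 13)) /* #DE, #TS, #NP, #SS, #GP */
        return EXCPT_CONTRIBUTORY;
    return EXCPT_BENIGN;
}

/*
 * Hypothetical helper: true when `second`, raised while `first` is still
 * pending, has to be turned into a double fault (#DF).
 */
static bool escalates_to_double_fault(unsigned first, unsigned second)
{
    enum excpt_class c1 = exception_class(first);
    enum excpt_class c2 = exception_class(second);

    return (c1 == EXCPT_CONTRIBUTORY && c2 == EXCPT_CONTRIBUTORY) ||
           (c1 == EXCPT_PF && c2 != EXCPT_BENIGN);
}
```

For example, a #GP (13) during a pending #PF (14) escalates, while a #PF during a pending #GP does not.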

Signed-off-by: Eddie Dong eddie.d...@intel.com

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ab1fdac..51a8dad 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -162,12 +162,59 @@ void kvm_set_apic_base(struct kvm_vcpu *vcpu, u64 data)
 }
 EXPORT_SYMBOL_GPL(kvm_set_apic_base);
 
+#define EXCPT_BENIGN   0
+#define EXCPT_CONTRIBUTORY 1
+#define EXCPT_PF   2
+
+static int exception_class(int vector)
+{
+   if (vector == 14)
+   return EXCPT_PF;
+   else if (vector == 0 || (vector >= 10 && vector <= 13))
+   return EXCPT_CONTRIBUTORY;
+   else
+   return EXCPT_BENIGN;
+}
+
+static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
+   unsigned nr, bool has_error, u32 error_code)
+{
+   u32 prev_nr;
+   int class1, class2;
+
+   if (!vcpu->arch.exception.pending) {
+   vcpu->arch.exception.pending = true;
+   vcpu->arch.exception.has_error_code = has_error;
+   vcpu->arch.exception.nr = nr;
+   vcpu->arch.exception.error_code = error_code;
+   return;
+   }
+
+   /* to check exception */
+   prev_nr = vcpu->arch.exception.nr;
+   class2 = exception_class(nr);
+   class1 = exception_class(prev_nr);
+   if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY)
+   || (class1 == EXCPT_PF && class2 != EXCPT_BENIGN)) {
+   /* generate double fault per SDM Table 5-5 */
+   printk(KERN_DEBUG "kvm: double fault 0x%x on 0x%x\n",
+   prev_nr, nr);
+   vcpu->arch.exception.pending = true;
+   vcpu->arch.exception.has_error_code = 1;
+   vcpu->arch.exception.nr = DF_VECTOR;
+   vcpu->arch.exception.error_code = 0;
+   if (prev_nr == DF_VECTOR) {
+   /* triple fault - shutdown */
+   set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests);
+   }
+   } else
+   printk(KERN_ERR "Exception 0x%x on 0x%x happens serially\n",
+   prev_nr, nr);
+}
+
 void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr)
 {
-   WARN_ON(vcpu->arch.exception.pending);
-   vcpu->arch.exception.pending = true;
-   vcpu->arch.exception.has_error_code = false;
-   vcpu->arch.exception.nr = nr;
+   kvm_multiple_exception(vcpu, nr, false, 0);
 }
 EXPORT_SYMBOL_GPL(kvm_queue_exception);
 
@@ -176,18 +223,6 @@ void kvm_inject_page_fault(struct kvm_vcpu *vcpu, unsigned long addr,
 {
++vcpu->stat.pf_guest;
 
-   if (vcpu->arch.exception.pending) {
-   if (vcpu->arch.exception.nr == PF_VECTOR) {
-   printk(KERN_DEBUG "kvm: inject_page_fault:"
-" double fault 0x%lx\n", addr);
-   vcpu->arch.exception.nr = DF_VECTOR;
-   vcpu->arch.exception.error_code = 0;
-   } else if (vcpu->arch.exception.nr == DF_VECTOR) {
-   /* triple fault - shutdown */
-   set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests);
-   }
-   return;
-   }
vcpu->arch.cr2 = addr;
kvm_queue_exception_e(vcpu, PF_VECTOR, error_code);
 }
@@ -200,11 +235,7 @@ EXPORT_SYMBOL_GPL(kvm_inject_nmi);
 
 void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code)
 {
-   WARN_ON(vcpu->arch.exception.pending);
-   vcpu->arch.exception.pending = true;
-   vcpu->arch.exception.has_error_code = true;
-   vcpu->arch.exception.nr = nr;
-   vcpu->arch.exception.error_code = error_code;
+   kvm_multiple_exception(vcpu, nr, true, error_code);
 }
 EXPORT_SYMBOL_GPL(kvm_queue_exception_e);
 

irq3.patch
Description: irq3.patch


[ kvm-Bugs-2638990 ] Segfault 284

2009-04-30 Thread SourceForge.net
Bugs item #2638990, was opened at 2009-02-25 23:35
Message generated for change (Comment added) made by ivanvimes
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2638990&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 6
Private: No
Submitted By: David Rasche (drasche2)
Assigned to: Nobody/Anonymous (nobody)
Summary: Segfault 284

Initial Comment:
Host
(2) Intel Xeon (E5430) Quad Core Processors (2.66GHz)
16G mem
Host OS: Ubuntu 8.10 (64bit)
kvm-72
libvirt 0.4.4

Guest OS Win2k3 Server (32 bit)

After running for 8 to 48 hours, Win2k3 guest system crashes with no warning. 
Syslog shows the following segmentation fault:

Feb 25 16:12:02 host-b kernel: [448190.415857] kvm[25511]: segfault at 284 ip 0043386f sp 7fff97fa3a70 error 4 in kvm[40+19e000]

This error has been confirmed on 2 different machines with exactly the same setup.

We are running KVM through libvirt with the following xml setup.

<domain type='kvm'>
  <name>exchange</name>
  <uuid>e8d93082-c1db-426c-9ad3-ae651095ceb5</uuid>
  <memory>4096000</memory>
  <currentMemory>4096000</currentMemory>
  <vcpu>3</vcpu>
  <os>
    <type>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
  </features>
  <clock offset='localtime'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <disk type='file' device='disk'>
      <source file='/mnt/vg0/lvol3/exchange.qcow2'/>
      <target dev='hda' bus='ide'/>
    </disk>
    <disk type='block' device='disk'>
      <source dev='/dev/vg1/lv_exchdb'/>
      <target dev='hdb' bus='ide'/>
    </disk>
    <disk type='file' device='cdrom'>
      <target dev='hdc' bus='ide'/>
      <readonly/>
    </disk>
    <disk type='block' device='disk'>
      <source dev='/dev/vg2/lv_exchlog'/>
      <target dev='hdd' bus='ide'/>
    </disk>
    <interface type='bridge'>
      <mac address='00:0c:29:cf:71:e4'/>
      <source bridge='br0'/>
    </interface>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='5900' listen='127.0.0.1'/>
  </devices>
</domain>



--

Comment By: Simon Jagoe (ivanvimes)
Date: 2009-04-30 09:09

Message:
I am running an Ubuntu Hardy guest in KVM-72 (Ubuntu Intrepid host), and
got a similar segfault:

Apr 30 04:16:37 hare kernel: [726803.676199] kvm[4930]: segfault at 284 ip
0043386f sp 7fff1d240dd0 error 4 in kvm[40+19e000]

My hardware details are as follows:

HP ProLiant ML110 G5
Intel Xeon CPU 3065  2.33GHz (Dual core)
4GB RAM

I have four guests on the system, all Ubuntu Hardy. Only one of these
appears to have crashed. It is allocated one VCPU and 1024 MB of RAM.

The others are:
  * 2 VCPUs  1024MB RAM
  * 2 VCPUs  256MB RAM
  * 1 VCPU  256MB RAM

Additionally, I am also using libvirt. My (crashed) domain's XML looks
like this:

<domain type='kvm'>
  <name>partridge</name>
  <uuid>4f3bae26-359b-f633-9476-9d95fc2160b0</uuid>
  <memory>1048576</memory>
  <currentMemory>1048576</currentMemory>
  <vcpu>1</vcpu>
  <os>
    <type>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <disk type='block' device='disk'>
      <source dev='/dev/hare/partridge_root'/>
      <target dev='hda' bus='ide'/>
    </disk>
    <disk type='block' device='disk'>
      <source dev='/dev/hare/partridge_home'/>
      <target dev='hdd' bus='ide'/>
    </disk>
    <disk type='block' device='disk'>
      <source dev='/dev/hare/partridge_opt'/>
      <target dev='hdc' bus='ide'/>
    </disk>
    <disk type='block' device='disk'>
      <source dev='/dev/hare/partridge_var'/>
      <target dev='hdb' bus='ide'/>
    </disk>
    <interface type='bridge'>
      <mac address='00:16:3e:30:99:7c'/>
      <source bridge='br0'/>
    </interface>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='5900' listen='127.0.0.1'/>
  </devices>
</domain>

Please let me know if I can provide more information on this. I will
likely upgrade to Ubuntu Jaunty this week (and with it KVM-84).

--

Comment By: David Rasche (drasche2)
Date: 2009-03-09 17:07

Message:
I just updated one of our machines with KVM-84 and I am getting the exact
same segfault as reported above. This is critical. We are trying to run a
win2k3 server with exchange and it keeps crashing. 

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2638990&group_id=180599
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info 

Re: Missing symlink in qemu-kvm.git?

2009-04-30 Thread Avi Kivity

walt wrote:

When building on x86 I get this error:

make[2]: Entering directory `/home/wa1ter/src/qemu-kvm/kvm/libkvm'
make[2]: *** No rule to make target
`/home/wa1ter/src/qemu-kvm/kvm/kernel/include/asm/kvm.h', needed by `libkvm.o'.

I fixed it by adding the same symlink that I add to Linus's kernel.git for
exactly the same reason:

#cd qemu-kvm/kvm/kernel/include
#ln -s ../arch/x86/include/asm asm  [there was no asm directory here]

Am I the only one who has this problem?

  


This is already fixed. Pull again and retry.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Boot problems with qemu-kvm

2009-04-30 Thread Avi Kivity

Xu, Jiajun wrote:

Yes. If booting guest with -no-kvm, X display can work well. And I am using 
bridge network, so still can not get network up. :(
And qemu cpu utilization is still ~100%.

  


The last merge with qemu.git broke both vga and networking. Mark fixed 
networking, we're still looking at vga.


--
error compiling committee.c: too many arguments to function



[RFC 1/2] Add MCE simulation support to qemu/tcg

2009-04-30 Thread Huang Ying
- MCE features are initialized when the VCPU is initialized, according to CPUID.
- A monitor command, mce, is added to inject an MCE.
- A new interrupt mask, CPU_INTERRUPT_MCE, is added to inject the MCE.
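
One detail worth spelling out is how the per-bank MSRs map onto the flat mce_banks array: each bank owns four consecutive MSRs (CTL, STATUS, ADDR, MISC) starting at MSR_MC0_CTL, and the bank count sits in the low 8 bits of mcg_cap. The sketch below shows that range check with the bank count parenthesized explicitly. The helper name `mce_msr_to_offset` is hypothetical and is not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_MC0_CTL 0x400 /* first per-bank MCE MSR, as defined in the patch */

/*
 * Hypothetical helper: translate an MSR index into an offset into a flat
 * array holding four registers per bank. Returns false when the MSR lies
 * outside the banks advertised in mcg_cap.
 */
static bool mce_msr_to_offset(uint32_t msr, uint64_t mcg_cap, uint32_t *offset)
{
    uint32_t bank_num = (uint32_t)(mcg_cap & 0xff); /* bank count: low 8 bits */

    if (msr < MSR_MC0_CTL || msr >= MSR_MC0_CTL + 4 * bank_num)
        return false;
    *offset = msr - MSR_MC0_CTL;
    return true;
}
```

With the default capability from this patch (0x100 plus 4 banks), MSRs 0x400 through 0x40f are accepted and 0x410 is rejected.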

Signed-off-by: Huang Ying ying.hu...@intel.com

---
 cpu-all.h   |4 ++
 cpu-exec.c  |4 ++
 monitor.c   |   49 +
 target-i386/cpu.h   |   22 +++
 target-i386/helper.c|   70 
 target-i386/op_helper.c |   34 +++
 6 files changed, 183 insertions(+)

--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -202,6 +202,7 @@
 #define CR4_DE_MASK   (1 << 3)
 #define CR4_PSE_MASK  (1 << 4)
 #define CR4_PAE_MASK  (1 << 5)
+#define CR4_MCE_MASK  (1 << 6)
 #define CR4_PGE_MASK  (1 << 7)
 #define CR4_PCE_MASK  (1 << 8)
 #define CR4_OSFXSR_SHIFT 9
@@ -248,6 +249,17 @@
 #define PG_ERROR_RSVD_MASK 0x08
 #define PG_ERROR_I_D_MASK  0x10
 
+#define MCE_CAP_DEF    0x100
+#define MCE_BANKS_DEF  4
+
+#define MCG_CTL_P  (1UL<<8)
+
+#define MCG_STATUS_MCIP    (1UL<<2)
+
+#define MCI_STATUS_VAL (1UL<<63)
+#define MCI_STATUS_OVER    (1UL<<62)
+#define MCI_STATUS_UC  (1UL<<61)
+
 #define MSR_IA32_TSC    0x10
 #define MSR_IA32_APICBASE   0x1b
 #define MSR_IA32_APICBASE_BSP   (1<<8)
@@ -288,6 +300,11 @@
 
 #define MSR_MTRRdefType 0x2ff
 
+#define MSR_MC0_CTL    0x400
+#define MSR_MC0_STATUS 0x401
+#define MSR_MC0_ADDR   0x402
+#define MSR_MC0_MISC   0x403
+
 #define MSR_EFER    0xc0000080
 
 #define MSR_EFER_SCE   (1 << 0)
@@ -673,6 +690,11 @@ typedef struct CPUX86State {
 /* in order to simplify APIC support, we leave this pointer to the
user */
 struct APICState *apic_state;
+
+uint64 mcg_cap;
+uint64 mcg_status;
+uint64 mcg_ctl;
+uint64 *mce_banks;
 } CPUX86State;
 
 CPUX86State *cpu_x86_init(const char *cpu_model);
--- a/target-i386/op_helper.c
+++ b/target-i386/op_helper.c
@@ -3133,7 +3133,23 @@ void helper_wrmsr(void)
 case MSR_MTRRdefType:
 env->mtrr_deftype = val;
 break;
+case MSR_MCG_STATUS:
+env->mcg_status = val;
+break;
+case MSR_MCG_CTL:
+if ((env->mcg_cap & MCG_CTL_P)
+&& (val == 0 || val == ~(uint64_t)0))
+env->mcg_ctl = val;
+break;
 default:
+if ((uint32_t)ECX >= MSR_MC0_CTL
+&& (uint32_t)ECX < MSR_MC0_CTL + (4 * env->mcg_cap & 0xff)) {
+uint32_t offset = (uint32_t)ECX - MSR_MC0_CTL;
+if ((offset & 0x3) != 0
+|| (val == 0 || val == ~(uint64_t)0))
+env->mce_banks[offset] = val;
+break;
+}
 /* XXX: exception ? */
 break;
 }
@@ -3252,7 +3268,25 @@ void helper_rdmsr(void)
 /* XXX: exception ? */
 val = 0;
 break;
+case MSR_MCG_CAP:
+val = env->mcg_cap;
+break;
+case MSR_MCG_CTL:
+if (env->mcg_cap & MCG_CTL_P)
+val = env->mcg_ctl;
+else
+val = 0;
+break;
+case MSR_MCG_STATUS:
+val = env->mcg_status;
+break;
 default:
+if ((uint32_t)ECX >= MSR_MC0_CTL
+&& (uint32_t)ECX < MSR_MC0_CTL + (4 * env->mcg_cap & 0xff)) {
+uint32_t offset = (uint32_t)ECX - MSR_MC0_CTL;
+val = env->mce_banks[offset];
+break;
+}
 /* XXX: exception ? */
 val = 0;
 break;
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -1430,6 +1430,75 @@ static void breakpoint_handler(CPUState 
 }
 #endif /* !CONFIG_USER_ONLY */
 
+/* This should come from sysemu.h - if we could include it here... */
+void qemu_system_reset_request(void);
+
+void cpu_inject_x86_mce(CPUState *cenv, int bank, uint64_t status,
+uint64_t mcg_status, uint64_t addr, uint64_t misc)
+{
+uint64_t mcg_cap = cenv->mcg_cap;
+unsigned bank_num = mcg_cap & 0xff;
+uint64_t *banks = cenv->mce_banks;
+
+if (bank >= bank_num || !(status & MCI_STATUS_VAL))
+return;
+
+/*
+ * if MSR_MCG_CTL is not all 1s, the uncorrected error
+ * reporting is disabled
+ */
+if ((status & MCI_STATUS_UC) && (mcg_cap & MCG_CTL_P) &&
+cenv->mcg_ctl != ~(uint64_t)0)
+return;
+banks += 4 * bank;
+/*
+ * if MSR_MCi_CTL is not all 1s, the uncorrected error
+ * reporting is disabled for the bank
+ */
+if ((status & MCI_STATUS_UC) && banks[0] != ~(uint64_t)0)
+return;
+if (status & MCI_STATUS_UC) {
+if ((cenv->mcg_status & MCG_STATUS_MCIP) ||
+!(cenv->cr[4] & CR4_MCE_MASK)) {
+fprintf(stderr, "injects mce exception while previous "
+"one is in progress!\n");
+qemu_log_mask(CPU_LOG_RESET, "Triple fault\n");
+qemu_system_reset_request();
+return;
+  

[RFC 2/2] Add MCE simulation support to qemu/kvm

2009-04-30 Thread Huang Ying
MCE features are detected, initialized and injected via the
corresponding KVM ioctl.
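
A rough sketch of how the guest's MCG_CAP value is assembled during setup, following kvm_arch_setup_mce() below: the bank count reported by the kernel is clamped to MCE_BANKS_DEF and combined with MCE_CAP_DEF. The helper name `mce_build_mcg_cap` is hypothetical, and returning 0 to mean "leave MCE disabled" is a convention of this sketch only.

```c
#include <stdint.h>

#define MCE_CAP_DEF   0x100 /* values defined in patch 1/2 */
#define MCE_BANKS_DEF 4

/*
 * Hypothetical helper: build the MCG_CAP value to hand to the kernel.
 * kernel_banks is the result of the KVM_CAP_MCE extension check; a
 * non-positive value means MCE is unavailable.
 */
static uint64_t mce_build_mcg_cap(int kernel_banks)
{
    int banks = kernel_banks;

    if (banks <= 0)
        return 0;
    if (banks > MCE_BANKS_DEF)
        banks = MCE_BANKS_DEF;
    return MCE_CAP_DEF | (uint64_t)banks;
}
```

The low byte of the result is the bank count, which is the same field the TCG code in patch 1/2 reads with `mcg_cap & 0xff`.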

Signed-off-by: Huang Ying ying.hu...@intel.com

---
 kvm-all.c|   24 ++
 kvm.h|4 +++
 target-i386/helper.c |8 +-
 target-i386/kvm.c|   67 ++-
 4 files changed, 101 insertions(+), 2 deletions(-)

--- a/kvm-all.c
+++ b/kvm-all.c
@@ -765,6 +765,30 @@ int kvm_has_sync_mmu(void)
 return 0;
 }
 
+int kvm_has_mce(void)
+{
+#ifdef KVM_CAP_MCE
+KVMState *s = kvm_state;
+int r;
+
+r = kvm_ioctl(s, KVM_CHECK_EXTENSION, KVM_CAP_MCE);
+if (r > 0)
+return r;
+#endif
+return 0;
+}
+
+int kvm_get_mce_cap_supported(uint64_t *mce_cap)
+{
+#ifdef KVM_CAP_MCE
+KVMState *s = kvm_state;
+
+return kvm_ioctl(s, KVM_X86_GET_MCE_CAP_SUPPORTED, mce_cap);
+#else
+return -ENOSYS;
+#endif
+}
+
 #ifdef KVM_CAP_SET_GUEST_DEBUG
 struct kvm_sw_breakpoint *kvm_find_sw_breakpoint(CPUState *env,
  target_ulong pc)
--- a/kvm.h
+++ b/kvm.h
@@ -47,6 +47,10 @@ int kvm_log_start(target_phys_addr_t phy
 int kvm_log_stop(target_phys_addr_t phys_addr, ram_addr_t size);
 
 int kvm_has_sync_mmu(void);
+int kvm_has_mce(void);
+int kvm_get_mce_cap_supported(uint64_t *mce_cap);
+void kvm_inject_x86_mce(CPUState *cenv, int bank, uint64_t status,
+uint64_t mcg_status, uint64_t addr, uint64_t misc);
 
 int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
 int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -1440,6 +1440,11 @@ void cpu_inject_x86_mce(CPUState *cenv, 
 unsigned bank_num = mcg_cap & 0xff;
 uint64_t *banks = cenv->mce_banks;
 
+if (kvm_enabled()) {
+kvm_inject_x86_mce(cenv, bank, status, mcg_status, addr, misc);
+return;
+}
+
 if (bank >= bank_num || !(status & MCI_STATUS_VAL))
 return;
 
@@ -1757,7 +1762,8 @@ CPUX86State *cpu_x86_init(const char *cp
 cpu_x86_close(env);
 return NULL;
 }
-mce_init(env);
+if (!kvm_enabled())
+mce_init(env);
 cpu_reset(env);
 #ifdef CONFIG_KQEMU
 kqemu_init(env);
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -34,6 +34,42 @@
 do { } while (0)
 #endif
 
+static void kvm_arch_setup_mce(CPUState *env)
+{
+int banks;
+int ret;
+uint64_t mcg_cap;
+
+#ifdef KVM_CAP_MCE
+if (((env->cpuid_version >> 8) & 0xf) < 6)
+return;
+
+if ((env->cpuid_features & (CPUID_MCE|CPUID_MCA)) != (CPUID_MCE|CPUID_MCA))
+return;
+
+banks = kvm_has_mce();
+if (banks <= 0)
+return;
+
+ret = kvm_get_mce_cap_supported(&mcg_cap);
+if (ret) {
+fprintf(stderr, "kvm_get_mce_cap_supported FAILED\n");
+return;
+}
+
+if (banks > MCE_BANKS_DEF)
+banks = MCE_BANKS_DEF;
+mcg_cap = MCE_CAP_DEF;
+mcg_cap |= banks;
+
+if (kvm_vcpu_ioctl(env, KVM_X86_SETUP_MCE, &mcg_cap)) {
+fprintf(stderr, "kvm: setup mce FAILED\n");
+return;
+}
+env->mcg_cap = mcg_cap;
+#endif
+}
+
 int kvm_arch_init_vcpu(CPUState *env)
 {
 struct {
@@ -42,6 +78,7 @@ int kvm_arch_init_vcpu(CPUState *env)
 } __attribute__((packed)) cpuid_data;
 uint32_t limit, i, j, cpuid_i;
 uint32_t unused;
+int ret;
 
 cpuid_i = 0;
 
@@ -107,7 +144,13 @@ int kvm_arch_init_vcpu(CPUState *env)
 
 cpuid_data.cpuid.nent = cpuid_i;
 
-return kvm_vcpu_ioctl(env, KVM_SET_CPUID2, &cpuid_data);
+ret = kvm_vcpu_ioctl(env, KVM_SET_CPUID2, &cpuid_data);
+if (ret < 0)
+return ret;
+
+kvm_arch_setup_mce(env);
+
+return 0;
 }
 
 static int kvm_has_msr_star(CPUState *env)
@@ -665,6 +708,28 @@ int kvm_arch_handle_exit(CPUState *env, 
 return ret;
 }
 
+void kvm_inject_x86_mce(CPUState *cenv, int bank, uint64_t status,
+uint64_t mcg_status, uint64_t addr, uint64_t misc)
+{
+#ifdef KVM_CAP_MCE
+struct kvm_x86_mce mce = {
+.bank = bank,
+.status = status,
+.mcg_status = mcg_status,
+.addr = addr,
+.misc = misc,
+};
+int ret;
+
+if (kvm_has_mce() <= 0)
+return;
+
+ret = kvm_vcpu_ioctl(cenv, KVM_X86_SET_MCE, &mce);
+if (ret < 0)
+fprintf(stderr, "kvm: inject mce FAILED\n");
+#endif
+}
+
 #ifdef KVM_CAP_SET_GUEST_DEBUG
 int kvm_arch_insert_sw_breakpoint(CPUState *env, struct kvm_sw_breakpoint *bp)
 {





Re: KVM performance vs. Xen

2009-04-30 Thread Avi Kivity

Andrew Theurer wrote:

I wanted to share some performance data for KVM and Xen.  I thought it
would be interesting to share some performance results especially
compared to Xen, using a more complex situation like heterogeneous
server consolidation.

The Workload:
The workload is one that simulates a consolidation of servers on to a
single host.  There are 3 server types: web, imap, and app (j2ee).  In
addition, there are other helper servers which are also consolidated:
a db server, which helps out with the app server, and an nfs server,
which helps out with the web server (a portion of the docroot is nfs
mounted).  There is also one other server that is simply idle.  All 6
servers make up one set.  The first 3 server types are sent requests,
which in turn may send requests to the db and nfs helper servers.  The
request rate is throttled to produce a fixed amount of work.  In order
to increase utilization on the host, more sets of these servers are
used.  The clients which send requests also have a response time
requirement which is monitored.  The following results have passed the
response time requirements.



What's the typical I/O load (disk and network bandwidth) while the tests 
are running?



The host hardware:
A 2 socket, 8 core Nehalem with SMT, and EPT enabled, lots of disks, 4 x
1 Gb Ethernet


CPU time measurements with SMT can vary wildly if the system is not 
fully loaded.  If the scheduler happens to schedule two threads on a 
single core, both of these threads will generate less work compared to 
if they were scheduled on different cores.




Test Results:
The throughput is equal in these tests, as the clients throttle the work
(this is assuming you don't run out of a resource on the host).  What's
telling is the CPU used to do the same amount of work:

Xen:  52.85%
KVM:  66.93%

So, KVM requires 66.93/52.85 = 26.6% more CPU to do the same amount of
work. Here's the breakdown:

total   user   nice   system   irq    softirq   guest
66.90   7.20   0.00   12.94    0.35   3.39      43.02
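
The figures in this breakdown can be cross-checked with a few lines of C. This is only a sketch for verifying the arithmetic; the struct and function names are made up for illustration and do not come from any KVM or Xen tool.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the CPU utilization breakdown, in percent of host CPU. */
struct cpu_breakdown {
    double user, nice, system, irq, softirq, guest;
};

/* Total busy time is the sum of all columns. */
static double busy_total(const struct cpu_breakdown *b)
{
    return b->user + b->nice + b->system + b->irq + b->softirq + b->guest;
}

/* All non-guest busy time, expressed as a fraction of guest time. */
static double overhead_ratio(const struct cpu_breakdown *b)
{
    assert(b->guest > 0.0);
    return (busy_total(b) - b->guest) / b->guest;
}

/* Extra CPU used by configuration A relative to configuration B. */
static double extra_cpu(double cpu_a, double cpu_b)
{
    assert(cpu_b > 0.0);
    return cpu_a / cpu_b - 1.0;
}

/* Print the derived figures for the KVM row quoted in the report. */
void print_report(void)
{
    const struct cpu_breakdown kvm = { 7.20, 0.00, 12.94, 0.35, 3.39, 43.02 };

    printf("total busy:          %.2f%%\n", busy_total(&kvm));
    printf("non-guest vs guest:  %.1f%%\n", 100.0 * overhead_ratio(&kvm));
    printf("KVM vs Xen extra:    %.1f%%\n", 100.0 * extra_cpu(66.93, 52.85));
}
```

With the numbers from the report, the columns sum to 66.90, non-guest time comes to about 55.5% of guest time, and KVM uses about 26.6% more CPU than Xen.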

Comparing guest time to all other busy time, that's a 23.88/43.02 = 55%
overhead for virtualization.  I certainly don't expect it to be 0, but
55% seems a bit high.  So, what's the reason for this overhead?  At the
bottom is oprofile output of top functions for KVM.  Some observations:

1) I'm seeing about 2.3% in scheduler functions [that I recognize].
Does that seem a bit excessive?


Yes, it is.  If there is a lot of I/O, this might be due to the thread 
pool used for I/O.



2) cpu_physical_memory_rw due to not using preadv/pwritev?


I think both virtio-net and virtio-blk use memcpy().


3) vmx_[save|load]_host_state: I take it this is from guest switches?


These are called when you context-switch from a guest, and, much more 
frequently, when you enter qemu.



We have 180,000 context switches a second.  Is this more than expected?



Way more.  Across 16 logical cpus, this is more than 10,000 cs/sec/cpu.


I wonder if schedstats can show why we context switch (need to let
someone else run, yielded, waiting on io, etc).



Yes, there is a scheduler tracer, though I have no idea how to operate it.

Do you have kvm_stat logs?

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Event channels in KVM?

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

Kapadia, Vivek wrote:
I came across this thread looking for an efficient event channel 
mechanism between two guests (running on different cpu cores).


While I can use the available emulated IO mechanism (guest1->host kernel 
driver->Qemu1->Qemu2) in conjunction with the interrupt mechanism 
(Qemu2->host kernel driver->guest2) in KVM, this involves several 
context switches. Xen handles notifications in the hypervisor via a 
hypercall and hence is likely more efficient.
  


They almost certainly aren't more efficient.

An event channel notification involves a hypercall to the hypervisor.  
When using VT, the performance difference between a vmcall exit vs. a 
pio exit is quite small (especially compared to the overhead of the 
exit).  We're talking on the order of nanoseconds compared to 
microseconds.


What makes KVM particularly different from Xen is that in KVM, the PIO 
operation results in a direct transition to QEMU.  In Xen, typically 
event channel notifications result in a bit being set in a bitmap 
which then results in an interrupt injection depending on the next 
opportunity the hypervisor has to schedule/run the receiving domain.  
This is not deterministic and can potentially be a very long period of 
time.


Event channels are inherently asynchronous whereas PIO notifications 
in KVM are synchronous.  Since the scheduler isn't involved and 
control never leaves the CPU, the KVM PIO notifications are actually 
extremely efficient.  IMHO, it's one of KVM's best design features.




If you make the pio operation wake up another guest, then the operation 
becomes asynchronous.  There's really no fundamental difference between 
Xen and kvm here, and both will require the same number of context 
switches (one) to transfer control.


Handling a pio that is completely internal to the guest is different 
(Xen has to schedule dom0 or the stub domain), but that's not related to 
interguest communications.
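
To make the asynchronous case concrete, here is a toy model of a pending-bitmap notification in C11. It is an illustration only, not code from Xen or KVM; struct evtchn, evtchn_send() and evtchn_collect() are hypothetical names. The sender sets a bit and learns whether a wakeup is needed; the receiver picks up all pending bits whenever it next gets to run, which is where the scheduling delay comes from.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_PORTS 64   /* hypothetical number of event ports */

struct evtchn {
    _Atomic uint64_t pending;   /* one bit per port */
};

/*
 * Sender side: mark a port pending.  Returns true if the bit was newly
 * set, meaning the receiver has to be kicked (woken up or interrupted).
 * A false return means a notification was already outstanding.
 */
static bool evtchn_send(struct evtchn *ch, unsigned port)
{
    assert(port < NR_PORTS);
    uint64_t bit = UINT64_C(1) << port;
    uint64_t old = atomic_fetch_or(&ch->pending, bit);
    return (old & bit) == 0;
}

/*
 * Receiver side: called whenever the receiver is next scheduled.
 * Atomically takes and clears every pending bit.
 */
static uint64_t evtchn_collect(struct evtchn *ch)
{
    return atomic_exchange(&ch->pending, 0);
}
```

Either way, delivering the notification to another guest still needs one context switch to the receiver; the bitmap only decides whether a kick is required.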




--
error compiling committee.c: too many arguments to function



Re: kvm-kmod.git via http

2009-04-30 Thread Avi Kivity

Bernhard Kohl wrote:

I'm trying to clone this new repository using the http protocol because I'm
behind a proxy. I get the following error. For kvm.git and qemu-kvm.git this
works well.

$ git clone http://www.kernel.org/pub/scm/virt/kvm/kvm-kmod.git
Initialized empty Git repository in /home/bernd/src/kvm-kmod/.git/
fatal: http://www.kernel.org/pub/scm/virt/kvm/kvm-kmod.git/info/refs not found:
did you run git update-server-info on the server?
$ 


Thank


Should work now.


--
error compiling committee.c: too many arguments to function



Re: [PATCH 03/04] qemu-kvm: Remove the dependency for phys_ram_base for ipf.c

2009-04-30 Thread Jes Sorensen

Zhang, Xiantao wrote:

Jes Sorensen wrote:

The main difference is that my patch cleans up the interfaces and
calls to the various functions, and removes a bunch of global
variables as well. 


I still can't see the difference with the patch in Avi's tree except nvram 
stuff.  And I believe the global variable you mentioned should be only used for 
nvram. So I propose an incremental patch for that. :)


Hi Xiantao,

I cannot see your patch in Avi's tree, would you mind sending me the
latest version by email, so I can look into this?

Thanks,
Jes


Re: [PATCH 03/04] qemu-kvm: Remove the dependency for phys_ram_base for ipf.c

2009-04-30 Thread Avi Kivity

Jes Sorensen wrote:

Zhang, Xiantao wrote:

Jes Sorensen wrote:

The main difference is that my patch cleans up the interfaces and
calls to the various functions, and removes a bunch of global
variables as well. 


I still can't see the difference with the patch in Avi's tree except 
nvram stuff.  And I believe the global variable you mentioned should 
be only used for nvram. So I propose an incremental patch for that. :)


Hi Xiantao,

I cannot see your patch in Avi's tree, would you mind sending me the
latest version by email, so I can look into this?



I pushed my queue into a branch (named 'queue').  Will merge once I 
resolve the regressions here.


--
error compiling committee.c: too many arguments to function



Re: [PATCH 12/21] Remove odd hack in vga.c

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

I looked closely at the vga code in kvm-userspace a while ago and merged
every fix I could understand into upstream QEMU.  This particular change makes
no sense to me.  I could not figure out from revision history what it actually
fixed.  I'm fairly certain it's not useful today.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 hw/vga.c |   27 ---
 1 files changed, 4 insertions(+), 23 deletions(-)

diff --git a/hw/vga.c b/hw/vga.c
index d96f1be..385184a 100644
--- a/hw/vga.c
+++ b/hw/vga.c
@@ -2227,33 +2227,14 @@ typedef struct PCIVGAState {
 VGAState vga_state;
 } PCIVGAState;
 
-static int s1, s2;
-
-static void mark_dirty(target_phys_addr_t start, target_phys_addr_t len)
-{
-target_phys_addr_t end = start + len;
-
-while (start < end) {
-cpu_physical_memory_set_dirty(cpu_get_physical_page_desc(start));
-start += TARGET_PAGE_SIZE;
-}
-}
-
 void vga_dirty_log_start(VGAState *s)
 {
 if (kvm_enabled() && s->map_addr)
-if (!s1) {
-kvm_log_start(s->map_addr, s->map_end - s->map_addr);
-mark_dirty(s->map_addr, s->map_end - s->map_addr);
-s1 = 1;
-}
+kvm_log_start(s->map_addr, s->map_end - s->map_addr);
+
 if (kvm_enabled() && s->lfb_vram_mapped) {
-if (!s2) {
-kvm_log_start(isa_mem_base + 0xa0000, 0x8000);
-kvm_log_start(isa_mem_base + 0xa8000, 0x8000);
-mark_dirty(isa_mem_base + 0xa0000, 0x10000);
-}
-s2 = 1;
+kvm_log_start(isa_mem_base + 0xa0000, 0x8000);
+kvm_log_start(isa_mem_base + 0xa8000, 0x8000);
 }
 }
 
  


This makes live migration and vga dirty tracking work together.  
Unfortunately since the last merge with qemu it's broken.


We have a shared resource, the log_dirty flag of memory slots.  We can't 
call log_start() and log_stop() from different users and expect things 
to work.


One cleaner way to fix this is to add a parameter containing the mask 
which will be used by the client to access the qemu bytemap.  
log_start() can OR this parameter with its own copy, and log_stop() can 
AND NOT the same thing.  When the local copy is nonzero, the slot dirty 
log is enabled.
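
A minimal sketch of that mask scheme in C, under stated assumptions: struct mem_slot, the DIRTY_CLIENT_* bits and slot_set_logging() are hypothetical stand-ins, not the existing qemu or kvm interfaces. Each client passes its own mask; the slot's dirty log stays on for as long as any client bit remains set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical client bits, one per user of the dirty bytemap. */
#define DIRTY_CLIENT_VGA        0x01u
#define DIRTY_CLIENT_MIGRATION  0x08u

struct mem_slot {
    uint8_t log_clients;   /* OR of the masks of all active clients */
    bool    logging;       /* stands in for the slot's log_dirty flag */
};

/* Placeholder for asking the kernel to enable or disable slot logging. */
static void slot_set_logging(struct mem_slot *slot, bool on)
{
    slot->logging = on;
}

/* OR the client's mask into the local copy; enable on the first user. */
static void slot_log_start(struct mem_slot *slot, uint8_t client_mask)
{
    assert(client_mask != 0);
    bool was_idle = (slot->log_clients == 0);

    slot->log_clients |= client_mask;
    if (was_idle)
        slot_set_logging(slot, true);
}

/* AND NOT the client's mask; disable once no user is left. */
static void slot_log_stop(struct mem_slot *slot, uint8_t client_mask)
{
    slot->log_clients &= (uint8_t)~client_mask;
    if (slot->log_clients == 0)
        slot_set_logging(slot, false);
}
```

With this shape, vga stopping its logging while a migration is in flight leaves the slot's log enabled, and the reverse holds as well.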


--
error compiling committee.c: too many arguments to function



Re: [PATCH 13/21] Remove virtio-console PIF change

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

If this change should happen, it should happen in upstream QEMU.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 hw/virtio-console.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/virtio-console.c b/hw/virtio-console.c
index 89e8be0..b263281 100644
--- a/hw/virtio-console.c
+++ b/hw/virtio-console.c
@@ -132,7 +132,7 @@ void *virtio_console_init(PCIBus *bus, CharDriverState *chr)
  PCI_DEVICE_ID_VIRTIO_CONSOLE,
  PCI_VENDOR_ID_REDHAT_QUMRANET,
  VIRTIO_ID_CONSOLE,
- PCI_CLASS_OTHERS, 0x00,
+ PCI_CLASS_DISPLAY_OTHER, 0x00,
  0, sizeof(VirtIOConsole));
 if (s == NULL)
 return NULL;
  


Since virtio-console is not enabled by default, it isn't needed, so I'll 
apply this.


But if it were needed, there's no reason to introduce regressions into 
qemu-kvm.git.


--
error compiling committee.c: too many arguments to function



Re: [PATCH 14/21] Remove -cpu-vendor-string

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

This isn't in upstream QEMU and is of little utility to KVM.  It's unlikely
to appear in upstream QEMU either.
  


Since we allow overriding cpuid flags, why not the vendor string?  It's 
necessary for cpu passthrough.


--
error compiling committee.c: too many arguments to function



Re: [PATCH 18/21] Remove host_alarm_timer hacks.

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 vl.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index 3b0e3dc..848a8f8 100644
--- a/vl.c
+++ b/vl.c
@@ -1367,8 +1367,7 @@ static void host_alarm_handler(int host_signum)
 last_clock = ti;
 }
 #endif
-if (1 ||
-alarm_has_dynticks(alarm_timer) ||
+if (alarm_has_dynticks(alarm_timer) ||
 (!use_icount &&
 qemu_timer_expired(active_timers[QEMU_TIMER_VIRTUAL],
qemu_get_clock(vm_clock))) ||
  


This was added to fix a problem.  Have you tested it?

--
error compiling committee.c: too many arguments to function



Re: [PATCH 0/21] Remove merge artifacts from qemu-kvm

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

Now that we've got qemu-kvm, it's pretty easy to look at what's different
between upstream QEMU and qemu-kvm.  


This was actually easy in kvm-userspace.git: git diff origin/master 
origin/qemu-svn/trunk.



Unfortunately, there's still a lot of
gunk that seems to keep surviving merges.

This series removes all of the gunk I could find.  I also culled out a number
of fixes that should be in upstream QEMU.  I'll take care of getting those
committed.
  


Applied all except patches for which I had objections (noted in separate 
replies); thanks.


--
error compiling committee.c: too many arguments to function



Re: [PATCH 17/21] Remove #define __user in usb-linux.c

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

This has been consistently nacked in upstream QEMU.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 usb-linux.c |4 
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/usb-linux.c b/usb-linux.c
index 26643bd..70d7a1c 100644
--- a/usb-linux.c
+++ b/usb-linux.c
@@ -34,10 +34,6 @@
 #include "qemu-timer.h"
 #include "monitor.h"
 
-#if defined(__linux__)
-#define __user
-#endif
-
  


This will introduce a regression into qemu-kvm.git.

--
error compiling committee.c: too many arguments to function



Re: qemu-kvm.git now live

2009-04-30 Thread Avi Kivity

Jan Kiszka wrote:

That's sort of what's implemented in qemu-kvm.git.  In qemu.git vga
logging does not get disabled, which is really broken.  It prevents
optimizations like disabling logging when the screen is not displayed to
a human.



Is there a channel that tells vga nothing will be displayed? I may
have missed it while removing all those disable-logging-as-it-may-
confuse-slot-management hooks.
  


I think currently qemu simply stops calling vga_draw_graphic().  This 
makes sense for tcg since it needs to track dirty memory regardless (so 
it can invalidate TBs).  But for kvm we'll want to add an explicit channel.


Note that it isn't likely to make a huge difference: if you don't 
actively read-and-reset the dirty bitmap, kvm will keep the shadow ptes 
with write permission and you won't see any performance hit.  The only 
difference is whether large pages can be used or not.



Where/how does the
migration code disable dirty logging?
  
  

Should be phase 3 of ram_save_live().



But only in qemu-kvm. What is the plan about pushing it upstream? Then
we could discuss how to extend the existing support best.
  


Pushing things upstream is quite difficult because of the very different 
infrastructure.  It's unfortunate that upstream rewrote everything 
instead of changing things incrementally.  Rewrites are almost always a 
mistake since they throw away accumulated knowledge.


--
error compiling committee.c: too many arguments to function



Re: [PATCH] qemu-kvm: Remove duplicate set_link monitor command

2009-04-30 Thread Avi Kivity

Jan Kiszka wrote:

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---

 monitor.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/monitor.c b/monitor.c
index 11e48c7..674630b 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1792,7 +1792,6 @@ static const mon_cmd_t mon_cmds[] = {
acl allow vnc.username fred\n
acl deny vnc.username bob\n
acl reset vnc.username\n },
-{ set_link, ss, do_set_link, name [up|down] },
 { cpu_set, is, do_cpu_set_nr, cpu [online|offline], change cpu 
state },
 { NULL, NULL, },
 };

  


Applied, thanks.

--
error compiling committee.c: too many arguments to function



Re: [PATCH] kvm: external module: fix request_irq for 2.6.19

2009-04-30 Thread Avi Kivity

Chris Wright wrote:

The irq handler changes (introduced in 2.6.19, not 2.6.20) dropped
struct pt_regs from the handler prototype, they are found globally now.
This introduces the back compat for older kernels.  The handler is just
a thin layer which calls the real registered handler (all this to work
around a minor little compiler warning ;-)  Needed for device assignment
on older kernels.
  


Applied, thanks.
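
For readers who want to see the thunk idea in isolation, here is a user-space sketch in C. It only models the pattern described above: a table indexed by IRQ number holds the new-style handler, and an old-style entry point drops the extra register argument before forwarding. All names (struct regs_demo, old_request_irq(), compat_request_irq(), deliver_irq(), demo_handler()) are hypothetical stand-ins and not kernel APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_IRQS_DEMO 16          /* hypothetical table size */

struct regs_demo;                /* stands in for struct pt_regs */

typedef int (*new_handler_t)(int irq, void *dev_id);
typedef int (*old_handler_t)(int irq, void *dev_id, struct regs_demo *regs);

static new_handler_t handlers[NR_IRQS_DEMO];    /* real handlers, by irq */
static old_handler_t registered[NR_IRQS_DEMO];  /* models the old kernel */

/* Old-style entry point: drop regs and forward to the real handler. */
static int irq_thunk(int irq, void *dev_id, struct regs_demo *regs)
{
    (void)regs;
    assert(irq >= 0 && irq < NR_IRQS_DEMO && handlers[irq] != NULL);
    return handlers[irq](irq, dev_id);
}

/* Stand-in for the old kernel's registration call; always succeeds. */
static int old_request_irq(unsigned int irq, old_handler_t h)
{
    registered[irq] = h;
    return 0;
}

/* Register a new-style handler behind the thunk. */
static int compat_request_irq(unsigned int irq, new_handler_t handler)
{
    int rc;

    if (irq >= NR_IRQS_DEMO)
        return -EINVAL;
    if (handlers[irq])
        return -EBUSY;           /* one handler per irq in this scheme */
    handlers[irq] = handler;
    rc = old_request_irq(irq, irq_thunk);
    if (rc)
        handlers[irq] = NULL;    /* roll back on failure */
    return rc;
}

static void compat_free_irq(unsigned int irq)
{
    if (irq >= NR_IRQS_DEMO)
        return;
    registered[irq] = NULL;
    handlers[irq] = NULL;
}

/* Simulate the old kernel delivering an interrupt. */
static int deliver_irq(unsigned int irq, void *dev_id)
{
    assert(irq < NR_IRQS_DEMO && registered[irq] != NULL);
    return registered[irq]((int)irq, dev_id, NULL);
}

/* Example new-style handler: counts invocations through dev_id. */
static int demo_handler(int irq, void *dev_id)
{
    ++*(int *)dev_id;
    return irq;
}
```

A side effect of keying the table by IRQ number is that a second registration on the same line is refused with -EBUSY, so shared interrupt lines are not covered by this sketch.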

--
error compiling committee.c: too many arguments to function



Re: [PATCH] kvm-kmod: fix build on kernels with kvm trace set

2009-04-30 Thread Avi Kivity

Michael S. Tsirkin wrote:

CONFIG_KVM_TRACE in kernel conflicts with the definition
in external module. external-module-compat-comm.h tried
to work around this, but this didn't work as some
code still does #include <linux/autoconf.h>
directly.

Solve this differently by s/CONFIG_KVM_TRACE/CONFIG_KMOD_KVM_TRACE/
in awk. Had to tighten regular expressions in hack-module.awk
so that they don't trigger on kvm_host.h .

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 Makefile  |5 +++--
 configure |2 +-
 external-module-compat-comm.h |7 ---
 x86/Kbuild|2 +-
 x86/hack-module.awk   |8 +---
 5 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/Makefile b/Makefile
index f2ef811..9cdc0af 100644
--- a/Makefile
+++ b/Makefile
@@ -34,8 +34,8 @@ hack-files-ia64 = kvm_main.c kvm_fw.c kvm_lib.c kvm-ia64.c
 
 hack-files = $(hack-files-$(ARCH_DIR))
 
-ifeq ($(EXT_CONFIG_KVM_TRACE),y)
-module_defines += -DEXT_CONFIG_KVM_TRACE=y
+ifeq ($(CONFIG_KMOD_KVM_TRACE),y)
+module_defines += -DCONFIG_KMOD_KVM_TRACE=1
 endif
 
 all:: prerequisite

@@ -72,6 +72,7 @@ header-sync:
for i in $$(find $T -name '*.h'); do \
$(call unifdef,$$i); done
$(call hack, include/linux/kvm.h)
+   $(call hack, include/linux/kvm_host.h)
$(call hack, include/asm-$(ARCH_DIR)/kvm.h)
set -e && for i in $$(find $T -type f -printf '%P '); \
do mkdir -p $$(dirname $$i); cmp -s $$i $T/$$i || cp $T/$$i $$i; done
diff --git a/configure b/configure
index 30af6e7..6e12bb1 100755
--- a/configure
+++ b/configure
@@ -122,5 +122,5 @@ DEPMOD_VERSION=$depmod_version
 EOF
 
 cat <<EOF > config.kbuild
-EXT_CONFIG_KVM_TRACE=$kvm_trace
+CONFIG_KMOD_KVM_TRACE=$kvm_trace
 EOF
diff --git a/external-module-compat-comm.h b/external-module-compat-comm.h
index c955927..e561448 100644
--- a/external-module-compat-comm.h
+++ b/external-module-compat-comm.h
@@ -18,13 +18,6 @@
 #include <linux/hrtimer.h>
 #include <asm/bitops.h>
 
-/* Override CONFIG_KVM_TRACE */
-#ifdef EXT_CONFIG_KVM_TRACE
-#  define CONFIG_KVM_TRACE 1
-#else
-#  undef CONFIG_KVM_TRACE
-#endif
-
 /*
  * 2.6.16 does not have GFP_NOWAIT
  */
diff --git a/x86/Kbuild b/x86/Kbuild
index d3aca00..fbdb28b 100644
--- a/x86/Kbuild
+++ b/x86/Kbuild
@@ -7,7 +7,7 @@ kvm-objs := kvm_main.o x86.o mmu.o x86_emulate.o 
../anon_inodes.o irq.o i8259.o
 lapic.o ioapic.o preempt.o i8254.o coalesced_mmio.o irq_comm.o \
 timer.o \
 ../external-module-compat.o
-ifeq ($(EXT_CONFIG_KVM_TRACE),y)
+ifeq ($(CONFIG_KMOD_KVM_TRACE),y)
 kvm-objs += kvm_trace.o
 endif
 ifeq ($(CONFIG_IOMMU_API),y)
diff --git a/x86/hack-module.awk b/x86/hack-module.awk
index 260eeef..f3d95be 100644
--- a/x86/hack-module.awk
+++ b/x86/hack-module.awk
@@ -4,7 +4,7 @@ BEGIN { split(INIT_WORK desc_struct ldttss_desc64 desc_ptr  \
  hrtimer_expires_remaining  \
  on_each_cpu relay_open request_irq , compat_apis); }
 
-/^int kvm_init\(/ { anon_inodes = 1 }

+/^int kvm_init\([^)]*\)$/ { anon_inodes = 1 }
 
 /return 0;/  anon_inodes {

 print \tr = kvm_init_anon_inodes();;
@@ -17,7 +17,7 @@ BEGIN { split(INIT_WORK desc_struct ldttss_desc64 desc_ptr  
\
 anon_inodes = 0
 }
 
-/^void kvm_exit/ { anon_inodes_exit = 1 }

+/^void kvm_exit[^)]*\)$/ { anon_inodes_exit = 1 }
 
 /\}/  anon_inodes_exit {

 print \tkvm_exit_anon_inodes();;
@@ -25,7 +25,7 @@ BEGIN { split(INIT_WORK desc_struct ldttss_desc64 desc_ptr  
\
 anon_inodes_exit = 0
 }
 
-/^int kvm_arch_init/ { kvm_arch_init = 1 }

+/^int kvm_arch_init[^)])$/ { kvm_arch_init = 1 }
 /\tsc_khz\/  kvm_arch_init { sub(\\tsc_khz\\, kvm_tsc_khz) }
 /^}/ { kvm_arch_init = 0 }
 
@@ -85,6 +85,8 @@ BEGIN { split(INIT_WORK desc_struct ldttss_desc64 desc_ptr  \
 
 /\kvm_.*_fops\.owner = module;/ { $0 = IF_ANON_INODES_DOES_REFCOUNTS( $0 ) }
 
+{ sub(/\CONFIG_KVM_TRACE\/, CONFIG_KMOD_KVM_TRACE) }

+
 { print }
 
 /unsigned long flags;/   vmx_load_host_state {
  


Xiantao, do we need to change this for ia64?

--
error compiling committee.c: too many arguments to function



Re: qemu-kvm.git now live

2009-04-30 Thread Avi Kivity

Jan Kiszka wrote:

Avi Kivity wrote:
  

Where/how does the
migration code disable dirty logging?

  

Should be phase 3 of ram_save_live().



But only in qemu-kvm. What is the plan about pushing it upstream? Then
we could discuss how to extend the existing support best.
  
  

Pushing things upstream is quite difficult because of the very different
infrastructure.



Isn't the midterm goal to get rid of most of these differences (namely
libkvm)?
  


Yes, but not by removing existing functionality.

  

It's unfortunate that upstream rewrote everything
instead of changing things incrementally.  Rewrites are almost always a
mistake since they throw away accumulated knowledge.



I disagree, at least in this particular case. Upstream already diverged
from qemu-kvm, and the latter provided no comparable alternative for
slot management and dirty logging. And I still don't see that we lost
anything that could not easily be re-integrated into upstream (ie.
global dirty logging), finally leading to a cleaner and more complete
result.
  


It could have been done differently, by morphing the existing support 
into something mergable, and merging that.  In this way, we'd ensure no 
needed functionality is lost.


As is, we're adding something simple, then discovering it's 
insufficient.  We're throwing away information, that's not a good way to 
make progress.



So, what bits are missing to make KVM migration work in upstream?
  


I don't know of anything beyond dirty logging.


--
error compiling committee.c: too many arguments to function



Re: [PATCH 16/21] Remove clean rule change

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

It's not in upstream QEMU so apparently it's not useful.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 pc-bios/Makefile |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/pc-bios/Makefile b/pc-bios/Makefile
index dabeb4c..315288d 100644
--- a/pc-bios/Makefile
+++ b/pc-bios/Makefile
@@ -16,4 +16,4 @@ all: $(TARGETS)
dtc -I dts -O dtb -o $@ $<
 
 clean:
-   rm -f $(TARGETS) *.o *~ *.dtb
+   rm -f $(TARGETS) *.o *~
  


Hollis?

--
error compiling committee.c: too many arguments to function



Re: [PATCH 11/21] Remove unused variables in vga.c

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 hw/vga.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/vga.c b/hw/vga.c
index 4931b69..d96f1be 100644
--- a/hw/vga.c
+++ b/hw/vga.c
@@ -1585,12 +1585,11 @@ static void vga_sync_dirty_bitmap(VGAState *s)
  */
 static void vga_draw_graphic(VGAState *s, int full_update)
 {
-int y1, y, update, linesize, y_start, double_scan, mask, depth;
-int width, height, shift_control, line_offset, bwidth, bits;
+int y1, y, update, page_min, page_max, linesize, y_start, double_scan, 
mask, depth;
+int width, height, shift_control, line_offset, page0, page1, bwidth, bits;
 int disp_width, multi_scan, multi_run;
 uint8_t *d;
 uint32_t v, addr1, addr;
-long page0, page1, page_min, page_max;
 vga_draw_line_func *vga_draw_line;
 


This introduces a regression with 4GB guests.  I resolved this by 
posting a patch to qemu; see 12c7e75a7c.  Are you using an outdated 
checkout?


--
error compiling committee.c: too many arguments to function



Re: qemu-kvm.git now live

2009-04-30 Thread Jan Kiszka
Avi Kivity wrote:
 Where/how does the
 migration code disable dirty logging?
 
 Should be phase 3 of ram_save_live().
 

 But only in qemu-kvm. What is the plan about pushing it upstream? Then
 we could discuss how to extend the existing support best.
   
 
 Pushing things upstream is quite difficult because of the very different
 infrastructure.

Isn't the midterm goal to get rid of most of these differences (namely
libkvm)?

 It's unfortunate that upstream rewrote everything
 instead of changing things incrementally.  Rewrites are almost always a
 mistake since they throw away accumulated knowledge.

I disagree, at least in this particular case. Upstream already diverged
from qemu-kvm, and the latter provided no comparable alternative for
slot management and dirty logging. And I still don't see that we lost
anything that could not easily be re-integrated into upstream (ie.
global dirty logging), finally leading to a cleaner and more complete
result.

So, what bits are missing to make KVM migration work in upstream?

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Re: [PATCH 1/2] KVM: Replace get_mt_mask_shift with get_mt_mask

2009-04-30 Thread Avi Kivity

Sheng Yang wrote:

Shadow_mt_mask is out of date; it is now only used as a flag to indicate
whether TDP is enabled. Get rid of it and use tdp_enabled instead.

Also put the memory type logic in kvm_x86_ops->get_mt_mask().
  


Applied both, thanks.

--
error compiling committee.c: too many arguments to function



[KVM-AUTOTEST] [PATCH] support for remote migration

2009-04-30 Thread yogi
Hello everyone,

I would like to submit a patch that adds support for remote migration to
kvm-autotest.

To use this patch, the following five parameters should be added to the
existing migration test:

remote = dst
hostip = localhost ip or name
remoteip = remote host ip or name
remuser = root
rempassword = password

The field remote=dst indicates that the VM dst should be created on the
remote machine.
For example:
- migrate:  install setup
type = migration
vms +=  dst
migration_test_command = help
kill_vm_on_error = yes
remote = dst
hostip = 192.168.1.2
remoteip = 192.168.1.3
remuser = root
rempassword = 123456
variants:

Three files are modified in this patch: kvm_utils.py, kvm_tests.py
and kvm_vm.py.

kvm_utils.py - if the ssh keys have been exchanged between the test
machines, remote login fails with the message "Got unexpected login
prompt". To prevent this, it now returns the session rather than None.

kvm_tests.py - the host address used in migration is made dynamic.

kvm_vm.py - replaced unix sockets with tcp sockets for the monitor,
for both remote and local VMs. Added two new variables (remote, ssh_port)
to class VM: remote is set to True if the VM is on a remote machine, and
ssh_port contains the redirection port. The function get_address() returns
the ip of the host where the VM is (local or remote).

Thx
Yogi

 kvm_tests.py |2 -
 kvm_utils.py |3 --
 kvm_vm.py|   61 ---
 3 files changed, 48 insertions(+), 18 deletions(-)

Signed-off-by: Yogananth Subramanian anant...@in.ibm.com
---
diff -aurp kvm-autotest.orgi//client/tests/kvm_runtest_2/kvm_tests.py kvm-autotest/client/tests/kvm_runtest_2/kvm_tests.py
--- kvm-autotest.orgi//client/tests/kvm_runtest_2/kvm_tests.py	2009-04-29 18:33:10.0 +
+++ kvm-autotest/client/tests/kvm_runtest_2/kvm_tests.py	2009-04-30 05:59:24.0 +
@@ -81,7 +81,7 @@ def run_migration(test, params, env):
 session.close()
 
 # Define the migration command
-cmd = "migrate -d tcp:localhost:%d" % dest_vm.migration_port
+cmd = "migrate -d tcp:%s:%d" % (dest_vm.hostip,dest_vm.migration_port)
 kvm_log.debug("Migration command: %s" % cmd)
 
 # Migrate
diff -aurp kvm-autotest.orgi//client/tests/kvm_runtest_2/kvm_utils.py kvm-autotest/client/tests/kvm_runtest_2/kvm_utils.py
--- kvm-autotest.orgi//client/tests/kvm_runtest_2/kvm_utils.py	2009-04-29 18:33:10.0 +
+++ kvm-autotest/client/tests/kvm_runtest_2/kvm_utils.py	2009-04-30 06:13:47.0 +
@@ -431,8 +431,7 @@ def remote_login(command, password, prom
 return None
 elif match == 2:  # login:
 kvm_log.debug("Got unexpected login prompt")
-sub.close()
-return None
+return sub
 elif match == 3:  # Connection closed
 kvm_log.debug("Got 'Connection closed'")
 sub.close()
diff -aurp kvm-autotest.orgi//client/tests/kvm_runtest_2/kvm_vm.py kvm-autotest/client/tests/kvm_runtest_2/kvm_vm.py
--- kvm-autotest.orgi//client/tests/kvm_runtest_2/kvm_vm.py	2009-04-29 18:33:10.0 +
+++ kvm-autotest/client/tests/kvm_runtest_2/kvm_vm.py	2009-04-30 06:31:34.0 +
@@ -3,6 +3,7 @@
 import time
 import socket
 import os
+import re
 
 import kvm_utils
 import kvm_log
@@ -105,6 +106,7 @@ class VM:
 self.qemu_path = qemu_path
 self.image_dir = image_dir
 self.iso_dir = iso_dir
+self.remote = False
 
 def verify_process_identity(self):
 Make sure .pid really points to the original qemu process.
@@ -124,8 +126,6 @@ class VM:
 file.close()
 if not self.qemu_path in cmdline:
 return False
-if not self.monitor_file_name in cmdline:
-return False
 return True
 
 def make_qemu_command(self, name=None, params=None, qemu_path=None, image_dir=None, iso_dir=None):
@@ -173,7 +173,6 @@ class VM:
 
 qemu_cmd = qemu_path
 qemu_cmd += " -name '%s'" % name
-qemu_cmd += " -monitor unix:%s,server,nowait" % self.monitor_file_name
 
 for image_name in kvm_utils.get_sub_dict_names(params, "images"):
 image_params = kvm_utils.get_sub_dict(params, image_name)
@@ -211,6 +210,7 @@ class VM:
 redir_params = kvm_utils.get_sub_dict(params, redir_name)
 guest_port = int(redir_params.get("guest_port"))
 host_port = self.get_port(guest_port)
+self.ssh_port = host_port
 qemu_cmd += " -redir tcp:%s::%s" % (host_port, guest_port)
 
 if params.get("display") == "vnc":
@@ -254,6 +254,17 @@ class VM:
 image_dir = self.image_dir
 iso_dir = self.iso_dir
 
+# If VM is remote, set hostip to ip of the remote machine
+# If VM is local set hostip to localhost or hostip param
+if params.get("remote") == self.name:
+

Re: [PATCH] KVM: VMX: Disable VMX when system shutdown

2009-04-30 Thread Avi Kivity

Sheng Yang wrote:

Intel TXT (Trusted Execution Technology) requires VMX to be off on all CPUs
in order to work at system shutdown.
  


Applied, thanks.

Is this needed for 2.6.30 and -stable?  That is, is the code that 
enables TXT in 2.6.30 and below or in the BIOS?  Or is it new code not 
yet merged?




--
error compiling committee.c: too many arguments to function



Re: KVM performance vs. Xen

2009-04-30 Thread Andrew Theurer

Avi Kivity wrote:

Andrew Theurer wrote:

I wanted to share some performance data for KVM and Xen.  I thought it
would be interesting to share some performance results especially
compared to Xen, using a more complex situation like heterogeneous
server consolidation.

The Workload:
The workload is one that simulates a consolidation of servers on to a
single host.  There are 3 server types: web, imap, and app (j2ee).  In
addition, there are other helper servers which are also consolidated:
a db server, which helps out with the app server, and an nfs server,
which helps out with the web server (a portion of the docroot is nfs
mounted).  There is also one other server that is simply idle.  All 6
servers make up one set.  The first 3 server types are sent requests,
which in turn may send requests to the db and nfs helper servers.  The
request rate is throttled to produce a fixed amount of work.  In order
to increase utilization on the host, more sets of these servers are
used.  The clients which send requests also have a response time
requirement which is monitored.  The following results have passed the
response time requirements.



What's the typical I/O load (disk and network bandwidth) while the 
tests are running?

This is average throughput:
network: Tx: 79 MB/sec  Rx: 5 MB/sec
disk: read: 17 MB/sec  write: 40 MB/sec



The host hardware:
A 2 socket, 8 core Nehalem with SMT and EPT enabled, lots of disks, 4 x
1 GB Ethernet


CPU time measurements with SMT can vary wildly if the system is not 
fully loaded.  If the scheduler happens to schedule two threads on a 
single core, both of these threads will generate less work compared to 
if they were scheduled on different cores.
Understood.  Even if at low loads, the scheduler does the right thing 
and spreads out to all the cores first, once it goes beyond 50% util, 
the CPU util can climb at a much higher rate (compared to a linear 
increase in work) because it then starts scheduling 2 threads per core, 
and each thread can do less work.  I have always wanted something which 
could more accurately show the utilization of a processor core, but I 
guess we have to use what we have today.  I will run again with SMT off 
to see what we get.




Test Results:
The throughput is equal in these tests, as the clients throttle the work
(this is assuming you don't run out of a resource on the host).  What's
telling is the CPU used to do the same amount of work:

Xen:  52.85%
KVM:  66.93%

So, KVM requires 66.93/52.85 = 26.6% more CPU to do the same amount of
work. Here's the breakdown:

total    user    nice  system     irq  softirq   guest
66.90    7.20    0.00   12.94    0.35     3.39   43.02

Comparing guest time to all other busy time, that's a 23.88/43.02 = 55%
overhead for virtualization.  I certainly don't expect it to be 0, but
55% seems a bit high.  So, what's the reason for this overhead?  At the
bottom is oprofile output of top functions for KVM.  Some observations:

1) I'm seeing about 2.3% in scheduler functions [that I recognize].
Does that seem a bit excessive?


Yes, it is.  If there is a lot of I/O, this might be due to the thread 
pool used for I/O.
I have an older patch which makes a small change to posix_aio_thread.c by 
trying to keep the thread pool size a bit lower than it is today.  I 
will dust that off and see if it helps.



2) cpu_physical_memory_rw due to not using preadv/pwritev?


I think both virtio-net and virtio-blk use memcpy().


3) vmx_[save|load]_host_state: I take it this is from guest switches?


These are called when you context-switch from a guest, and, much more 
frequently, when you enter qemu.



We have 180,000 context switches a second.  Is this more than expected?



Way more.  Across 16 logical cpus, this is about 11,000 cs/sec/cpu.
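
The ratios quoted in this thread can be rechecked with a few lines of Python. All inputs are the figures given in the messages above; only the variable names are introduced here for illustration.

```python
# CPU utilization needed for the same throughput, as reported above.
xen_cpu = 52.85
kvm_cpu = 66.93
extra_cpu = kvm_cpu / xen_cpu - 1.0

# Breakdown of KVM host CPU time (percent), from the table above.
breakdown = {
    "user": 7.20,
    "nice": 0.00,
    "system": 12.94,
    "irq": 0.35,
    "softirq": 3.39,
    "guest": 43.02,
}
total = sum(breakdown.values())
non_guest = total - breakdown["guest"]
overhead = non_guest / breakdown["guest"]

# Context switches per second spread over the 16 logical CPUs.
cs_per_cpu = 180000 / 16

print("extra CPU vs Xen: %.1f%%" % (extra_cpu * 100))
print("busy time outside the guest: %.2f%%" % non_guest)
print("overhead relative to guest time: %.1f%%" % (overhead * 100))
print("context switches per second per CPU: %d" % cs_per_cpu)
```

The per-CPU context switch figure works out to 11,250, so "10,000" in the reply is best read as an order-of-magnitude estimate.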


I wonder if schedstats can show why we context switch (need to let
someone else run, yielded, waiting on io, etc).



Yes, there is a scheduler tracer, though I have no idea how to operate 
it.


Do you have kvm_stat logs?
Sorry, I don't, but I'll run that next time.  BTW, I did not notice a 
batch/log mode the last time I ran kvm_stat.  Or maybe it was not 
obvious to me.  Is there an ideal way to run kvm_stat without 
curses-like output?


-Andrew




Re: [PATCH] deal with interrupt shadow state for emulated instruction

2009-04-30 Thread Avi Kivity

Glauber Costa wrote:

We currently unblock shadow interrupt state when we skip an instruction,
but fail to do so when we actually emulate one. This blocks interrupts
in key instruction blocks, in particular sti; hlt; sequences.

If the instruction emulated is an sti, we have to block shadow interrupts.
The same goes for mov ss. pop ss also needs it, but we don't currently
emulate it.

Without this patch, I cannot boot gpxe option roms on vmx machines.
This is described at https://bugzilla.redhat.com/show_bug.cgi?id=494469

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index cb306cf..9455a30 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -510,6 +510,8 @@ struct kvm_x86_ops {
void (*run)(struct kvm_vcpu *vcpu, struct kvm_run *run);
int (*handle_exit)(struct kvm_run *run, struct kvm_vcpu *vcpu);
void (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
+   void (*interrupt_shadow_mask)(struct kvm_vcpu *vcpu, int mask);
  


Can you verb this function?  set_interrupt_shadow would make it nicely 
complement get_interrupt_shadow.



+   u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
void (*patch_hypercall)(struct kvm_vcpu *vcpu,
unsigned char *hypercall_addr);
int (*get_irq)(struct kvm_vcpu *vcpu);
 
+static u32 svm_get_interrupt_shadow(struct kvm_vcpu *vcpu)

+{
+   struct vcpu_svm *svm = to_svm(vcpu);
+   u32 ret = 0;
+
+   if (svm->vmcb->control.int_state & SVM_INTERRUPT_SHADOW_MASK)
+   ret |= (X86_SHADOW_INT_STI  X86_SHADOW_INT_MOV_SS);
+   return ret;
+}
  


Hmm, if the guest runs an infinite emulated 'mov ss', it will keep 
toggling the MOV_SS bit, but STI will remain set, so we'll never allow 
an interrupt into the guest kernel.



diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c
index d2664fc..797d41f 100644
--- a/arch/x86/kvm/x86_emulate.c
+++ b/arch/x86/kvm/x86_emulate.c
@@ -1618,6 +1618,16 @@ special_insn:
int err;
 
 		sel = c-src.val;

+   if (c->modrm_reg == VCPU_SREG_SS) {
+       u32 int_shadow =
+           kvm_x86_ops->get_interrupt_shadow(ctxt->vcpu);
+       /* See sti emulation for an explanation of this */
+       if ((int_shadow & X86_SHADOW_INT_MOV_SS))
+           ctxt->interruptibility &= ~X86_SHADOW_INT_MOV_SS;
+       else
+           ctxt->interruptibility |= X86_SHADOW_INT_MOV_SS;
+   }
  


^=


@@ -1846,10 +1856,23 @@ special_insn:
ctxt->eflags &= ~X86_EFLAGS_IF;
c->dst.type = OP_NONE;   /* Disable writeback. */
break;
-   case 0xfb: /* sti */
+   case 0xfb: { /* sti */
+       u32 int_shadow = kvm_x86_ops->get_interrupt_shadow(ctxt->vcpu);
+       /*
+        * an sti; sti; sequence only disable interrupts for the first
+        * instruction. So, if the last instruction, be it emulated or
+        * not, left the system with the INT_STI flag enabled, it
+        * means that the last instruction is an sti. We should not
+        * leave the flag on in this case
+        */
+       if ((int_shadow & X86_SHADOW_INT_STI))
+           ctxt->interruptibility &= ~X86_SHADOW_INT_STI;
+       else
+           ctxt->interruptibility |= X86_SHADOW_INT_STI;
  


^=
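
One reading of the "^=" remarks is that the if/else pair is a bit toggle, which XOR expresses directly. The Python sketch below checks that equivalence. It assumes the value being tested and the value being updated are the same word (in the patch they are int_shadow and ctxt->interruptibility, which are separate), and the bit positions chosen for the two flags are placeholders, not the kernel's definitions.

```python
# Placeholder bit values; the real constants are defined by the patch.
X86_SHADOW_INT_STI = 1 << 0
X86_SHADOW_INT_MOV_SS = 1 << 1


def toggle_if_else(state, mask):
    """Clear the bit if it is set, set it otherwise (the patch's form)."""
    if state & mask:
        return state & ~mask
    return state | mask


def toggle_xor(state, mask):
    """The same toggle written as a single XOR (the suggested form)."""
    return state ^ mask


# Exhaustively compare both forms over every combination of the two flags.
for state in range(4):
    for mask in (X86_SHADOW_INT_STI, X86_SHADOW_INT_MOV_SS):
        assert toggle_if_else(state, mask) == toggle_xor(state, mask)
print("if/else toggle and XOR agree for all flag combinations")
```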

--
error compiling committee.c: too many arguments to function



Re: [PATCH 03/21] Remove use of signalfd in block-raw-posix.c

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

We don't use signalfd in upstream QEMU.  Instead, we always emulate it.
  


With an extra thread - so an extra context switch.


It's not necessarily a bad thing to use signalfd, but this is something that
should be done upstream.  It certainly does qemu-kvm no harm to use the upstream
code.
  


It will introduce a (likely minor, but real) performance regression.

Instead of this, why not apply the reverse patch to qemu.git?


--
error compiling committee.c: too many arguments to function



Re: [PATCH 03/21] Remove use of signalfd in block-raw-posix.c

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:

Anthony Liguori wrote:

We don't use signalfd in upstream QEMU.  Instead, we always emulate it.
  


With an extra thread - so an extra context switch.


We don't use an extra thread.  We just install a signal handler that 
writes to a pipe.  At best, the added overhead is that we get EINTRs 
more often but this is something we already handle.
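
The handler-writes-to-a-pipe approach described here is often called the self-pipe trick. A minimal Python sketch follows, assuming SIGUSR1 as a stand-in for whatever signal reports I/O completion and a Unix host; it illustrates the pattern only and is not the QEMU implementation.

```python
import os
import select
import signal


def self_pipe_demo():
    """Turn a signal into a readable file descriptor event."""
    read_fd, write_fd = os.pipe()
    # Never block inside a signal handler: make the write end non-blocking.
    os.set_blocking(write_fd, False)

    def handler(signum, frame):
        try:
            os.write(write_fd, b"\0")  # one byte is enough to wake the loop
        except BlockingIOError:
            pass  # pipe is full, so a wakeup is already pending

    previous = signal.signal(signal.SIGUSR1, handler)
    try:
        os.kill(os.getpid(), signal.SIGUSR1)  # simulate the completion signal
        # The main loop only needs to watch read_fd with select/poll.
        readable, _, _ = select.select([read_fd], [], [], 1.0)
        return os.read(read_fd, 1) if readable else b""
    finally:
        signal.signal(
            signal.SIGUSR1,
            previous if previous is not None else signal.SIG_DFL,
        )
        os.close(read_fd)
        os.close(write_fd)


print(self_pipe_demo())
```

Because only a single byte is written per signal, partial writes are not a concern in this form; the cost is that the reader learns only that a signal arrived, not the extra details signalfd would supply.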


It's not necessarily a bad thing to use signalfd, but this is something that
should be done upstream.  It certainly does qemu-kvm no harm to use the upstream
code.
  


It will introduce a (likely minor, but real) performance regression.

Instead of this, why not apply the reverse patch to qemu.git?


I'm not sure signalfd really buys us much.  To emulate it requires 
writing a bunch more data to the pipe.  When writing more than 1 byte, 
we have to worry about whether there's a partial write (because the pipe 
buffer is full).  We also have to make sure to read from the fd in 
properly sized chunks.


Regards,

Anthony Liguori



Re: KVM performance vs. Xen

2009-04-30 Thread Avi Kivity

Andrew Theurer wrote:

Avi Kivity wrote:




What's the typical I/O load (disk and network bandwidth) while the 
tests are running?

This is average throughput:
network: Tx: 79 MB/sec  Rx: 5 MB/sec


MB as in Byte or Mb as in bit?


disk: read: 17 MB/sec  write: 40 MB/sec


This could definitely cause the extra load, especially if it's many 
small requests (compared to a few large ones).



The host hardware:
A 2 socket, 8 core Nehalem with SMT and EPT enabled, lots of disks, 4 x
1 GB Ethernet


CPU time measurements with SMT can vary wildly if the system is not 
fully loaded.  If the scheduler happens to schedule two threads on a 
single core, both of these threads will generate less work compared 
to if they were scheduled on different cores.
Understood.  Even if at low loads, the scheduler does the right thing 
and spreads out to all the cores first, once it goes beyond 50% util, 
the CPU util can climb at a much higher rate (compared to a linear 
increase in work) because it then starts scheduling 2 threads per 
core, and each thread can do less work.  I have always wanted 
something which could more accurately show the utilization of a 
processor core, but I guess we have to use what we have today.  I will 
run again with SMT off to see what we get.


On the other hand, without SMT you will get to overcommit much faster, 
so you'll have scheduling artifacts.  Unfortunately there's no good 
answer here (except to improve the SMT scheduler).


Yes, it is.  If there is a lot of I/O, this might be due to the 
thread pool used for I/O.
I have an older patch which makes a small change to posix_aio_thread.c 
by trying to keep the thread pool size a bit lower than it is today.  
I will dust that off and see if it helps.


Really, I think linux-aio support can help here.



Yes, there is a scheduler tracer, though I have no idea how to 
operate it.


Do you have kvm_stat logs?
Sorry, I don't, but I'll run that next time.  BTW, I did not notice a 
batch/log mode the last time I ran kvm_stat.  Or maybe it was not 
obvious to me.  Is there an ideal way to run kvm_stat without 
curses-like output?


You're probably using an ancient version:

$ kvm_stat --help
Usage: kvm_stat [options]

Options:
  -h, --help            show this help message and exit
  -1, --once, --batch   run in batch mode for one second
  -l, --log             run in logging mode (like vmstat)
  -f FIELDS, --fields=FIELDS
                        fields to display (regex)


--
error compiling committee.c: too many arguments to function



Re: [PATCH 03/21] Remove use of signalfd in block-raw-posix.c

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

Avi Kivity wrote:

Anthony Liguori wrote:

We don't use signalfd in upstream QEMU.  Instead, we always emulate it.
  


With an extra thread - so an extra context switch.


We don't use an extra thread.  We just install a signal handler that 
writes to a pipe.  At best, the added overhead is that we get EINTRs 
more often but this is something we already handle.


Oh okay.  But signal delivery is slow; for example the FPU needs to be 
reset.


I'm not sure signalfd really buys us much.  To emulate it requires 
writing a bunch more data to the pipe.  When writing more than 1 byte, 
we have to worry about whether there's a partial write (because the 
pipe buffer is full).  We also have to make sure to read from the fd in 
properly sized chunks.


Then we can use one byte writes (and reads) when signalfd is not 
available.  128 byte pipe read/writes should always be atomic on Linux 
though, likely on other OSes too.
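
The atomicity point can be checked against the platform's PIPE_BUF limit: POSIX guarantees that a pipe write of at most PIPE_BUF bytes is not interleaved or split, and requires PIPE_BUF to be at least 512. The Python sketch below assumes a Unix host and uses a 128-byte record, the size mentioned above (it is also the size of struct signalfd_siginfo on Linux); the record contents are arbitrary.

```python
import os
import select

RECORD_SIZE = 128  # size discussed above


def roundtrip_record():
    """Write one fixed-size record to a pipe and read it back whole."""
    # Writes no larger than PIPE_BUF are atomic on a pipe.
    assert RECORD_SIZE <= select.PIPE_BUF
    read_fd, write_fd = os.pipe()
    try:
        record = bytes(range(RECORD_SIZE))
        written = os.write(write_fd, record)
        data = os.read(read_fd, RECORD_SIZE)
        return written, data == record
    finally:
        os.close(read_fd)
        os.close(write_fd)


print("PIPE_BUF on this host:", select.PIPE_BUF)
print(roundtrip_record())
```

Reading in RECORD_SIZE chunks keeps the reader aligned on record boundaries, which is the "properly sized chunks" concern raised earlier in the thread.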


--
error compiling committee.c: too many arguments to function



Re: [KVM PATCH v3 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-30 Thread Michael S. Tsirkin
On Mon, Apr 27, 2009 at 02:33:34PM -0400, Gregory Haskins wrote:
 This allows an eventfd to be registered as an irq source with a guest.  Any
 signaling operation on the eventfd (via userspace or kernel) will inject
 the registered GSI at the next available window.
 
 Signed-off-by: Gregory Haskins ghask...@novell.com

If we ever want to use this with e.g. MSI-X emulation in guest, and want
to be strictly compliant with MSI-X, we'll need a way for guest to mask
interrupts, and for host to report that a masked interrupt is pending.
Ideally, all this will be doable with a couple of mmapped pages to avoid
vmexits/system calls.

 +static void
 +irqfd_inject(struct work_struct *work)
 +{
 + struct _irqfd *irqfd = container_of(work, struct _irqfd, work);
 + struct kvm *kvm = irqfd->kvm;
 +
 + mutex_lock(&kvm->lock);
 + kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1);
 + kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 0);
 + mutex_unlock(&kvm->lock);

This will do weird stuff (deliver the irq twice) if the irq is
MSI/MSI-X. I know this was discussed already and is a temporary
shortcut, but maybe add a comment that we really want kvm_toggle_irq,
so that we won't forget?

 +}
 +

-- 
MST


Re: [PATCH 11/21] Remove unused variables in vga.c

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:

@@ -1585,12 +1585,11 @@ static void vga_sync_dirty_bitmap(VGAState *s)
  */
 static void vga_draw_graphic(VGAState *s, int full_update)
 {
-int y1, y, update, linesize, y_start, double_scan, mask, depth;
-int width, height, shift_control, line_offset, bwidth, bits;
+int y1, y, update, page_min, page_max, linesize, y_start, double_scan, mask, depth;
+int width, height, shift_control, line_offset, page0, page1, bwidth, bits;

 int disp_width, multi_scan, multi_run;
 uint8_t *d;
 uint32_t v, addr1, addr;
-long page0, page1, page_min, page_max;
 vga_draw_line_func *vga_draw_line;
 


This introduces a regression with 4GB guests.  I resolved this by 
posting a patch to qemu; see 12c7e75a7c.  Are you using an outdated 
checkout?


Oh, I understand what's happening now.  It took me a while to see that 
we're changing the type of variables from int to long.
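
Why the declared type matters for large guests can be illustrated with ctypes, which wraps values to the declared width without overflow checking. This is a sketch under assumptions: the offset below is a hypothetical page address just under 4GB, and it presumes a 32-bit int and a 64-bit long as on typical 64-bit Linux hosts. The exact failure addressed by the referenced commit is not shown in this thread.

```python
import ctypes

FOUR_GB = 4 << 30
offset = FOUR_GB - 4096  # hypothetical page address just below 4GB

# A 32-bit signed int cannot represent the value; it wraps to a negative number.
as_int32 = ctypes.c_int32(offset).value
# A 64-bit integer (long on LP64 hosts) holds it unchanged.
as_int64 = ctypes.c_int64(offset).value

print("as 32-bit int:", as_int32)
print("as 64-bit long:", as_int64)
```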


--
Regards,

Anthony Liguori



Re: [PATCH 14/21] Remove -cpu-vendor-string

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:

Anthony Liguori wrote:
This isn't in upstream QEMU and is of little utility to KVM.  It's unlikely
to appear in upstream QEMU either.
  


Since we allow overriding cpuid flags, why not the vendor string?  
It's necessary for cpu passthrough.


But we don't allow explicit override of cpuid flags today.  We support 
choosing CPU models which include vendor id and cpuid flags.


Introducing a host CPU model would be acceptable and would more 
accurately achieve cpu passthrough.


--
Regards,

Anthony Liguori



Re: [PATCH 17/21] Remove #define __user in usb-linux.c

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:

Anthony Liguori wrote:

This has been consistently nacked in upstream QEMU.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 usb-linux.c |4 
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/usb-linux.c b/usb-linux.c
index 26643bd..70d7a1c 100644
--- a/usb-linux.c
+++ b/usb-linux.c
@@ -34,10 +34,6 @@
 #include "qemu-timer.h"
 #include "monitor.h"
 
-#if defined(__linux__)

-#define __user
-#endif
-
  


This will introduce a regression into qemu-kvm.git.


It won't because -D__user is in CFLAGS.

--
Regards,

Anthony Liguori



Re: [PATCH 14/21] Remove -cpu-vendor-string

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:
This isn't in upstream QEMU and is of little utility to KVM.  It's unlikely
to appear in upstream QEMU either.
  


Since we allow overriding cpuid flags, why not the vendor string?  
It's necessary for cpu passthrough.


But we don't allow explicit override of cpuid flags today.  We support 
choosing CPU models which include vendor id and cpuid flags.


I think we allow -cpu qemu64,-nx for example.



Introducing a host CPU model would be acceptable and would more 
accurately achieve cpu passthrough.




I agree that -cpu host[,modifiers] is desirable.  But I don't see why 
we shouldn't support fine-grained control.


It's probably better done through a -cpu blah,-nx,vendorid=foobar 
rather than a separate option.
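
To make the proposed syntax concrete, here is a small Python sketch that splits an option string of the form model,+flag,-flag,key=value. The function name and the returned dictionary layout are hypothetical; this is not QEMU's parser, and the set of keys QEMU actually accepts is not implied.

```python
def parse_cpu_option(option):
    """Split a '-cpu' style argument into model, flag changes and properties."""
    model, *modifiers = option.split(",")
    enabled, disabled, properties = [], [], {}
    for item in modifiers:
        if "=" in item:
            # key=value modifiers, e.g. a vendor id override
            key, value = item.split("=", 1)
            properties[key] = value
        elif item.startswith("-"):
            disabled.append(item[1:])
        elif item.startswith("+"):
            enabled.append(item[1:])
        else:
            raise ValueError("unrecognized modifier: %r" % item)
    return {
        "model": model,
        "enabled": enabled,
        "disabled": disabled,
        "properties": properties,
    }


print(parse_cpu_option("blah,-nx,vendorid=foobar"))
```

Folding the vendor string into the same comma-separated list means one option carries the whole CPU description, which is the argument made above against a separate command-line switch.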


--
error compiling committee.c: too many arguments to function



Re: [PATCH 17/21] Remove #define __user in usb-linux.c

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

Avi Kivity wrote:

Anthony Liguori wrote:

This has been consistently nacked in upstream QEMU.

-#if defined(__linux__)
-#define __user
-#endif
-
  


This will introduce a regression into qemu-kvm.git.


It won't because -D__user is in CFLAGS.



Ah, ok, will apply.  But that's not in upstream either.

--
error compiling committee.c: too many arguments to function



Re: [PATCH 18/21] Remove host_alarm_timer hacks.

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:

Anthony Liguori wrote:

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 vl.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index 3b0e3dc..848a8f8 100644
--- a/vl.c
+++ b/vl.c
@@ -1367,8 +1367,7 @@ static void host_alarm_handler(int host_signum)
 last_clock = ti;
 }
 #endif
-if (1 ||
-alarm_has_dynticks(alarm_timer) ||
+if (alarm_has_dynticks(alarm_timer) ||
     (!use_icount &&
 qemu_timer_expired(active_timers[QEMU_TIMER_VIRTUAL],
qemu_get_clock(vm_clock))) ||
  


This was added to fix a problem.  Have you tested it?


Do you know what problem it fixes?

This goes back a very long time.  IIUC, this was added prior to the IO 
thread as an optimization.  This ensures that any time there's a 
timer, the vcpu is interrupted to allow IO to run.  With non-dynticks, 
there can be spurious timer signals because we program the timer with a 
fixed frequency.  It's necessary to take this path with dynticks because 
we need to rearm the timer which happens in the IO path.  It's not 
necessary to take this path with a non-dynticks timer unless there's 
been an expiration.


In modern KVM, the IO thread is capable of interrupting the CPU whenever 
it needs to process IO.  Therefore this problem no longer exists.
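
The effect of the removed "1 ||" can be shown with a reduced form of the condition. The Python sketch below keeps only two of the inputs visible in the diff (dynticks and timer expiry); the function names are hypothetical and the remaining terms of the real condition are omitted.

```python
def should_interrupt_vcpu_old(has_dynticks, timer_expired):
    # With the leading "1 ||", the rest of the test is never consulted.
    return True or has_dynticks or timer_expired


def should_interrupt_vcpu_new(has_dynticks, timer_expired):
    # After the patch, a non-dynticks timer only interrupts on expiry.
    return has_dynticks or timer_expired


for has_dynticks in (False, True):
    for timer_expired in (False, True):
        print(
            has_dynticks,
            timer_expired,
            should_interrupt_vcpu_old(has_dynticks, timer_expired),
            should_interrupt_vcpu_new(has_dynticks, timer_expired),
        )
```

The two versions differ only for a non-dynticks timer with nothing expired, which is the spurious-signal case described above.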


Regards,

Anthony Liguori




Re: [PATCH 17/21] Remove #define __user in usb-linux.c

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:

Ah, ok, will apply.  But that's not in upstream either.


Nope, but one step at a time.

--
Regards,

Anthony Liguori



Re: [PATCH 18/21] Remove host_alarm_timer hacks.

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

Avi Kivity wrote:

Anthony Liguori wrote:

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 vl.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index 3b0e3dc..848a8f8 100644
--- a/vl.c
+++ b/vl.c
@@ -1367,8 +1367,7 @@ static void host_alarm_handler(int host_signum)
 last_clock = ti;
 }
 #endif
-if (1 ||
-alarm_has_dynticks(alarm_timer) ||
+if (alarm_has_dynticks(alarm_timer) ||
     (!use_icount &&
 qemu_timer_expired(active_timers[QEMU_TIMER_VIRTUAL],
qemu_get_clock(vm_clock))) ||
  


This was added to fix a problem.  Have you tested it?


Do you know what problem it fixes?

This goes back a very long time.  IIUC, this was added prior to the IO 
thread as an optimization.  This ensures that any time there's a 
timer, the vcpu is interrupted to allow IO to run.  With non-dynticks, 
there can be spurious timer signals because we program the timer with 
a fixed frequency.  It's necessary to take this path with dynticks 
because we need to rearm the timer which happens in the IO path.  It's 
not necessary to take this path with a non-dynticks timer unless 
there's been an expiration.


In modern KVM, the IO thread is capable of interrupting the CPU 
whenever it needs to process IO.  Therefore this problem no longer 
exists.




It would still be good to verify that the problem no longer exists.  
This is not a cosmetic change; some testing is needed to verify it 
doesn't introduce new latencies.


--
error compiling committee.c: too many arguments to function



Re: [PATCH 14/21] Remove -cpu-vendor-string

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:

Anthony Liguori wrote:
This isn't in upstream QEMU and is of little utility to KVM.  It's unlikely
to appear in upstream QEMU either.
  


Since we allow overriding cpuid flags, why not the vendor string?  
It's necessary for cpu passthrough.


But we don't allow explicit override of cpuid flags today.  We 
support choosing CPU models which include vendor id and cpuid flags.


I think we allow -cpu qemu64,-nx for example.


Funny enough, -cpu qemu64,vendor=AuthenticAMD already works today.  So 
yeah, there's no reason to carry -cpu-vendor-string anymore.


--
Regards,

Anthony Liguori



Re: [PATCH 03/21] Remove use of signalfd in block-raw-posix.c

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:

Anthony Liguori wrote:

Avi Kivity wrote:

Anthony Liguori wrote:
We don't use signalfd in upstream QEMU.  Instead, we always emulate 
it.
  


With an extra thread - so an extra context switch.


We don't use an extra thread.  We just install a signal handler that 
writes to a pipe.  At best, the added overhead is that we get EINTRs 
more often but this is something we already handle.


Oh okay.  But signal delivery is slow; for example the FPU needs to be 
reset.


Is it really justified to add all of this extra code (including signalfd 
emulation) for something that probably isn't even measurable?


I like using whiz-bang features of Linux as much as the next guy, but I 
think we're stretching to justify it here :-)


--
Regards,

Anthony Liguori



Re: [PATCH 0/2] Make savevm versioning compatible with upstream QEMU

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

Right now, there is no way savevm versioning can be compatible with upstream
QEMU because KVM adds fields to existing savevm structures without incrementing
the versions.

If you assume that KVM will eventually merge into upstream QEMU, this means that
eventually KVM is going to have to break backwards compatibility with itself
to resolve this issue in a non-graceful way.

So let's do that now instead of doing it later when the situation is only worse.

I'm happy to allocate particular version identifiers for KVM to avoid future
conflicts.  I believe we should try to eliminate the existing differences so
that we can converge in the future on a common versioning scheme.
  


Applied both, thanks.

I think we can avoid the need to synchronize too much by saving 
kvm-specific state for device x using id x-kvm; this allows the two 
to evolve independently.  Of course it's much better to avoid divergence 
in the first place, but this isn't always possible.
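
The separate-section idea can be sketched as follows in Python. Everything here is illustrative: the function names, the dictionary layout and the field names (status, vnet_hdr) are hypothetical, and the real savevm format is binary and versioned per section.

```python
def save_sections(device, common_state, kvm_state=None):
    """Save upstream state under `device` and KVM extras under `device`-kvm."""
    sections = {device: dict(common_state)}
    if kvm_state:
        sections[device + "-kvm"] = dict(kvm_state)
    return sections


def load_sections(sections, device):
    """Load the upstream section, then overlay the KVM section if present."""
    state = dict(sections[device])
    state.update(sections.get(device + "-kvm", {}))
    return state


saved = save_sections("virtio-net", {"status": 1}, {"vnet_hdr": True})
print(sorted(saved))
print(load_sections(saved, "virtio-net"))
```

Because the upstream section is untouched, its version number can advance on its own schedule, and a reader that does not know about the -kvm section still sees a layout it understands.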


--
error compiling committee.c: too many arguments to function



Re: [PATCH 14/21] Remove -cpu-vendor-string

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:
This isn't in upstream QEMU and is of little utility to KVM.  It's unlikely
to appear in upstream QEMU either.
  


Since we allow overriding cpuid flags, why not the vendor string?  
It's necessary for cpu passthrough.


But we don't allow explicit override of cpuid flags today.  We 
support choosing CPU models which include vendor id and cpuid flags.


I think we allow -cpu qemu64,-nx for example.


Funny enough, -cpu qemu64,vendor=AuthenticAMD already works today.  
So yeah, there's no reason to carry -cpu-vendor-string anymore.




Applied, but had to reverse the sense of the commit log :)

--
error compiling committee.c: too many arguments to function



Re: [PATCH 0/2] Make savevm versioning compatible with upstream QEMU

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:

Anthony Liguori wrote:
Right now, there is no way savevm versioning can be compatible with upstream
QEMU because KVM adds fields to existing savevm structures without incrementing
the versions.

If you assume that KVM will eventually merge into upstream QEMU, this means that
eventually KVM is going to have to break backwards compatibility with itself
to resolve this issue in a non-graceful way.

So let's do that now instead of doing it later when the situation is only worse.

I'm happy to allocate particular version identifiers for KVM to avoid future
conflicts.  I believe we should try to eliminate the existing differences so
that we can converge in the future on a common versioning scheme.
  


Applied both, thanks.

I think we can avoid the need to synchronize too much by saving 
kvm-specific state for device x using id x-kvm; this allows the 
two to evolve independently.


I need to add save/restore support to upstream QEMU so this is a good 
excuse to just merge the changes in KVM upstream.  So hopefully this 
will become a non issue.  If something arises and you need more savevm 
state, introduce a new section suffixed or prefixed with kvm.  
Alternatively, ask and I can reserve an ID upstream.
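
To illustrate the suffixed-section idea, here is a minimal sketch in Python
(not QEMU code).  The registry, the function names and the version numbers
are all made up for illustration; the point is only that "x" and "x-kvm"
carry independent version counters.

```python
# Hypothetical registry: section id -> (version, save function).
sections = {}

def register_savevm(section_id, version, save_fn):
    """Register a savevm section under its own id and version."""
    if section_id in sections:
        raise ValueError("duplicate savevm section: " + section_id)
    sections[section_id] = (version, save_fn)

def save_all(state):
    """Emit every section as (id, version, payload), ordered by id."""
    return [(sid, ver, fn(state)) for sid, (ver, fn) in sorted(sections.items())]

# Upstream state for device "x" and KVM-only state for the same device.
# Bumping one version does not affect the other.
register_savevm("x", 3, lambda s: s["common"])
register_savevm("x-kvm", 1, lambda s: s["kvm_only"])
```

With this layout, a loader that does not know the "x-kvm" id can still
reject or skip that section explicitly without misparsing section "x".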


For virtio-net, we just need to get the vnet stuff merged upstream.

--
Regards,

Anthony Liguori



Re: [PATCH 03/21] Remove use of signalfd in block-raw-posix.c

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:


Oh okay.  But signal delivery is slow; for example the FPU needs to 
be reset.


Is it really justified to add all of this extra code (including 
signalfd emulation) for something that probably isn't even measurable?


We don't have to add signalfd emulation; we can simply use signal+pipe 
in that case.
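
For reference, signal+pipe here is the usual self-pipe trick: the signal
handler writes a byte to a pipe, and the main loop selects on the read end
like any other file descriptor.  A minimal sketch in Python, not the QEMU
code; make_signal_pipe and wait_for_signal are names invented for this
example.

```python
import os
import select
import signal

def make_signal_pipe(signum):
    """Install a handler for signum that forwards it into a pipe."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)

    def handler(sig, frame):
        try:
            os.write(write_fd, bytes([sig]))
        except BlockingIOError:
            pass  # pipe is full, so a wakeup is already pending

    signal.signal(signum, handler)
    return read_fd

def wait_for_signal(read_fd, timeout):
    """Wait up to timeout seconds; return the signal number or None."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    if not ready:
        return None
    return os.read(read_fd, 1)[0]
```

Unlike signalfd, the signal is still delivered through a handler, so this
keeps the delivery cost discussed above; it only lets the main loop treat
the event as a readable fd.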


We won't know if it's measurable or not until we measure it (or not).



I like using whiz-bang features of Linux as much as the next guy, but I 
think we're stretching to justify it here :-)




I think it's worth it in this case.  It will become more important in 
time, too.


--
error compiling committee.c: too many arguments to function



Re: [PATCH 18/21] Remove host_alarm_timer hacks.

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:

Anthony Liguori wrote:
In modern KVM, the IO thread is capable of interrupting the CPU 
whenever it needs to process IO.  Therefore this problem no longer 
exists.




It would still be good to verify that the problem no longer exists.  
This is not a cosmetic change; some testing is needed to verify it 
doesn't introduce new latencies.




N.B. dynticks is the preferred timer in QEMU on Linux.  To even hit this 
code path, you'd have to use an explicit -clock hpet or -clock rtc.  I 
don't have an hpet on my laptop and -clock rtc boots just as fast as it 
did before.


Do we really care about optimizing latency with -clock rtc though?

--
Regards,

Anthony Liguori



Re: [kvm] [PATCH 1/2] Increment virtio-net savevm version to avoid conflict with upstream QEMU.

2009-04-30 Thread Anthony Liguori

Alex Williamson wrote:
On Wed, 2009-04-29 at 15:53 -0500, Anthony Liguori wrote: 
  

-#define VIRTIO_NET_VM_VERSION    6
+/* Version 7 has TAP_VNET_HDR support.  This is reserved in upstream QEMU to
+ * avoid future conflict.
+ * We can't assume versions > 7 have TAP_VNET_HDR support until this is merged
+ * in upstream QEMU.
+ */
+#define VIRTIO_NET_VM_VERSION    7



It seems like you're saying you're only going to reserve version number
7, and not the 4 bytes of savevm we're using for version 7 here.
Couldn't we fix this by adding a dummy patch to qemu to bump to version
7, and push/pop a 4 byte zero from the savevm?  Then we could change the
code below to >= 7.  Qemu should probably puke on a savevm image with
non-zero in this location until the kvm code gets merged.  Looks like
one byte would be more than sufficient if we wanted to make that change
now too.  Thanks,
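
A rough sketch of the load side of that suggestion, in Python rather than
QEMU's C.  The function name, the big-endian 32-bit layout and the error
text are assumptions made for this example, not the actual virtio-net
savevm format.

```python
import struct

def load_reserved_field(version_id, payload):
    """Parse the hypothetical trailing field of a virtio-net savevm record.

    Returns None for older formats that lack the field, 0 for a version 7
    record with the placeholder, and raises if the placeholder is non-zero.
    """
    if version_id < 7:
        return None  # older format: the field is absent
    (reserved,) = struct.unpack(">I", payload[:4])
    if reserved != 0:
        # Upstream without vnet support cannot honour a non-zero value.
        raise ValueError("reserved field is set; vnet header state unsupported")
    return reserved
```

The save side would write four zero bytes in the same position, so both
trees agree on the record size before the vnet code is merged.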
  


I'd rather just merge vnet into upstream QEMU as quickly as possible.  
All I have to do to reserve a field is just hope no one submits a patch 
incrementing version id until we submit vnet support :-)


--
Regards,

Anthony Liguori



Re: [PATCH] Fix build when objdir != srcdir

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

This requires adding the necessary bits to configure to create the directories
and symlinks for libkvm.  It also requires sticking KVM_CFLAGS in
config-host.mak to ensure that it gets the right set of includes for the
kernel headers.
  


Applied, thanks.

--
error compiling committee.c: too many arguments to function



Re: KVM performance vs. Xen

2009-04-30 Thread Andrew Theurer

Avi Kivity wrote:

Andrew Theurer wrote:

Avi Kivity wrote:




What's the typical I/O load (disk and network bandwidth) while the 
tests are running?

This is average throughput:
network: Tx: 79 MB/sec  Rx: 5 MB/sec


MB as in Byte or Mb as in bit?

Byte.  There are 4 x 1 Gb adapters, each handling about 20 MB/sec or 160 
Mbit/sec.



disk: read: 17 MB/sec  write: 40 MB/sec


This could definitely cause the extra load, especially if it's many 
small requests (compared to a few large ones).
I don't have the request sizes at my fingertips, but we have to use a 
lot of disks to support this I/O, so I think it's safe to assume there 
are a lot more requests than a simple large sequential read/write.



The host hardware:
A 2 socket, 8 core Nehalem with SMT, and EPT enabled, lots of disks,
4 x 1 Gb Ethernet


CPU time measurements with SMT can vary wildly if the system is not 
fully loaded.  If the scheduler happens to schedule two threads on a 
single core, both of these threads will generate less work compared 
to if they were scheduled on different cores.
Understood.  Even if at low loads, the scheduler does the right thing 
and spreads out to all the cores first, once it goes beyond 50% util, 
the CPU util can climb at a much higher rate (compared to a linear 
increase in work) because it then starts scheduling 2 threads per 
core, and each thread can do less work.  I have always wanted 
something which could more accurately show the utilization of a 
processor core, but I guess we have to use what we have today.  I 
will run again with SMT off to see what we get.


On the other hand, without SMT you will get to overcommit much faster, 
so you'll have scheduling artifacts.  Unfortunately there's no good 
answer here (except to improve the SMT scheduler).


Yes, it is.  If there is a lot of I/O, this might be due to the 
thread pool used for I/O.
I have an older patch which makes a small change to posix_aio_thread.c 
by trying to keep the thread pool size a bit lower than it is today.  
I will dust that off and see if it helps.


Really, I think linux-aio support can help here.
Yes, I think that would work for real block devices, but would that help 
for files?  I am using real block devices right now, but it would be 
nice to also see a benefit for files in a file-system.  Or maybe I am 
mis-understanding this, and linux-aio can be used on files?


-Andrew





Yes, there is a scheduler tracer, though I have no idea how to 
operate it.


Do you have kvm_stat logs?
Sorry, I don't, but I'll run that next time.  BTW, I did not notice a 
batch/log mode the last time I ran kvm_stat.  Or maybe it was not 
obvious to me.  Is there an ideal way to run kvm_stat without 
curses-like output?


You're probably using an ancient version:

$ kvm_stat --help
Usage: kvm_stat [options]

Options:
  -h, --help            show this help message and exit
  -1, --once, --batch   run in batch mode for one second
  -l, --log             run in logging mode (like vmstat)
  -f FIELDS, --fields=FIELDS
                        fields to display (regex)







Re: KVM performance vs. Xen

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:


1) I'm seeing about 2.3% in scheduler functions [that I recognize].
Does that seem a bit excessive?


Yes, it is.  If there is a lot of I/O, this might be due to the thread 
pool used for I/O.


This is why I wrote the linux-aio patch.  It only reduced CPU 
consumption by about 2% although I'm not sure if that's absolute or 
relative.  Andrew?



2) cpu_physical_memory_rw due to not using preadv/pwritev?


I think both virtio-net and virtio-blk use memcpy().


With latest linux-2.6, and a development snapshot of glibc, virtio-blk 
will not use memcpy() anymore but virtio-net still does on the receive 
path (but not transmit).


Regards,

Anthony Liguori


Re: KVM performance vs. Xen

2009-04-30 Thread Anthony Liguori

Andrew Theurer wrote:


Really, I think linux-aio support can help here.
Yes, I think that would work for real block devices, but would that 
help for files?  I am using real block devices right now, but it would 
be nice to also see a benefit for files in a file-system.  Or maybe I 
am mis-understanding this, and linux-aio can be used on files?


For cache=off, with some file systems, yes.  But not for 
cache=writethrough/writeback.


Regards,

Anthony Liguori


RE: [PATCH] KVM: VMX: Disable VMX when system shutdown

2009-04-30 Thread Cihula, Joseph
 From: Avi Kivity [mailto:a...@redhat.com]
 Sent: Thursday, April 30, 2009 5:31 AM

 Sheng Yang wrote:
  Intel TXT (Trusted Execution Technology) requires VMX to be off on all CPUs
  for system shutdown to work.
 

 Applied, thanks.

 Is this needed for 2.6.30 and -stable?  That is, is the code that
 enables TXT in 2.6.30 and below or in the BIOS?  Or is it new code not
 yet merged?

The TXT code will not get merged in 2.6.30, though it will hopefully make it 
soon thereafter.  So it would be fine to put it in 2.6.31.

Joe


Re: KVM performance vs. Xen

2009-04-30 Thread Avi Kivity

Andrew Theurer wrote:





disk: read: 17 MB/sec  write: 40 MB/sec


This could definitely cause the extra load, especially if it's many 
small requests (compared to a few large ones).
I don't have the request sizes at my fingertips, but we have to use a 
lot of disks to support this I/O, so I think it's safe to assume there 
are a lot more requests than a simple large sequential read/write.


Yes.  Well, the high context switch rate is the scheduler's way of 
telling us to use linux-aio.  If "lots of disks" == 100, with a 3ms 
seek time, that's already 60,000 cs/sec.
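
A back-of-the-envelope version of that estimate, as a Python sketch.  The
factor of two context switches per request (one to hand the request to a
worker thread, one to come back on completion) is an assumption added
here, not something stated in the thread.

```python
def context_switches_per_sec(disks, seek_time_s, switches_per_request):
    """Estimate context switches when every disk is kept busy seeking."""
    requests_per_sec = disks / seek_time_s  # each disk completes 1/seek_time requests per second
    return requests_per_sec * switches_per_request

# 100 disks at 3 ms per seek is about 33,000 requests/sec; with two
# switches per request that is roughly 67,000 cs/sec, the same order of
# magnitude as the 60,000 quoted above.
estimate = context_switches_per_sec(100, 0.003, 2)
```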



Really, I think linux-aio support can help here.
Yes, I think that would work for real block devices, but would that 
help for files?  I am using real block devices right now, but it would 
be nice to also see a benefit for files in a file-system.  Or maybe I 
am mis-understanding this, and linux-aio can be used on files?


It could work with files with cache=none (though not qcow2 as now written).

--
error compiling committee.c: too many arguments to function



Re: KVM performance vs. Xen

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:



2) cpu_physical_memory_rw due to not using preadv/pwritev?


I think both virtio-net and virtio-blk use memcpy().


With latest linux-2.6, and a development snapshot of glibc, virtio-blk 
will not use memcpy() anymore but virtio-net still does on the receive 
path (but not transmit).


There's still the kernel/user copy, so we have two copies on rx, one on tx.

--
error compiling committee.c: too many arguments to function



Re: [PATCH] Fix build when objdir != srcdir

2009-04-30 Thread Avi Kivity

Avi Kivity wrote:

Anthony Liguori wrote:
This requires adding the necessary bits to configure to create the directories
and symlinks for libkvm.  It also requires sticking KVM_CFLAGS in
config-host.mak to ensure that it gets the right set of includes for the
kernel headers.
  


Applied, thanks.



Unapplied, as it breaks ordinary ./configure && make.

--
error compiling committee.c: too many arguments to function



Re: [KVM-AUTOTEST] [PATCH] support for remote migration

2009-04-30 Thread David Huff

yogi wrote:

Hello everyone,

I would like to submit a patch to add support for remote migration in
kvm-autotest.



Thanks for the patch.  Uri is out on vacation for a while.  I'll apply the 
patch to my test repo and do some validation testing; however, it may be a 
little while until it makes it in.


-D


Re: KVM performance vs. Xen

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

Avi Kivity wrote:


1) I'm seeing about 2.3% in scheduler functions [that I recognize].
Does that seem a bit excessive?


Yes, it is.  If there is a lot of I/O, this might be due to the 
thread pool used for I/O.


This is why I wrote the linux-aio patch.  It only reduced CPU 
consumption by about 2% although I'm not sure if that's absolute or 
relative.  Andrew?


Was that before or after the entire path was made copyless?

--
error compiling committee.c: too many arguments to function



Re: KVM performance vs. Xen

2009-04-30 Thread Andrew Theurer

Avi Kivity wrote:

Anthony Liguori wrote:

Avi Kivity wrote:


1) I'm seeing about 2.3% in scheduler functions [that I recognize].
Does that seem a bit excessive?


Yes, it is.  If there is a lot of I/O, this might be due to the 
thread pool used for I/O.


This is why I wrote the linux-aio patch.  It only reduced CPU 
consumption by about 2% although I'm not sure if that's absolute or 
relative.  Andrew?
If I recall correctly, it was 2.4% and relative.  But with 2.3% in 
scheduler functions, that's what I expected.


Was that before or after the entire path was made copyless?
If this is referring to the preadv/pwritev support, no, I have not tested 
with that.


-Andrew




Re: [PATCH 03/04] qemu-kvm: Remove the dependency for phys_ram_base for ipf.c

2009-04-30 Thread Jes Sorensen

Avi Kivity wrote:

Jes Sorensen wrote:
I pushed my queue into a branch (named 'queue').  Will merge once I 
resolve the regressions here.




Hi Avi,

I don't see that branch - it's in the qemu-kvm repo?

Cheers,
Jes

[...@leavenworth qemu-kvm]$ git branch -a
* master
  origin/HEAD
  origin/bios-merge
  origin/bios-patchqueue
  origin/bochs-bios-cvs
  origin/bochs-bios-vendor-drops
  origin/build
  origin/for-glommer
  origin/ia64-vtd
  origin/irq-routing-2
  origin/kvm-updates-2.6.25
  origin/kvm-updates-2.6.26
  origin/kvm-updates-2.6.27
  origin/kvm-updates/2.6.26
  origin/kvm-updates/2.6.27
  origin/kvm-updates/2.6.28
  origin/kvm-updates/2.6.29
  origin/kvm-updates/2.6.30
  origin/maint/2.6.25
  origin/maint/2.6.26
  origin/maint/2.6.26-test
  origin/maint/2.6.28
  origin/maint/2.6.29
  origin/maint/2.6.30
  origin/master
  origin/merge-tmp
  origin/origin
  origin/pending
  origin/qemu-cvs
  origin/qemu-vendor-drops
  origin/realmode
  origin/release
[...@leavenworth qemu-kvm]$


Re: [KVM-AUTOTEST] [PATCH] support for remote migration

2009-04-30 Thread Michael Goldish

- yogi anant...@linux.vnet.ibm.com wrote:

 Hello everyone,
 
 I would like to submit a patch to add support for remote migration in
 kvm-autotest.

Sounds like a good idea.

Also, the patch isn't too big, which I personally appreciate very much (makes 
it easier to read).

 To use this patch the following five parameters should be added to the
 existing migration test:
 
 remote = dst
 hostip = localhost ip or name
 remoteip = remote host ip or name
 remuser = root
 rempassword = password
 
 The field remote=dst indicates that the VM dst should be created on the
 remote machine.
 For example:
 - migrate:      install setup
     type = migration
     vms += " dst"
     migration_test_command = help
     kill_vm_on_error = yes
     remote = dst
     hostip = 192.168.1.2
     remoteip = 192.168.1.3
     remuser = root
     rempassword = 123456
     variants:
 
 Three files are being modified in this patch: kvm_utils.py, kvm_tests.py
 and kvm_vm.py.
 kvm_utils.py - if the ssh keys have been exchanged between the test
 machines, then remote login fails with the message "Got unexpected login
 prompt"; to prevent this, I have made it return a session rather than
 None.
 
 kvm_tests.py - the host address used in migration is made dynamic.
 
 kvm_vm.py - I have replaced unix sockets with tcp sockets for the monitor,
 in both remote and local VMs. Added two new variables (remote, ssh_port) to
 class VM: remote is set to True if the VM is on a remote machine, ssh_port
 contains the redirection port, and the function get_address() returns the ip
 of the host where the VM is (local or remote).

I've only looked at the code briefly, and it looks very good overall, but I 
have a few comments/questions:

Regarding remote_login:

- Why should remote_login return a session when it gets an unexpected login 
prompt? If you get a login prompt doesn't that mean something went wrong? The 
username is always provided in the ssh command line, so we shouldn't expect to 
receive a login prompt -- or am I missing something? I am pretty confident this 
is true in the general case, but maybe it's different when ssh keys have been 
exchanged between the hosts.

- I think it makes little sense to return a session object when you see a login 
prompt because that session will be useless. You can't send any commands to it 
because you don't have a shell prompt yet. Any command you send will be 
interpreted as a username, and will most likely be the wrong username.

- When a guest is in the process of booting and we try to log into it, 
remote_login sometimes fails because it gets an unexpected login prompt. This 
is good, as far as I understand, because it means the guest isn't ready yet 
(still booting). The next time remote_login attempts to log in, it usually 
succeeds. If we consider an unexpected login prompt OK, we pass login attempts 
that actually should have failed (and the resulting sessions will be useless 
anyway).

Other things:

- If I understand correctly, remote migration will only work if the remote qemu 
binary path is exactly the same as the local one. Maybe we should receive a 
qemu path parameter that will allow for some flexibility.

- In VM.make_qemu_command(), in the code that handles redirections, you add 
'self.ssh_port = host_port'. I don't think this is correct because there can be 
multiple redirections, unrelated to SSH, so you certainly shouldn't assume that 
the only redirection is an SSH one. When you want the host port redirected to 
the guest's SSH port, you should use 
self.get_port(int(self.params.get("ssh_port"))). This will also work if for 
some reason 'ssh_port' changes while the guest is alive.

- It seems that the purpose of 'remote = dst' is to indicate to 'dst' that it 
should be started as a remote VM. The preferred way to do this is to pass 
something like 'remote_dst = yes' and then in VM.create() you can test for 
params.get("remote") == "yes". See "Addressing objects" in the wiki 
(http://www.linux-kvm.org/page/KVM-Autotest/Parameters#Addressing_objects_.28VMs.2C_images.2C_NICs_etc.29).
In general, any parameter you want to pass to a specific VM, you pass using 
param_vmname = value, e.g. 'mem_dst = 128', and then in VM.create() the 
parameter is accessible without the VM name extension (e.g. 
self.params.get("mem") will equal 128).
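
As an illustration of that addressing convention, here is a small
standalone sketch.  params_for_object is a hypothetical helper written
for this message, not the function kvm-autotest actually uses, and the
values are plain strings as they would be after parsing a config file.

```python
def params_for_object(params, name):
    """Return a copy of params in which 'key_<name>' overrides 'key'.

    Keys suffixed for this object lose the suffix; keys addressed to
    other objects are left untouched.
    """
    suffix = "_" + name
    result = {k: v for k, v in params.items() if not k.endswith(suffix)}
    for key, value in params.items():
        if key.endswith(suffix):
            result[key[:-len(suffix)]] = value
    return result

params = {"mem": "512", "mem_dst": "128", "remote_dst": "yes"}
dst_params = params_for_object(params, "dst")
# dst sees mem == "128" and remote == "yes"; another VM still sees mem == "512".
```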

Thanks,
Michael


[PATCH] Fix build when objdir != srcdir (v2)

2009-04-30 Thread Anthony Liguori
This requires adding the necessary bits to configure to create the directories
and symlinks for libkvm.  It also requires sticking KVM_CFLAGS in
config-host.mak to ensure that it gets the right set of includes for the
kernel headers.

v1 => v2
  Fix build when objdir == srcdir

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 configure           |   10 ++++++++--
 kvm/libkvm/Makefile |    4 +++-
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index fc0fb9b..c41e269 100755
--- a/configure
+++ b/configure
@@ -518,7 +518,7 @@ if test $werror = yes ; then
 CFLAGS=$CFLAGS -Werror
 fi
 
-CFLAGS=$CFLAGS -I$(readlink -f kvm/libkvm)
+CFLAGS=$CFLAGS -I$(readlink -f $source_path/kvm/libkvm)
 
 if test $solaris = no ; then
 if ld --version 2>/dev/null | grep "GNU ld" >/dev/null 2>/dev/null ; then
@@ -1785,6 +1785,11 @@ bsd)
 ;;
 esac
 
+# this is a temp hack needed for libkvm
+if test $kvm = yes ; then
+echo "KVM_CFLAGS=$kvm_cflags" >> $config_mak
+fi
+
 tools=
 if test `expr $target_list : .*softmmu.*` != 0 ; then
   tools=qemu-img\$(EXESUF) $tools
@@ -2162,10 +2167,11 @@ done # for target in $targets
 
 # build tree in object directory if source path is different from current one
 if test $source_path_used = yes ; then
-DIRS=tests tests/cris slirp audio
+DIRS=tests tests/cris slirp audio kvm/libkvm
 FILES=Makefile tests/Makefile
 FILES=$FILES tests/cris/Makefile tests/cris/.gdbinit
 FILES=$FILES tests/test-mmap.c
+FILES=$FILES kvm/libkvm/Makefile
 for dir in $DIRS ; do
 mkdir -p $dir
 done
diff --git a/kvm/libkvm/Makefile b/kvm/libkvm/Makefile
index 727ce48..2f2cfa2 100644
--- a/kvm/libkvm/Makefile
+++ b/kvm/libkvm/Makefile
@@ -1,5 +1,5 @@
 include ../../config-host.mak
-include config-$(ARCH).mak
+include $(VPATH)/kvm/libkvm/config-$(ARCH).mak
 
 # libkvm is not -Wredundant-decls friendly yet
 CFLAGS += -Wno-redundant-decls
@@ -18,6 +18,8 @@ LDFLAGS += $(CFLAGS)
 
 CXXFLAGS = $(autodepend-flags)
 
+VPATH:=$(VPATH)/kvm/libkvm
+
 autodepend-flags = -MMD -MF $(dir $*).$(notdir $*).d
 
 
-- 
1.6.0.6



Re: [PATCH] Fix build when objdir != srcdir

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:

Avi Kivity wrote:

Anthony Liguori wrote:
This requires adding the necessary bits to configure to create the directories
and symlinks for libkvm.  It also requires sticking KVM_CFLAGS in
config-host.mak to ensure that it gets the right set of includes for the
kernel headers.
  


Applied, thanks.



Unapplied, as it breaks ordinary ./configure && make.


Doh, sorry.  Sent a new patch fixing this.

--
Regards,

Anthony Liguori



Re: KVM performance vs. Xen

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:

Anthony Liguori wrote:



2) cpu_physical_memory_rw due to not using preadv/pwritev?


I think both virtio-net and virtio-blk use memcpy().


With latest linux-2.6, and a development snapshot of glibc, 
virtio-blk will not use memcpy() anymore but virtio-net still does on 
the receive path (but not transmit).


There's still the kernel/user copy, so we have two copies on rx, one 
on tx.


That won't show up as cpu_physical_memory_rw.  stl_phys/ldl_phys are 
suspect though as they degrade to cpu_physical_memory_rw.


Regards,

Anthony Liguori



Re: KVM performance vs. Xen

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:

Anthony Liguori wrote:

Avi Kivity wrote:


1) I'm seeing about 2.3% in scheduler functions [that I recognize].
Does that seem a bit excessive?


Yes, it is.  If there is a lot of I/O, this might be due to the 
thread pool used for I/O.


This is why I wrote the linux-aio patch.  It only reduced CPU 
consumption by about 2% although I'm not sure if that's absolute or 
relative.  Andrew?


Was that before or after the entire path was made copyless?


Before so it's worth updating and trying again.

Regards,

Anthony Liguori


Re: [PATCH 16/21] Remove clean rule change

2009-04-30 Thread Hollis Blanchard
On Thu, 2009-04-30 at 12:42 +0300, Avi Kivity wrote:
 Anthony Liguori wrote:
  It's not in upstream QEMU so apparently it's not useful.
 
  Signed-off-by: Anthony Liguori aligu...@us.ibm.com
  ---
   pc-bios/Makefile |2 +-
   1 files changed, 1 insertions(+), 1 deletions(-)
 
  diff --git a/pc-bios/Makefile b/pc-bios/Makefile
  index dabeb4c..315288d 100644
  --- a/pc-bios/Makefile
  +++ b/pc-bios/Makefile
  @@ -16,4 +16,4 @@ all: $(TARGETS)
  dtc -I dts -O dtb -o $@ $<
   
   clean:
  -   rm -f $(TARGETS) *.o *~ *.dtb
  +   rm -f $(TARGETS) *.o *~

 
 Hollis?

dtb is the compiled (binary) form of dts (source) device tree files.

Think of it like bios.bin: if make clean doesn't delete bios.bin (and it
looks like it doesn't), neither should it delete *.dtb, and we can drop
the patch.

Acked-by: Hollis Blanchard holl...@us.ibm.com

-- 
Hollis Blanchard
IBM Linux Technology Center



Re: KVM performance vs. Xen

2009-04-30 Thread Anthony Liguori

Andrew Theurer wrote:

Avi Kivity wrote:

Anthony Liguori wrote:

Avi Kivity wrote:


1) I'm seeing about 2.3% in scheduler functions [that I recognize].
Does that seem a bit excessive?


Yes, it is.  If there is a lot of I/O, this might be due to the 
thread pool used for I/O.


This is why I wrote the linux-aio patch.  It only reduced CPU 
consumption by about 2% although I'm not sure if that's absolute or 
relative.  Andrew?
If I recall correctly, it was 2.4% and relative.  But with 2.3% in 
scheduler functions, that's what I expected.


Was that before or after the entire path was made copyless?
If this is referring to the preadv/pwritev support, no, I have not 
tested with that.


Previously, the block API only exposed non-vector interfaces and bounced 
vectored operations to a linear buffer.  That's been eliminated now 
though so we need to update the linux-aio patch to implement a vectored 
backend interface.


However, it is an apples to apples comparison in terms of copying since 
the same is true with the thread pool.  My take away was that the thread 
pool overhead isn't the major source of issues.
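
To make the bounce-versus-vectored distinction above concrete, a minimal
Python sketch (not the QEMU block layer; both function names are invented
for this example).  The first variant copies all segments into one linear
buffer before writing, the second hands the segment list to the kernel
with writev().

```python
import os

def write_bounced(fd, buffers):
    """Copy every segment into one linear buffer, then do a single write."""
    linear = b"".join(buffers)  # the extra user-space copy
    return os.write(fd, linear)

def write_vectored(fd, buffers):
    """Pass the segment list straight to the kernel; no user-space copy."""
    return os.writev(fd, buffers)
```

Both produce the same bytes on the descriptor; only the extra copy
differs.  Real code would also have to handle short writes, which this
sketch ignores.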


Regards,

Anthony Liguori


Re: CPU Limitations

2009-04-30 Thread Cornelius Wefelscheid
Hi,
I tried to get some useful information out of gdb,
but it just gives me this:

warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/libm.so.6...(no debugging symbols
found)...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libz.so.1...(no debugging symbols
found)...done.
Loaded symbols for /lib/libz.so.1
Reading symbols from /usr/lib/libasound.so.2...
(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libasound.so.2
Reading symbols from /usr/lib/libpulse-simple.so.0...(no debugging
symbols found)...done.
Loaded symbols for /usr/lib/libpulse-simple.so.0
Reading symbols from /usr/lib/libgnutls.so.26...
(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libgnutls.so.26
Reading symbols from /lib/libpthread.so.0...(no debugging symbols
found)...done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/librt.so.1...
(no debugging symbols found)...done.
Loaded symbols for /lib/librt.so.1
Reading symbols from /lib/libutil.so.1...(no debugging symbols
found)...done.
Loaded symbols for /lib/libutil.so.1
Reading symbols from /usr/lib/libX11.so.6...
(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libX11.so.6
Reading symbols from /usr/lib/libSDL-1.2.so.0...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/libSDL-1.2.so.0
Reading symbols from /lib/libncurses.so.5...
(no debugging symbols found)...done.
Loaded symbols for /lib/libncurses.so.5
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /usr/lib/libpulse.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libpulse.so.0
Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /usr/lib/libSM.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libSM.so.6
Reading symbols from /usr/lib/libICE.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libICE.so.6
Reading symbols from /lib/libcap.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libcap.so.2
Reading symbols from /usr/lib/libgdbm.so.3...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libgdbm.so.3
Reading symbols from /usr/lib/libtasn1.so.3...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libtasn1.so.3
Reading symbols from /lib/libgcrypt.so.11...(no debugging symbols found)...done.
Loaded symbols for /lib/libgcrypt.so.11
Reading symbols from /lib/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /usr/lib/libxcb.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libxcb.so.1
Reading symbols from /usr/lib/libdirectfb-1.0.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libdirectfb-1.0.so.0
Reading symbols from /usr/lib/libfusion-1.0.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libfusion-1.0.so.0
Reading symbols from /usr/lib/libdirect-1.0.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libdirect-1.0.so.0
Reading symbols from /lib/libuuid.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libuuid.so.1
Reading symbols from /lib/libattr.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libattr.so.1
Reading symbols from /lib/libgpg-error.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/libgpg-error.so.0
Reading symbols from /usr/lib/libXau.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libXau.so.6
Reading symbols from /usr/lib/libXdmcp.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libXdmcp.so.6
Reading symbols from /lib/libnss_files.so.2...
---Type <return> to continue, or q <return> to quit---
(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_files.so.2
(no debugging symbols found)
Core was generated by `kvm -smp 32 /fs2/xen/disk0'.
Program terminated with signal 11, Segmentation fault.
[New process 4665]
[New process 4666]
[New process 4674]
[New process 4670]
[New process 4676]
[New process 4678]
[New process 4669]
[New process 4667]
[New process 4677]
[New process 4686]
[New process 4675]
[New process 4672]
[New process 4679]
[New process 4682]
[New process 4673]
[New process 4681]
[New process 4671]
[New process 4683]
[New process 4689]
[New process 4685]
[New process 4668]
[New process 4690]
[New process 4684]
[New process 4691]
[New process 4687]
[New process 4692]
[New process 4693]
[New process 4694]
[New process 4695]
[New process 4696]
[New process 4680]
[New process 4688]
[New process 4697]
#0  0x004092ba in ?? ()



Do I perhaps need to compile KVM with some special debug flags?
Is there no patch that increases the number of CPUs?

Cheers,
Cornelius



On Tuesday, 28.04.2009, at 11:41 +0300, Avi Kivity wrote:
 

Re: [PATCH 16/21] Remove clean rule change

2009-04-30 Thread Avi Kivity

Hollis Blanchard wrote:

dtb is the compiled (binary) form of dts (source) device tree files.

Think of it like bios.bin: if make clean doesn't delete bios.bin (and it
looks like it doesn't), neither should it delete *.dtb, and we can drop
the patch.

Acked-by: Hollis Blanchard holl...@us.ibm.com
  


make clean doesn't delete bios.bin, because bios.bin is under source 
control (as it requires special tools to build).


I see that *.dtb is also under source control, so will apply the patch.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KVM performance vs. Xen

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:


Previously, the block API only exposed non-vector interfaces and 
bounced vectored operations to a linear buffer.  That's been 
eliminated now though so we need to update the linux-aio patch to 
implement a vectored backend interface.


However, it is an apples-to-apples comparison in terms of copying 
since the same is true with the thread pool.  My takeaway was that 
the thread pool overhead isn't the major source of issues.


If the overhead is dominated by copying, then you won't see the 
difference.  Once the copying is eliminated, the comparison may yield 
different results.  We should certainly see a difference in context 
switches.


One cause of context switches won't be eliminated - the non-saturating 
workload causes us to switch to the idle thread, which incurs a 
heavyweight exit.  This doesn't matter since we're idle anyway, but when 
we switch back, we incur a heavyweight entry.
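
The difference between bouncing a vectored request through a linear buffer and handing the vector straight to the backend, as discussed above, can be sketched roughly as follows. This is only an illustration using the POSIX read() and readv() calls on a plain file descriptor; it is not the QEMU block layer or the linux-aio patch, and the helper names (iov_total, read_bounced, read_vectored) are hypothetical.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: total number of bytes described by an iovec array. */
static size_t iov_total(const struct iovec *iov, int iovcnt)
{
    size_t total = 0;
    for (int i = 0; i < iovcnt; i++)
        total += iov[i].iov_len;
    return total;
}

/* Bounced path: one linear read, then an extra copy into every segment. */
static ssize_t read_bounced(int fd, const struct iovec *iov, int iovcnt)
{
    size_t total = iov_total(iov, iovcnt);
    char *bounce = malloc(total ? total : 1);
    if (!bounce)
        return -1;

    ssize_t n = read(fd, bounce, total);
    size_t off = 0;
    for (int i = 0; n > 0 && i < iovcnt && off < (size_t)n; i++) {
        size_t chunk = iov[i].iov_len;
        if (chunk > (size_t)n - off)
            chunk = (size_t)n - off;
        /* This memcpy is the copying overhead the thread talks about. */
        memcpy(iov[i].iov_base, bounce + off, chunk);
        off += chunk;
    }
    free(bounce);
    return n;
}

/* Vectored path: the kernel scatters the data directly into the segments. */
static ssize_t read_vectored(int fd, const struct iovec *iov, int iovcnt)
{
    return readv(fd, iov, iovcnt);
}
```

Both helpers fill the same segments; the bounced one pays for an allocation and a second pass over the data, which is the cost that would mask any difference between a thread pool and linux-aio.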


--
error compiling committee.c: too many arguments to function



Re: CPU Limitations

2009-04-30 Thread Avi Kivity

Cornelius Wefelscheid wrote:

Hi,
I tried to get some useful information out of gdb,
but it just gives me this:

#0  0x004092ba in ?? ()



Do I perhaps need to compile KVM with some special debug flags?
Is there no patch that increases the number of CPUs?
  


Use 'gdb /path/to/qemu core_file'


--
error compiling committee.c: too many arguments to function



Re: qemu-kvm.git now live

2009-04-30 Thread Jan Kiszka
Avi Kivity wrote:
 Jan Kiszka wrote:
 Avi Kivity wrote:
  
 Where/how does the
 migration code disable dirty logging?
   
 Should be phase 3 of ram_save_live().
 
 But only in qemu-kvm. What is the plan about pushing it upstream? Then
 we could discuss how to extend the existing support best.
 
 Pushing things upstream is quite difficult because of the very different
 infrastructure.
 

 Isn't the midterm goal to get rid of most of these differences (namely
 libkvm)?
   
 
 Yes, but not by removing existing functionality.

No one said this.

 
  
 It's unfortunate that upstream rewrote everything
 instead of changing things incrementally.  Rewrites are almost always a
 mistake since they throw away accumulated knowledge.
 

 I disagree, at least in this particular case. Upstream already diverged
 from qemu-kvm, and the latter provided no comparable alternative for
 slot management and dirty logging. And I still don't see that we lost
 anything that could not easily be re-integrated into upstream (i.e.
 global dirty logging), finally leading to a cleaner and more complete
 result.
   
 
 It could have been done differently, by morphing the existing support
 into something mergeable, and merging that.  In this way, we'd ensure no
 needed functionality is lost.

The existing support lacked features upstream already had and instead
required additional hacks to make qemu-kvm work.

 
 As is, we're adding something simple, then discovering it's
 insufficient.  We're throwing away information, that's not a good way to
 make progress.

I doubt this applies here.

 
 So, what bits are missing to make KVM migration work in upstream?
   
 
 I don't know of anything beyond dirty logging.
 

OK, then I will pick this up and have a look at something comparable to
cpu_physical_memory_set_dirty_tracking() for upstream.
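
For readers unfamiliar with the idea, a global dirty-tracking switch of the kind being discussed might look roughly like the sketch below. It assumes a small, flat guest RAM and a single bitmap with one bit per page; it is not the actual cpu_physical_memory_set_dirty_tracking() from qemu-kvm, and every sketch_* name and constant is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SHIFT 12     /* assumed 4 KiB pages */
#define SKETCH_NUM_PAGES  1024   /* assumed guest RAM size, in pages */

static uint8_t sketch_dirty_bitmap[SKETCH_NUM_PAGES / 8];
static bool sketch_tracking_enabled;

/* Turn global dirty tracking on or off; enabling starts from a clean bitmap. */
static void sketch_set_dirty_tracking(bool enable)
{
    if (enable && !sketch_tracking_enabled)
        memset(sketch_dirty_bitmap, 0, sizeof(sketch_dirty_bitmap));
    sketch_tracking_enabled = enable;
}

/* Record a guest write; a no-op while tracking is disabled. */
static void sketch_mark_dirty(uint64_t guest_addr)
{
    uint64_t page = guest_addr >> SKETCH_PAGE_SHIFT;
    if (!sketch_tracking_enabled || page >= SKETCH_NUM_PAGES)
        return;
    sketch_dirty_bitmap[page / 8] |= (uint8_t)(1u << (page % 8));
}

/* Migration side: report whether a page was dirtied, then clear its bit. */
static bool sketch_test_and_clear_dirty(uint64_t guest_addr)
{
    uint64_t page = guest_addr >> SKETCH_PAGE_SHIFT;
    if (page >= SKETCH_NUM_PAGES)
        return false;
    uint8_t mask = (uint8_t)(1u << (page % 8));
    bool was_dirty = (sketch_dirty_bitmap[page / 8] & mask) != 0;
    sketch_dirty_bitmap[page / 8] &= (uint8_t)~mask;
    return was_dirty;
}
```

In this picture, migration would enable tracking when it starts, poll and clear bits while it iterates over RAM, and disable tracking again in its final phase.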

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Re: [PATCH 18/21] Remove host_alarm_timer hacks.

2009-04-30 Thread Anthony Liguori

Avi Kivity wrote:


Do we really care about optimizing latency with -clock rtc though?



People still run kvm on RHEL 5 (or cheap clones thereof), aren't they 
affected?


Do they use -clock rtc?  -clock dynticks should still work on RHEL 5; 
it's just that you won't get very accurate timer events.


You can only use -clock rtc with a single guest at a time so I doubt 
people use it seriously.  The other option would be -clock unix but I 
can't see why you'd use -clock unix instead of -clock dynticks.


The only reason to keep -clock unix around is for non-Linux Unices.

--
Regards,

Anthony Liguori



Re: [PATCH 18/21] Remove host_alarm_timer hacks.

2009-04-30 Thread Avi Kivity

Anthony Liguori wrote:

Avi Kivity wrote:

Anthony Liguori wrote:
In modern KVM, the IO thread is capable of interrupting the CPU 
whenever it needs to process IO.  Therefore this problem no longer 
exists.




It would still be good to verify that the problem no longer exists.  
This is not a cosmetic change; some testing is needed to verify it 
doesn't introduce new latencies.




N.B. dynticks is the preferred timer in QEMU on Linux.  To even hit 
this code path, you'd have to use an explicit -clock hpet or -clock 
rtc.  I don't have an hpet on my laptop and -clock rtc boots just as 
fast as it did before.


I'll apply this and see what happens.



Do we really care about optimizing latency with -clock rtc though?



People still run kvm on RHEL 5 (or cheap clones thereof), aren't they 
affected?


--
error compiling committee.c: too many arguments to function


