[PATCH] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6

2009-02-18 Thread Avi Kivity
From: Avi Kivity a...@redhat.com

Conflicts:
arch/x86/include/asm/kvm.h

Signed-off-by: Avi Kivity a...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe kvm-commits in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kvm: qemu: fix host_cpuid() on i386

2009-02-18 Thread Avi Kivity
From: Avi Kivity a...@redhat.com

The addition of the ecx parameter broke cpuid on i386 as the constraints
changed.

Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/qemu/target-i386/helper.c b/qemu/target-i386/helper.c
index 08e26bf..6f20e9d 100644
--- a/qemu/target-i386/helper.c
+++ b/qemu/target-i386/helper.c
@@ -1425,10 +1425,10 @@ static void host_cpuid(uint32_t function, uint32_t count,
 #else
     asm volatile("pusha \n\t"
                  "cpuid \n\t"
-                 "mov %%eax, 0(%1) \n\t"
-                 "mov %%ebx, 4(%1) \n\t"
-                 "mov %%ecx, 8(%1) \n\t"
-                 "mov %%edx, 12(%1) \n\t"
+                 "mov %%eax, 0(%2) \n\t"
+                 "mov %%ebx, 4(%2) \n\t"
+                 "mov %%ecx, 8(%2) \n\t"
+                 "mov %%edx, 12(%2) \n\t"
                  "popa"
                  : : "a"(function), "c"(count), "S"(vec)
                  : "memory", "cc");


Re: Current KVM head crashes on startup

2009-02-18 Thread Amit Shah
On (Wed) Feb 18 2009 [13:21:26], Amit Shah wrote:
> On (Tue) Feb 17 2009 [12:47:10], Brian Kress wrote:
> > When I try to run KVM built off the current head, it crashes with a
> > Segmentation fault.  KVM-84 does not.  Seems to be dealing with the
> > CPUID changes:
> >
> > 0x081a5c70 in host_cpuid ()
> > at /home/kressb/kvm/src/qemu/target-i386/helper.c:1426
> > 1426        asm volatile("pusha \n\t"
>
> This looks like some kind of stack corruption on 32-bit:
>
> 1472        if (kvm_enabled())
> (gdb)
> 1473            host_cpuid(0, 0, NULL, ebx, ecx, edx);
> (gdb)
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x081a2d60 in host_cpuid (function=10, count=1231384169, eax=0x0,
> ebx=0xadfc1914, ecx=0xadfc1910, edx=0xadfc190c)
> at /home/amit/src/kvm-userspace/qemu/target-i386/helper.c:1426
> 1426        asm volatile("pusha \n\t"
>
> I don't see this on 64-bit. Investigating.

Avi, what's the reason for doing this in the host_cpuid code? As I see
it, the first version should work for both 64-bit and 32-bit code.

#ifdef __x86_64__
    asm volatile("cpuid"
                 : "=a"(vec[0]), "=b"(vec[1]),
                   "=c"(vec[2]), "=d"(vec[3])
                 : "0"(function), "c"(count) : "cc");
#else
    asm volatile("pusha \n\t"
                 "cpuid \n\t"
                 "mov %%eax, 0(%1) \n\t"
                 "mov %%ebx, 4(%1) \n\t"
                 "mov %%ecx, 8(%1) \n\t"
                 "mov %%edx, 12(%1) \n\t"
                 "popa"
                 : : "a"(function), "c"(count), "S"(vec)
                 : "memory", "cc");
#endif

Amit


Re: Current KVM head crashes on startup

2009-02-18 Thread Avi Kivity

Amit Shah wrote:

> On (Wed) Feb 18 2009 [13:21:26], Amit Shah wrote:
> > On (Tue) Feb 17 2009 [12:47:10], Brian Kress wrote:
> > > When I try to run KVM built off the current head, it crashes with a
> > > Segmentation fault.  KVM-84 does not.  Seems to be dealing with the
> > > CPUID changes:
> > >
> > > 0x081a5c70 in host_cpuid ()
> > > at /home/kressb/kvm/src/qemu/target-i386/helper.c:1426
> > > 1426        asm volatile("pusha \n\t"
> >
> > This looks like some kind of stack corruption on 32-bit:
> >
> > 1472        if (kvm_enabled())
> > (gdb)
> > 1473            host_cpuid(0, 0, NULL, ebx, ecx, edx);
> > (gdb)
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x081a2d60 in host_cpuid (function=10, count=1231384169, eax=0x0,
> > ebx=0xadfc1914, ecx=0xadfc1910, edx=0xadfc190c)
> > at /home/amit/src/kvm-userspace/qemu/target-i386/helper.c:1426
> > 1426        asm volatile("pusha \n\t"
> >
> > I don't see this on 64-bit. Investigating.
>
> Avi, what's the reason for doing this in the host_cpuid code? As I see
> it, the first version should work for both 64-bit and 32-bit code.
>
> #ifdef __x86_64__
>     asm volatile("cpuid"
>                  : "=a"(vec[0]), "=b"(vec[1]),
>                    "=c"(vec[2]), "=d"(vec[3])
>                  : "0"(function), "c"(count) : "cc");
> #else
>     asm volatile("pusha \n\t"
>                  "cpuid \n\t"
>                  "mov %%eax, 0(%1) \n\t"
>                  "mov %%ebx, 4(%1) \n\t"
>                  "mov %%ecx, 8(%1) \n\t"
>                  "mov %%edx, 12(%1) \n\t"
>                  "popa"
>                  : : "a"(function), "c"(count), "S"(vec)
>                  : "memory", "cc");
> #endif
  


The first version generates too much register pressure for some 
compilers on i386, leading to compilation failures.  The second version 
is surely wrong, though?  Counting from zero, the vec parameter would 
be %2, not %1.



(copied Anthony)

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: Current KVM head crashes on startup

2009-02-18 Thread Amit Shah
On (Wed) Feb 18 2009 [08:49:33], Avi Kivity wrote:
> Amit Shah wrote:
> > On (Wed) Feb 18 2009 [13:21:26], Amit Shah wrote:
> > > On (Tue) Feb 17 2009 [12:47:10], Brian Kress wrote:
> > > > When I try to run KVM built off the current head, it crashes with a
> > > > Segmentation fault.  KVM-84 does not.  Seems to be dealing with the
> > > > CPUID changes:
> > > >
> > > > 0x081a5c70 in host_cpuid ()
> > > > at /home/kressb/kvm/src/qemu/target-i386/helper.c:1426
> > > > 1426        asm volatile("pusha \n\t"
> > >
> > > This looks like some kind of stack corruption on 32-bit:
> > >
> > > 1472        if (kvm_enabled())
> > > (gdb)
> > > 1473            host_cpuid(0, 0, NULL, ebx, ecx, edx);
> > > (gdb)
> > >
> > > Program received signal SIGSEGV, Segmentation fault.
> > > 0x081a2d60 in host_cpuid (function=10, count=1231384169, eax=0x0,
> > > ebx=0xadfc1914, ecx=0xadfc1910, edx=0xadfc190c)
> > > at /home/amit/src/kvm-userspace/qemu/target-i386/helper.c:1426
> > > 1426        asm volatile("pusha \n\t"
> > >
> > > I don't see this on 64-bit. Investigating.
> >
> > Avi, what's the reason for doing this in the host_cpuid code? As I see
> > it, the first version should work for both 64-bit and 32-bit code.
> >
> > #ifdef __x86_64__
> >     asm volatile("cpuid"
> >                  : "=a"(vec[0]), "=b"(vec[1]),
> >                    "=c"(vec[2]), "=d"(vec[3])
> >                  : "0"(function), "c"(count) : "cc");
> > #else
> >     asm volatile("pusha \n\t"
> >                  "cpuid \n\t"
> >                  "mov %%eax, 0(%1) \n\t"
> >                  "mov %%ebx, 4(%1) \n\t"
> >                  "mov %%ecx, 8(%1) \n\t"
> >                  "mov %%edx, 12(%1) \n\t"
> >                  "popa"
> >                  : : "a"(function), "c"(count), "S"(vec)
> >                  : "memory", "cc");
> > #endif
>
> The first version generates too much register pressure for some
> compilers on i386, leading to compilation failures.  The second version

Is it still valid? I tried with gcc-4.1.2 and that worked fine with the
first version. Should we just use that version instead?

> is surely wrong, though?  Counting from zero, the vec parameter would
> be %2, not %1.

Looks like I missed out updating that when I introduced 'count'. Fixing
that fixes the problem.

Amit
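
The off-by-one operand index found in this thread is easy to reintroduce whenever an input is added to the constraint list. A minimal sketch, assuming a compiler that supports GCC's named asm operands (`%[vec]`), shows one way to make the i386 path independent of operand position. `host_cpuid_sketch` and `cpuid_available` are hypothetical names, not the QEMU functions, and the non-x86 fallback exists only so that the sketch builds on other targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reports whether this build targets x86 at all. */
static int cpuid_available(void)
{
#if defined(__i386__) || defined(__x86_64__)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical stand-in for host_cpuid(); fills vec[0..3] with eax..edx. */
static void host_cpuid_sketch(uint32_t function, uint32_t count,
                              uint32_t vec[4])
{
    assert(vec != NULL);
    vec[0] = vec[1] = vec[2] = vec[3] = 0;
#if defined(__x86_64__)
    __asm__ volatile("cpuid"
                     : "=a"(vec[0]), "=b"(vec[1]),
                       "=c"(vec[2]), "=d"(vec[3])
                     : "0"(function), "2"(count)
                     : "cc");
#elif defined(__i386__)
    /* %[vec] names the operand, so adding inputs cannot shift its index. */
    __asm__ volatile("pusha \n\t"
                     "cpuid \n\t"
                     "mov %%eax, 0(%[vec]) \n\t"
                     "mov %%ebx, 4(%[vec]) \n\t"
                     "mov %%ecx, 8(%[vec]) \n\t"
                     "mov %%edx, 12(%[vec]) \n\t"
                     "popa"
                     :
                     : "a"(function), "c"(count), [vec] "S"(vec)
                     : "memory", "cc");
#else
    /* Non-x86 build: leave the zeroed result in place. */
    (void)function;
    (void)count;
#endif
}
```

Compilers that predate named operands would still need the positional `%2` form from the patch.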


[PATCH 0/3 v3] MSI-X enabling

2009-02-18 Thread Sheng Yang
Updated the patchset following Marcelo's and Avi's comments.

Please also review the MSI/MSI-X userspace patch.

Thanks!
--
regards
Yang, Sheng


[PATCH 3/3] KVM: Enable MSI-X for KVM assigned device

2009-02-18 Thread Sheng Yang
This patch finally enables MSI-X.

What we need for MSI-X:
1. Intercept one page in the MMIO region of the device, so that we can get the
MSI-X table the guest wants and set up the real one. This has now been done by
the guest, and the result is transferred to the kernel using the
KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY ioctls.

2. Information about the incoming interrupt. One device can now have more than
one interrupt, and they are all handled by one workqueue structure, so we need
to identify them. The gsi_msg_pending_bitmap enabled by the previous patch gets
this done.

3. Mapping from host IRQ to guest gsi, as well as from guest gsi to the real
MSI/MSI-X message address/data. We use the same entry number for the host and
the guest here, so that it's easy to find the correlated guest gsi.

What we lack for now:
1. The PCI spec says nothing can exist in the same MMIO page as the MSI-X
table, except the pending bits. The patch ignores the pending bits as a first
step (so they are always 0 - no pending).

2. The PCI spec allows the MSI-X table to be changed dynamically. That means
the OS can enable MSI-X, then mask one MSI-X entry, modify it, and unmask it.
The patch doesn't support this, and Linux doesn't work this way either.

3. The patch doesn't implement MSI-X mask-all and mask-single-entry. I would
implement the former in drivers/pci/msi.c later. For a single entry, userspace
should have the responsibility to handle it.

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 include/linux/kvm.h |8 
 virt/kvm/kvm_main.c |  107 ---
 2 files changed, 109 insertions(+), 6 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index a2dfbe0..78480d0 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -440,6 +440,9 @@ struct kvm_irq_routing {
 };
 
 #endif
+#if defined(CONFIG_X86)
+#define KVM_CAP_DEVICE_MSIX 26
+#endif
 
 /*
  * ioctls for VM fds
@@ -597,6 +600,11 @@ struct kvm_assigned_irq {
 #define KVM_DEV_IRQ_ASSIGN_MSI_ACTION	KVM_DEV_IRQ_ASSIGN_ENABLE_MSI
 #define KVM_DEV_IRQ_ASSIGN_ENABLE_MSI	(1 << 0)
 
+#define KVM_DEV_IRQ_ASSIGN_MSIX_ACTION  (KVM_DEV_IRQ_ASSIGN_ENABLE_MSIX |\
+					KVM_DEV_IRQ_ASSIGN_MASK_MSIX)
+#define KVM_DEV_IRQ_ASSIGN_ENABLE_MSIX  (1 << 1)
+#define KVM_DEV_IRQ_ASSIGN_MASK_MSIX    (1 << 2)
+
 struct kvm_assigned_msix_nr {
__u32 assigned_dev_id;
__u16 entry_nr;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4010802..d3acb37 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -280,13 +280,33 @@ static void kvm_free_assigned_irq(struct kvm *kvm,
 	 * now, the kvm state is still legal for probably we also have to wait
 	 * interrupt_work done.
 	 */
-	disable_irq_nosync(assigned_dev->host_irq);
-	cancel_work_sync(&assigned_dev->interrupt_work);
+	if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_MSIX) {
+		int i;
+		for (i = 0; i < assigned_dev->entries_nr; i++)
+			disable_irq_nosync(assigned_dev->
+					   host_msix_entries[i].vector);
+
+		cancel_work_sync(&assigned_dev->interrupt_work);
+
+		for (i = 0; i < assigned_dev->entries_nr; i++)
+			free_irq(assigned_dev->host_msix_entries[i].vector,
+				 (void *)assigned_dev);
+
+		assigned_dev->entries_nr = 0;
+		kfree(assigned_dev->host_msix_entries);
+		kfree(assigned_dev->guest_msix_entries);
+		pci_disable_msix(assigned_dev->dev);
+	} else {
+		/* Deal with MSI and INTx */
+		disable_irq_nosync(assigned_dev->host_irq);
+		cancel_work_sync(&assigned_dev->interrupt_work);
 
-	free_irq(assigned_dev->host_irq, (void *)assigned_dev);
+		free_irq(assigned_dev->host_irq, (void *)assigned_dev);
 
-	if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_HOST_MSI)
-		pci_disable_msi(assigned_dev->dev);
+		if (assigned_dev->irq_requested_type &
+				KVM_ASSIGNED_DEV_HOST_MSI)
+			pci_disable_msi(assigned_dev->dev);
+	}
 
 	assigned_dev->irq_requested_type = 0;
 }
@@ -415,6 +435,69 @@ static int assigned_device_update_msi(struct kvm *kvm,
 	adev->irq_requested_type |= KVM_ASSIGNED_DEV_HOST_MSI;
 	return 0;
 }
+
+static int assigned_device_update_msix(struct kvm *kvm,
+			struct kvm_assigned_dev_kernel *adev,
+			struct kvm_assigned_irq *airq)
+{
+	/* TODO Deal with KVM_DEV_IRQ_ASSIGNED_MASK_MSIX */
+	int i, r;
+
+	adev->ack_notifier.gsi = -1;
+
+	if (irqchip_in_kernel(kvm)) {
+		if (airq->flags & KVM_DEV_IRQ_ASSIGN_MASK_MSIX) {
+			printk(KERN_WARNING
+			       "kvm: unsupported mask MSI-X, flags 0x%x!\n",
+			       airq->flags);
+			return 0;
+

[PATCH 2/3] KVM: Add gsi_msg_pending_bitmap for MSI-X

2009-02-18 Thread Sheng Yang
We have to handle more than one interrupt with one handler for MSI-X. So we
need a bitmap to track the triggered interrupts.

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 include/linux/kvm_host.h |5 +-
 virt/kvm/kvm_main.c  |  102 -
 2 files changed, 102 insertions(+), 5 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index b105ada..6e354af 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -145,6 +145,8 @@ struct kvm {
 #ifdef CONFIG_HAVE_KVM_IRQCHIP
struct list_head irq_routing; /* of kvm_kernel_irq_routing_entry */
struct hlist_head mask_notifier_list;
+#define KVM_MAX_IRQ_ROUTES 1024
+   DECLARE_BITMAP(irq_routes_pending_bitmap, KVM_MAX_IRQ_ROUTES);
 #endif
 
 #ifdef KVM_ARCH_WANT_MMU_NOTIFIER
@@ -336,6 +338,7 @@ struct kvm_assigned_dev_kernel {
 #define KVM_ASSIGNED_DEV_GUEST_MSI	(1 << 1)
 #define KVM_ASSIGNED_DEV_HOST_INTX	(1 << 8)
 #define KVM_ASSIGNED_DEV_HOST_MSI	(1 << 9)
+#define KVM_ASSIGNED_DEV_MSIX		((1 << 2) | (1 << 10))
 	unsigned long irq_requested_type;
 	int irq_source_id;
 	int flags;
@@ -503,8 +506,6 @@ static inline int mmu_notifier_retry(struct kvm_vcpu *vcpu, unsigned long mmu_se
 
 #ifdef CONFIG_HAVE_KVM_IRQCHIP
 
-#define KVM_MAX_IRQ_ROUTES 1024
-
 int kvm_setup_default_irq_routing(struct kvm *kvm);
 int kvm_set_irq_routing(struct kvm *kvm,
const struct kvm_irq_routing_entry *entries,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index bb4aa73..4010802 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -95,25 +95,113 @@ static struct kvm_assigned_dev_kernel *kvm_find_assigned_dev(struct list_head *h
 	return NULL;
 }
 
+static int find_host_irq_from_gsi(struct kvm_assigned_dev_kernel *assigned_dev,
+				  u32 gsi)
+{
+	int i, entry, irq;
+	struct msix_entry *host_msix_entries, *guest_msix_entries;
+
+	host_msix_entries = assigned_dev->host_msix_entries;
+	guest_msix_entries = assigned_dev->guest_msix_entries;
+
+	entry = -1;
+	irq = 0;
+	for (i = 0; i < assigned_dev->entries_nr; i++)
+		if (gsi == (guest_msix_entries + i)->vector) {
+			entry = (guest_msix_entries + i)->entry;
+			break;
+		}
+	if (entry < 0) {
+		printk(KERN_WARNING "Fail to find correlated MSI-X entry!\n");
+		return 0;
+	}
+	for (i = 0; i < assigned_dev->entries_nr; i++)
+		if (entry == (host_msix_entries + i)->entry) {
+			irq = (host_msix_entries + i)->vector;
+			break;
+		}
+	if (irq == 0) {
+		printk(KERN_WARNING "Fail to find correlated MSI-X irq!\n");
+		return 0;
+	}
+
+	return irq;
+}
+
+static int find_gsi_from_host_irq(struct kvm_assigned_dev_kernel *assigned_dev,
+				  int irq)
+{
+	int i, entry, gsi;
+	struct msix_entry *host_msix_entries, *guest_msix_entries;
+
+	host_msix_entries = assigned_dev->host_msix_entries;
+	guest_msix_entries = assigned_dev->guest_msix_entries;
+
+	entry = -1;
+	gsi = -1;
+	for (i = 0; i < assigned_dev->entries_nr; i++)
+		if (irq == (host_msix_entries + i)->vector) {
+			entry = (host_msix_entries + i)->entry;
+			break;
+		}
+	if (entry < 0) {
+		printk(KERN_WARNING "Fail to find correlated MSI-X entry!\n");
+		return 0;
+	}
+	for (i = 0; i < assigned_dev->entries_nr; i++)
+		if (entry == (guest_msix_entries + i)->entry) {
+			gsi = (guest_msix_entries + i)->vector;
+			break;
+		}
+	if (gsi < 0) {
+		printk(KERN_WARNING "Fail to find correlated MSI-X gsi!\n");
+		return 0;
+	}
+
+	return gsi;
+}
+
 static void kvm_assigned_dev_interrupt_work_handler(struct work_struct *work)
 {
 	struct kvm_assigned_dev_kernel *assigned_dev;
+	struct kvm *kvm;
+	u32 gsi;
+	int irq;
 
 	assigned_dev = container_of(work, struct kvm_assigned_dev_kernel,
 				    interrupt_work);
+	kvm = assigned_dev->kvm;
 
 	/* This is taken to safely inject irq inside the guest. When
 	 * the interrupt injection (or the ioapic code) uses a
 	 * finer-grained lock, update this
 	 */
-	mutex_lock(&assigned_dev->kvm->lock);
-	kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id,
-		    assigned_dev->guest_irq, 1);
+	mutex_lock(&kvm->lock);
+handle_irq:
+	if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_MSIX) {
+		gsi = find_first_bit(kvm->irq_routes_pending_bitmap,
+				     KVM_MAX_IRQ_ROUTES);
+

[PATCH 1/3] KVM: Ioctls for init MSI-X entry

2009-02-18 Thread Sheng Yang
Introduce two ioctls, KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY.

These two ioctls are used by userspace to specify the number of MSI-X entries
of the guest device and to correlate each MSI-X entry with a GSI during the
initialization stage.

MSI-X should be fully initialized before it is enabled.

Changing the MSI-X entry number is not supported for now.

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 include/linux/kvm.h  |   16 +++
 include/linux/kvm_host.h |3 +
 virt/kvm/kvm_main.c  |  103 ++
 3 files changed, 122 insertions(+), 0 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 2163b3d..a2dfbe0 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -475,6 +475,8 @@ struct kvm_irq_routing {
 #define KVM_ASSIGN_IRQ _IOR(KVMIO, 0x70, \
struct kvm_assigned_irq)
 #define KVM_REINJECT_CONTROL  _IO(KVMIO, 0x71)
+#define KVM_SET_MSIX_NR _IOR(KVMIO, 0x72, struct kvm_assigned_msix_nr)
+#define KVM_SET_MSIX_ENTRY _IOR(KVMIO, 0x73, struct kvm_assigned_msix_entry)
 
 /*
  * ioctls for vcpu fds
@@ -595,4 +597,18 @@ struct kvm_assigned_irq {
 #define KVM_DEV_IRQ_ASSIGN_MSI_ACTION	KVM_DEV_IRQ_ASSIGN_ENABLE_MSI
 #define KVM_DEV_IRQ_ASSIGN_ENABLE_MSI	(1 << 0)
 
+struct kvm_assigned_msix_nr {
+   __u32 assigned_dev_id;
+   __u16 entry_nr;
+   __u16 padding;
+};
+
+#define KVM_MAX_MSIX_PER_DEV   512
+struct kvm_assigned_msix_entry {
+   __u32 assigned_dev_id;
+   __u32 gsi;
+   __u16 entry; /* The index of entry in the MSI-X table */
+   __u16 padding[3];
+};
+
 #endif
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7c7096d..b105ada 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -326,9 +326,12 @@ struct kvm_assigned_dev_kernel {
int assigned_dev_id;
int host_busnr;
int host_devfn;
+   unsigned int entries_nr;
int host_irq;
bool host_irq_disabled;
+   struct msix_entry *host_msix_entries;
int guest_irq;
+   struct msix_entry *guest_msix_entries;
 #define KVM_ASSIGNED_DEV_GUEST_INTX	(1 << 0)
 #define KVM_ASSIGNED_DEV_GUEST_MSI	(1 << 1)
 #define KVM_ASSIGNED_DEV_HOST_INTX	(1 << 8)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 266bdaf..bb4aa73 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1593,6 +1593,87 @@ static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset)
 	return 0;
 }
 
+static int kvm_vm_ioctl_set_msix_nr(struct kvm *kvm,
+				    struct kvm_assigned_msix_nr *entry_nr)
+{
+	int r = 0;
+	struct kvm_assigned_dev_kernel *adev;
+
+	mutex_lock(&kvm->lock);
+
+	adev = kvm_find_assigned_dev(&kvm->arch.assigned_dev_head,
+				      entry_nr->assigned_dev_id);
+	if (!adev) {
+		r = -EINVAL;
+		goto msix_nr_out;
+	}
+
+	if (adev->entries_nr == 0) {
+		adev->entries_nr = entry_nr->entry_nr;
+		if (adev->entries_nr == 0 ||
+		    adev->entries_nr >= KVM_MAX_MSIX_PER_DEV)
+			goto msix_nr_out;
+
+		adev->host_msix_entries = kzalloc(sizeof(struct msix_entry) *
+						entry_nr->entry_nr,
+						GFP_KERNEL);
+		if (!adev->host_msix_entries) {
+			printk(KERN_ERR "no memory for host msix entries!\n");
+			r = -ENOMEM;
+			goto msix_nr_out;
+		}
+		adev->guest_msix_entries = kzalloc(sizeof(struct msix_entry) *
+						entry_nr->entry_nr,
+						GFP_KERNEL);
+		if (!adev->guest_msix_entries) {
+			printk(KERN_ERR "no memory for host msix entries!\n");
+			kfree(adev->host_msix_entries);
+			r = -ENOMEM;
+			goto msix_nr_out;
+		}
+	} else
+		printk(KERN_WARNING "kvm: not allow recall set msix nr!\n");
+msix_nr_out:
+	mutex_unlock(&kvm->lock);
+	return r;
+}
+
+static int kvm_vm_ioctl_set_msix_entry(struct kvm *kvm,
+				       struct kvm_assigned_msix_entry *entry)
+{
+	int r = 0, i;
+	struct kvm_assigned_dev_kernel *adev;
+
+	mutex_lock(&kvm->lock);
+
+	adev = kvm_find_assigned_dev(&kvm->arch.assigned_dev_head,
+				      entry->assigned_dev_id);
+
+	if (!adev) {
+		r = -EINVAL;
+		goto msix_entry_out;
+	}
+
+	for (i = 0; i < adev->entries_nr; i++)
+		if (adev->guest_msix_entries[i].vector == 0 ||
+		    adev->guest_msix_entries[i].entry == entry->entry) {
+			adev->guest_msix_entries[i].entry = entry->entry;
+

Re: Current KVM head crashes on startup

2009-02-18 Thread Avi Kivity

Amit Shah wrote:

 

> > The first version generates too much register pressure for some
> > compilers on i386, leading to compilation failures.  The second version
>
> Is it still valid? I tried with gcc-4.1.2 and that worked fine with the
> first version. Should we just use that version instead?

  


I don't see why it would change, unless you can destroy all copies of 
the compilers that fail with it.





Re: [PATCH 1/3] KVM: Ioctls for init MSI-X entry

2009-02-18 Thread Avi Kivity

Sheng Yang wrote:

> Introduce two ioctls, KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY.
>
> These two ioctls are used by userspace to specify the number of MSI-X entries
> of the guest device and to correlate each MSI-X entry with a GSI during the
> initialization stage.
>
> MSI-X should be fully initialized before it is enabled.
>
> Changing the MSI-X entry number is not supported for now.

  


Sorry, this has been reviewed quite a bit but I found a few issues:


> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> index 2163b3d..a2dfbe0 100644
> --- a/include/linux/kvm.h
> +++ b/include/linux/kvm.h
> @@ -475,6 +475,8 @@ struct kvm_irq_routing {
>  #define KVM_ASSIGN_IRQ _IOR(KVMIO, 0x70, \
> 			    struct kvm_assigned_irq)
>  #define KVM_REINJECT_CONTROL  _IO(KVMIO, 0x71)
> +#define KVM_SET_MSIX_NR _IOR(KVMIO, 0x72, struct kvm_assigned_msix_nr)
> +#define KVM_SET_MSIX_ENTRY _IOR(KVMIO, 0x73, struct kvm_assigned_msix_entry)
  


KVM_SET_ASSIGNED_... so it's associated with device assignment, not generic.

Should be _IOW, not _IOR.  Looks like KVM_ASSIGN_IRQ is broken...
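
The direction in an ioctl number is named from userspace's point of view, so a call whose argument is only copied into the kernel is a "write". A minimal sketch of why `_IOW` is the matching macro follows, assuming the generic Linux encoding used on x86 (8 bits of number, 8 of type, 14 of size, 2 of direction). The `SKETCH_*` macros are hypothetical re-implementations written for illustration, not the kernel's headers, and the structs mirror the patch with `stdint.h` types in place of `__u32`/`__u16`.

```c
#include <assert.h>
#include <stdint.h>

/* Argument layouts copied from the patch, with stdint types. */
struct kvm_assigned_msix_nr {
    uint32_t assigned_dev_id;
    uint16_t entry_nr;
    uint16_t padding;
};

struct kvm_assigned_msix_entry {
    uint32_t assigned_dev_id;
    uint32_t gsi;
    uint16_t entry;          /* index of the entry in the MSI-X table */
    uint16_t padding[3];
};

/* Hypothetical macros mirroring the generic encoding:
 * bits 0-7 nr, 8-15 type, 16-29 size, 30-31 direction. */
#define SKETCH_IOC_WRITE 1u  /* userspace writes, kernel copies the data in */
#define SKETCH_IOC_READ  2u  /* kernel fills the data, userspace reads it */
#define SKETCH_IOC(dir, type, nr, size)                      \
    (((uint32_t)(dir) << 30) | ((uint32_t)(size) << 16) |    \
     ((uint32_t)(type) << 8) | (uint32_t)(nr))
#define SKETCH_IOW(type, nr, argtype) \
    SKETCH_IOC(SKETCH_IOC_WRITE, (type), (nr), sizeof(argtype))
#define SKETCH_IOC_DIR(cmd)  ((uint32_t)(cmd) >> 30)
#define SKETCH_IOC_SIZE(cmd) (((uint32_t)(cmd) >> 16) & 0x3fffu)

#define SKETCH_KVMIO 0xAE    /* KVM's ioctl type byte */

/* Hypothetical command number for the "set MSI-X entry count" call. */
static uint32_t sketch_set_msix_nr_cmd(void)
{
    uint32_t cmd = SKETCH_IOW(SKETCH_KVMIO, 0x72, struct kvm_assigned_msix_nr);

    /* The kernel only copies this struct in, so the direction is "write". */
    assert(SKETCH_IOC_DIR(cmd) == SKETCH_IOC_WRITE);
    return cmd;
}
```

Because the size of the argument is part of the command number, the explicit padding fields in both structs keep that number identical for 32-bit and 64-bit userspace.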

 
> +static int kvm_vm_ioctl_set_msix_nr(struct kvm *kvm,
> +				    struct kvm_assigned_msix_nr *entry_nr)
> +{
> +	int r = 0;
> +	struct kvm_assigned_dev_kernel *adev;
> +
> +	mutex_lock(&kvm->lock);
> +
> +	adev = kvm_find_assigned_dev(&kvm->arch.assigned_dev_head,
> +				      entry_nr->assigned_dev_id);
> +	if (!adev) {
> +		r = -EINVAL;
> +		goto msix_nr_out;
> +	}
> +
> +	if (adev->entries_nr == 0) {
> +		adev->entries_nr = entry_nr->entry_nr;
> +		if (adev->entries_nr == 0 ||
> +		    adev->entries_nr >= KVM_MAX_MSIX_PER_DEV)
> +			goto msix_nr_out;
  


r == 0 here, needs a meaningful error number.
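
One way to address this comment is to validate the requested count before storing it, so that every failure path carries a negative errno and a rejected value never lands in the device state. The following is a userspace sketch under stated assumptions: `sketch_adev` and `sketch_set_msix_nr` are hypothetical stand-ins for the kernel structures, `calloc` stands in for `kzalloc`, and the choice of `-EINVAL` and `-EBUSY` is an assumption, since the review does not name specific codes.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define KVM_MAX_MSIX_PER_DEV 512   /* value taken from the patch */

/* Hypothetical stand-in for the MSI-X part of kvm_assigned_dev_kernel. */
struct sketch_adev {
    unsigned int entries_nr;
    int *host_entries;    /* stands in for struct msix_entry arrays */
    int *guest_entries;
};

/* Returns 0 on success or a negative errno; failure never returns 0. */
static int sketch_set_msix_nr(struct sketch_adev *adev, unsigned int nr)
{
    assert(adev != NULL);

    if (adev->entries_nr != 0)
        return -EBUSY;     /* assumption: the count may be set only once */
    if (nr == 0 || nr >= KVM_MAX_MSIX_PER_DEV)
        return -EINVAL;    /* validate before touching adev */

    adev->host_entries = calloc(nr, sizeof(*adev->host_entries));
    if (!adev->host_entries)
        return -ENOMEM;

    adev->guest_entries = calloc(nr, sizeof(*adev->guest_entries));
    if (!adev->guest_entries) {
        free(adev->host_entries);
        adev->host_entries = NULL;
        return -ENOMEM;
    }

    adev->entries_nr = nr; /* commit only after both allocations succeed */
    return 0;
}
```

Assigning `entries_nr` last also means a failed call leaves the device in its initial state, so the caller can simply retry with a valid count.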


> +
> +		adev->host_msix_entries = kzalloc(sizeof(struct msix_entry) *
> +						entry_nr->entry_nr,
> +						GFP_KERNEL);
> +		if (!adev->host_msix_entries) {
> +			printk(KERN_ERR "no memory for host msix entries!\n");
> +			r = -ENOMEM;

Drop the printk, -ENOMEM is enough.

> +			goto msix_nr_out;
> +		}
> +		adev->guest_msix_entries = kzalloc(sizeof(struct msix_entry) *
> +						entry_nr->entry_nr,
> +						GFP_KERNEL);
> +		if (!adev->guest_msix_entries) {
> +			printk(KERN_ERR "no memory for host msix entries!\n");

Ditto.

> +			kfree(adev->host_msix_entries);
> +			r = -ENOMEM;
> +			goto msix_nr_out;
> +		}
> +	} else
> +		printk(KERN_WARNING "kvm: not allow recall set msix nr!\n");

Drop printk, add error.

> +msix_nr_out:
> +	mutex_unlock(&kvm->lock);
> +	return r;
> +}
> +
> +static int kvm_vm_ioctl_set_msix_entry(struct kvm *kvm,
> +				       struct kvm_assigned_msix_entry *entry)
> +{
> +	int r = 0, i;
> +	struct kvm_assigned_dev_kernel *adev;
> +
> +	mutex_lock(&kvm->lock);
> +
> +	adev = kvm_find_assigned_dev(&kvm->arch.assigned_dev_head,
> +				      entry->assigned_dev_id);
> +
> +	if (!adev) {
> +		r = -EINVAL;
> +		goto msix_entry_out;
> +	}
> +
> +	for (i = 0; i < adev->entries_nr; i++)
> +		if (adev->guest_msix_entries[i].vector == 0 ||
> +		    adev->guest_msix_entries[i].entry == entry->entry) {
> +			adev->guest_msix_entries[i].entry = entry->entry;
> +			adev->guest_msix_entries[i].vector = entry->gsi;
> +			adev->host_msix_entries[i].entry = entry->entry;
> +			break;
> +		}
> +	if (i == adev->entries_nr) {
> +		printk(KERN_ERR "kvm: Too much entries for MSI-X!\n");

Drop.

> +		r = -ENOSPC;
> +		goto msix_entry_out;
> +	}
> +
> +msix_entry_out:
> +	mutex_unlock(&kvm->lock);
> +
> +	return r;
> +}
> +
>  static long kvm_vcpu_ioctl(struct file *filp,
> 			   unsigned int ioctl, unsigned long arg)
>  {
> @@ -1917,7 +1998,29 @@ static long kvm_vm_ioctl(struct file *filp,
>  		vfree(entries);
>  		break;
>  	}
> +#ifdef KVM_CAP_DEVICE_MSIX
> +	case KVM_SET_MSIX_NR: {
> +		struct kvm_assigned_msix_nr entry_nr;
> +		r = -EFAULT;
> +		if (copy_from_user(&entry_nr, argp, sizeof entry_nr))
> +			goto out;
> +		r = kvm_vm_ioctl_set_msix_nr(kvm, &entry_nr);
> +		if (r)
> +			goto out;
> +		break;
> +	}
> +	case KVM_SET_MSIX_ENTRY: {
> +		struct kvm_assigned_msix_entry entry;
> +		r = -EFAULT;
> +		if (copy_from_user(&entry, argp, sizeof entry))
> +			goto out;
> +

Re: [PATCH 3/3] KVM: Enable MSI-X for KVM assigned device

2009-02-18 Thread Avi Kivity

Sheng Yang wrote:

> This patch finally enables MSI-X.
>
> What we need for MSI-X:
> 1. Intercept one page in the MMIO region of the device, so that we can get the
> MSI-X table the guest wants and set up the real one. This has now been done by
> the guest, and the result is transferred to the kernel using the
> KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY ioctls.
>
> 2. Information about the incoming interrupt. One device can now have more than
> one interrupt, and they are all handled by one workqueue structure, so we need
> to identify them. The gsi_msg_pending_bitmap enabled by the previous patch gets
> this done.
>
> 3. Mapping from host IRQ to guest gsi, as well as from guest gsi to the real
> MSI/MSI-X message address/data. We use the same entry number for the host and
> the guest here, so that it's easy to find the correlated guest gsi.
>
> What we lack for now:
> 1. The PCI spec says nothing can exist in the same MMIO page as the MSI-X
> table, except the pending bits. The patch ignores the pending bits as a first
> step (so they are always 0 - no pending).
>
> 2. The PCI spec allows the MSI-X table to be changed dynamically. That means
> the OS can enable MSI-X, then mask one MSI-X entry, modify it, and unmask it.
> The patch doesn't support this, and Linux doesn't work this way either.
>
> 3. The patch doesn't implement MSI-X mask-all and mask-single-entry. I would
> implement the former in drivers/pci/msi.c later. For a single entry, userspace
> should have the responsibility to handle it.
>
> Signed-off-by: Sheng Yang sh...@linux.intel.com
> ---
>  include/linux/kvm.h |8 
>  virt/kvm/kvm_main.c |  107 ---
>  2 files changed, 109 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> index a2dfbe0..78480d0 100644
> --- a/include/linux/kvm.h
> +++ b/include/linux/kvm.h
> @@ -440,6 +440,9 @@ struct kvm_irq_routing {
>  };
>  
>  #endif
> +#if defined(CONFIG_X86)
> +#define KVM_CAP_DEVICE_MSIX 26
> +#endif
  


We switched to a different way of depending on CONFIG_X86, see the other 
KVM_CAP defines.


 
>  struct kvm_assigned_msix_nr {
>  	__u32 assigned_dev_id;
>  	__u16 entry_nr;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 4010802..d3acb37 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -280,13 +280,33 @@ static void kvm_free_assigned_irq(struct kvm *kvm,
>  	 * now, the kvm state is still legal for probably we also have to wait
>  	 * interrupt_work done.
>  	 */
> -	disable_irq_nosync(assigned_dev->host_irq);
> -	cancel_work_sync(&assigned_dev->interrupt_work);
> +	if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_MSIX) {
> +		int i;
> +		for (i = 0; i < assigned_dev->entries_nr; i++)
> +			disable_irq_nosync(assigned_dev->
> +					   host_msix_entries[i].vector);
> +
> +		cancel_work_sync(&assigned_dev->interrupt_work);
> +
> +		for (i = 0; i < assigned_dev->entries_nr; i++)
> +			free_irq(assigned_dev->host_msix_entries[i].vector,
> +				 (void *)assigned_dev);
> +
> +		assigned_dev->entries_nr = 0;
> +		kfree(assigned_dev->host_msix_entries);
> +		kfree(assigned_dev->guest_msix_entries);
> +		pci_disable_msix(assigned_dev->dev);
> +	} else {
> +		/* Deal with MSI and INTx */
> +		disable_irq_nosync(assigned_dev->host_irq);
> +		cancel_work_sync(&assigned_dev->interrupt_work);

How about always have an array?  That will also allow us to deal with 
INTx where x=B,C,D.

Currently for MSI and INTx the array will hold just one active element.

>  
> -	free_irq(assigned_dev->host_irq, (void *)assigned_dev);
> +		free_irq(assigned_dev->host_irq, (void *)assigned_dev);
>  
> -	if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_HOST_MSI)
> -		pci_disable_msi(assigned_dev->dev);
> +		if (assigned_dev->irq_requested_type &
> +				KVM_ASSIGNED_DEV_HOST_MSI)
> +			pci_disable_msi(assigned_dev->dev);
> +	}
  


All those flags and bits are worrying me.  Maybe each entry in the array 
can have an ops member, and disabling would work by calling 
-ops-disable().


We can do that later.
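
The two suggestions in this review (always keep an array of interrupt entries, and give each entry an ops member) can be sketched in plain C as follows. Everything here is hypothetical: the `sketch_*` names, the fields, and the counters exist only for illustration, and the callbacks merely record that they ran where real code would call `free_irq()` plus `pci_disable_msix()` or `pci_disable_msi()`.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_irq_entry;

/* Hypothetical per-type operations; only disable() is sketched. */
struct sketch_irq_ops {
    void (*disable)(struct sketch_irq_entry *e);
};

/* One element per host interrupt; INTx and MSI would use a 1-element array. */
struct sketch_irq_entry {
    int host_irq;
    int active;
    const struct sketch_irq_ops *ops;
};

/* Counters that stand in for the real teardown side effects. */
static int sketch_msix_disabled;
static int sketch_legacy_disabled;

static void sketch_disable_msix(struct sketch_irq_entry *e)
{
    e->active = 0;
    sketch_msix_disabled++;
}

static void sketch_disable_legacy(struct sketch_irq_entry *e)
{
    e->active = 0;
    sketch_legacy_disabled++;
}

static const struct sketch_irq_ops sketch_msix_ops = {
    .disable = sketch_disable_msix,
};

static const struct sketch_irq_ops sketch_legacy_ops = {
    .disable = sketch_disable_legacy,
};

/* Teardown is one loop; no irq_requested_type flag tests are needed. */
static void sketch_free_irqs(struct sketch_irq_entry *entries, size_t nr)
{
    size_t i;

    assert(entries != NULL || nr == 0);
    for (i = 0; i < nr; i++)
        if (entries[i].active)
            entries[i].ops->disable(&entries[i]);
}
```

With this shape the MSI-X, MSI, and INTx cases differ only in which ops table an entry points to, which is the simplification the review is asking for.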

 
>  	assigned_dev->irq_requested_type = 0;
>  }
> @@ -415,6 +435,69 @@ static int assigned_device_update_msi(struct kvm *kvm,
>  	adev->irq_requested_type |= KVM_ASSIGNED_DEV_HOST_MSI;
>  	return 0;
>  }
> +
> +static int assigned_device_update_msix(struct kvm *kvm,
> +			struct kvm_assigned_dev_kernel *adev,
> +			struct kvm_assigned_irq *airq)
> +{
> +	/* TODO Deal with KVM_DEV_IRQ_ASSIGNED_MASK_MSIX */
> +	int i, r;
> +
> +	adev->ack_notifier.gsi = -1;
> +
> +	if (irqchip_in_kernel(kvm)) {
> +		if (airq->flags & KVM_DEV_IRQ_ASSIGN_MASK_MSIX) {
> +			printk(KERN_WARNING
> +			       "kvm: unsupported mask MSI-X, flags 0x%x!\n",
> +			       airq->flags);
  


error, not 

Re: [PATCH 2/3] KVM: Add gsi_msg_pending_bitmap for MSI-X

2009-02-18 Thread Avi Kivity

Sheng Yang wrote:

> We have to handle more than one interrupt with one handler for MSI-X. So we
> need a bitmap to track the triggered interrupts.
  


Can you explain why?



Re: [PATCH 2/3] KVM: Add gsi_msg_pending_bitmap for MSI-X

2009-02-18 Thread Sheng Yang
On Wednesday 18 February 2009 19:00:53 Avi Kivity wrote:
> Sheng Yang wrote:
> > We have to handle more than one interrupt with one handler for MSI-X. So
> > we need a bitmap to track the triggered interrupts.
>
> Can you explain why?

Otherwise, how can we know which interrupt happened? Currently we schedule the
work for later, and no IRQ information is available at that time.
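
The pattern described here (the interrupt handler records which GSI fired, and the deferred work drains the record later) can be sketched in userspace C as below. The `sketch_*` names are hypothetical, the size of 1024 matches `KVM_MAX_IRQ_ROUTES` from the patch, and the plain bit operations are an assumption made for brevity: kernel code would use atomic helpers such as `set_bit()`, `find_first_bit()`, and `clear_bit()`.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_MAX_ROUTES    1024
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_LONGS \
    ((SKETCH_MAX_ROUTES + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* One bit per GSI; a set bit means "fired but not yet injected". */
static unsigned long sketch_pending[SKETCH_LONGS];

/* Interrupt-handler side: remember which GSI fired. */
static void sketch_mark_pending(unsigned int gsi)
{
    assert(gsi < SKETCH_MAX_ROUTES);
    sketch_pending[gsi / SKETCH_BITS_PER_LONG] |=
        1UL << (gsi % SKETCH_BITS_PER_LONG);
}

/* Work-handler side: clear and return the lowest pending GSI, or -1. */
static int sketch_take_pending(void)
{
    unsigned int gsi;

    for (gsi = 0; gsi < SKETCH_MAX_ROUTES; gsi++) {
        unsigned long mask = 1UL << (gsi % SKETCH_BITS_PER_LONG);
        unsigned long *word = &sketch_pending[gsi / SKETCH_BITS_PER_LONG];

        if (*word & mask) {
            *word &= ~mask;
            return (int)gsi;
        }
    }
    return -1;
}
```

A bitmap coalesces repeats: a GSI that fires twice before the work runs is delivered once, which is one of the trade-offs to weigh against a per-interrupt work item.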

-- 
regards
Yang, Sheng



Re: Current KVM head crashes on startup

2009-02-18 Thread Amit Shah
On (Wed) Feb 18 2009 [10:19:44], Avi Kivity wrote:
> Amit Shah wrote:
> > > The first version generates too much register pressure for some
> > > compilers on i386, leading to compilation failures.  The second
> > > version
> >
> > Is it still valid? I tried with gcc-4.1.2 and that worked fine with the
> > first version. Should we just use that version instead?
>
> I don't see why it would change, unless you can destroy all copies of
> the compilers that fail with it.

I'd like to know which compilers fail to compile it -- maintaining
specific code can introduce such regressions.

qemu too doesn't have a dependency on gcc-3 anymore.

Also, software does periodically bump up the minimum required versions of
its dependencies.

Amit


Re: Current KVM head crashes on startup

2009-02-18 Thread Avi Kivity

Amit Shah wrote:
  
  
> > I don't see why it would change, unless you can destroy all copies of
> > the compilers that fail with it.
>
> I'd like to know which compilers fail to compile it


I don't recall, it probably depends on whether frame pointers are used 
or not as well.



> -- maintaining
> specific code can introduce such regressions.
  


That's a problem with assembly.  x86 and x86_64 are different 
instruction sets.



> qemu too doesn't have a dependency on gcc-3 anymore.
  


We aren't forcing users to use gcc 4.


Also, softwares do periodically bump up the minimum required versions of
their dependencies.


Not for this kind of bug.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH 2/3] KVM: Add gsi_msg_pending_bitmap for MSI-X

2009-02-18 Thread Avi Kivity

Sheng Yang wrote:

On Wednesday 18 February 2009 19:00:53 Avi Kivity wrote:
  

Sheng Yang wrote:


We have to handle more than one interrupt with one handler for MSI-X. So
we need a bitmap to track the triggered interrupts.
  

Can you explain why?



Or how can we know which interrupt happened? Current we scheduled the work 
later, and no more irq information available at that time.
  


We can have a work_struct per interrupt, or we can set a flag in the 
msix array that the interrupt is pending.
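
The second option could look roughly like the sketch below. This is a userspace model only, not kernel code: the struct, the MSIX_PENDING flag and NR_ENTRIES are hypothetical names chosen for illustration, and a real implementation would need atomic bit operations or locking between the interrupt handler and the work function.

```c
#include <assert.h>
#include <stddef.h>

#define MSIX_PENDING 0x1   /* hypothetical flag name */
#define NR_ENTRIES   4     /* hypothetical table size */

/* Hypothetical per-entry record, loosely modeled on the discussion. */
struct guest_msix_entry {
    unsigned int vector;
    unsigned short entry;
    unsigned short flags;
};

/* "Interrupt handler" half: only record which entry fired. */
static void mark_pending(struct guest_msix_entry *tbl, size_t idx)
{
    assert(idx < NR_ENTRIES);
    tbl[idx].flags |= MSIX_PENDING;
}

/*
 * Deferred half: walk the table, inject every pending entry and clear
 * its flag. Returns the number of entries that were pending.
 */
static int drain_pending(struct guest_msix_entry *tbl,
                         void (*inject)(unsigned int vector))
{
    int injected = 0;

    for (size_t i = 0; i < NR_ENTRIES; i++) {
        if (tbl[i].flags & MSIX_PENDING) {
            tbl[i].flags &= (unsigned short)~MSIX_PENDING;
            if (inject)
                inject(tbl[i].vector);
            injected++;
        }
    }
    return injected;
}
```

With this shape a single work item is enough: the handler marks the entry, and the deferred function finds out which interrupts happened by scanning the flags.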


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH 2/3] KVM: Add gsi_msg_pending_bitmap for MSI-X

2009-02-18 Thread Sheng Yang
On Wednesday 18 February 2009 19:29:28 Avi Kivity wrote:
 Sheng Yang wrote:
  On Wednesday 18 February 2009 19:00:53 Avi Kivity wrote:
  Sheng Yang wrote:
  We have to handle more than one interrupt with one handler for MSI-X.
  So we need a bitmap to track the triggered interrupts.
 
  Can you explain why?
 
  Or how can we know which interrupt happened? Current we scheduled the
  work later, and no more irq information available at that time.

 We can have a work_struct per interrupt, or we can set a flag in the
 msix array that the interrupt is pending.

As far as I know, work_struct itself doesn't carry any data. And the host MSI-X 
array is of type msix_entry*, which is used for pci_enable_msix. But modifying 
the type of the guest msix entries should be OK. I will try.
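
On the point that a work item carries no data of its own: one common way around that is to embed the work item in a per-interrupt structure and recover the enclosing structure inside the callback. The following is a minimal userspace sketch of that pattern, under the assumption that struct work_item stands in for the kernel's work_struct; struct irq_slot and its fields are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's work_struct: a callback, no data pointer. */
struct work_item {
    void (*func)(struct work_item *work);
};

/* Same idea as the kernel's container_of(), written in portable C. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical per-interrupt record that embeds its own work item. */
struct irq_slot {
    int host_irq;
    int guest_gsi;
    int handled;
    struct work_item work;
};

/* The callback gets only the work item and walks back to its slot. */
static void irq_slot_work(struct work_item *work)
{
    struct irq_slot *slot = container_of(work, struct irq_slot, work);

    slot->handled++;
}

/* Stand-in for the workqueue running a queued item. */
static void run_work(struct work_item *work)
{
    assert(work && work->func);
    work->func(work);
}
```

Each slot then identifies its own interrupt, at the cost of one work item per entry.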

-- 
regards
Yang, Sheng


Re: copyless virtio net thoughts?

2009-02-18 Thread Herbert Xu
On Wed, Feb 18, 2009 at 10:08:00PM +1030, Rusty Russell wrote:

 4) Multiple queues
 This is Herbert's.  Should be fairly simple to add; it was in the back of my 
 mind when we started.  Not sure whether the queues should be static or 
 dynamic (imagine direct interguest networking, one queue pair for each other 
 guest), and how xmit queues would be selected by the guest (anything 
 anywhere, or dst mac?).

The primary purpose of multiple queues is to maximise CPU utilisation,
so the number of queues is simply dependent on the number of CPUs
allotted to the guest.
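
As a rough sketch of that sizing rule: the helpers below pick one queue per guest CPU, capped by what the device offers, and choose the transmit queue from the sending CPU. The function names and the per-CPU selection policy are assumptions for illustration; the thread does not settle how xmit queues are chosen.

```c
#include <assert.h>

/* Hypothetical policy: one queue per guest CPU, capped by the device limit. */
static unsigned int nr_queues_for(unsigned int guest_cpus,
                                  unsigned int device_max)
{
    assert(guest_cpus > 0 && device_max > 0);
    return guest_cpus < device_max ? guest_cpus : device_max;
}

/* Assumed selection rule: the sending CPU maps onto one of the queues. */
static unsigned int select_txq(unsigned int cpu, unsigned int nr_queues)
{
    assert(nr_queues > 0);
    return cpu % nr_queues;
}
```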

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: Current KVM head crashes on startup

2009-02-18 Thread Amit Shah
On (Wed) Feb 18 2009 [11:26:42], Avi Kivity wrote:
 Amit Shah wrote:
 
 I don't see why it would change, unless you can destroy all copies of 
  the compilers that fail with it.
 

 I'd like to know which compilers fail to compile it

 I don't recall, it probably depends on whether frame pointers are used  
 or not as well.

As far as I know, kvm-userspace build arguments have remained the same
for quite some time. Also, we still pass the -g flag for userspace
compilations.

  -- maintaining
 specific code can introduce such regressions.
   

 That's a problem with assembly.  x86 and x86_64 are different  
 instruction sets.

But the code in question isn't different on the two architectures. Just
a cpuid call that hasn't changed.

 qemu too doesn't have a dependency on gcc-3 anymore.
   

 We aren't forcing users to use gcc 4.

 Also, softwares do periodically bump up the minimum required versions of
 their dependencies.

 Not for this kind of bug.

Just enumerating why just destroying all the copies isn't the only option
:-)

OK, given a patch to have just one version of the cpuid call, would you
be willing to take the risk of finding out which users it breaks for?
I'll send patches to revert and restore correct behaviour for 32-bit if
that does happen.


Re: [PATCH 3/3] KVM: Enable MSI-X for KVM assigned device

2009-02-18 Thread Sheng Yang
On Wed, Feb 18, 2009 at 10:45:19AM +, Avi Kivity wrote:
 Sheng Yang wrote:
 index a2dfbe0..78480d0 100644
 --- a/include/linux/kvm.h
 +++ b/include/linux/kvm.h
 @@ -440,6 +440,9 @@ struct kvm_irq_routing {
  };
   #endif
 +#if defined(CONFIG_X86)
 +#define KVM_CAP_DEVICE_MSIX 26
 +#endif
   

 We switched to a different way of depending on CONFIG_X86, see the other  
 KVM_CAP defines.

Thanks for pointing it out. :)


   struct kvm_assigned_msix_nr {
  __u32 assigned_dev_id;
  __u16 entry_nr;
 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index 4010802..d3acb37 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -280,13 +280,33 @@ static void kvm_free_assigned_irq(struct kvm *kvm,
  	 * now, the kvm state is still legal for probably we also have to wait
  	 * interrupt_work done.
  	 */
 -	disable_irq_nosync(assigned_dev->host_irq);
 -	cancel_work_sync(&assigned_dev->interrupt_work);
 +	if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_MSIX) {
 +		int i;
 +		for (i = 0; i < assigned_dev->entries_nr; i++)
 +			disable_irq_nosync(assigned_dev->
 +					   host_msix_entries[i].vector);
 +
 +		cancel_work_sync(&assigned_dev->interrupt_work);
 +
 +		for (i = 0; i < assigned_dev->entries_nr; i++)
 +			free_irq(assigned_dev->host_msix_entries[i].vector,
 +				 (void *)assigned_dev);
 +
 +		assigned_dev->entries_nr = 0;
 +		kfree(assigned_dev->host_msix_entries);
 +		kfree(assigned_dev->guest_msix_entries);
 +		pci_disable_msix(assigned_dev->dev);
 +	} else {
 +		/* Deal with MSI and INTx */
 +		disable_irq_nosync(assigned_dev->host_irq);
 +		cancel_work_sync(&assigned_dev->interrupt_work);
   

 How about always have an array?  That will also allow us to deal with  
 INTx where x=B,C,D.

 Currently for MSI and INTx the array will hold just one active element.

So array, or bitmap? I remember I changed it to a bitmap according to your
first comment...

OK. I think array is reasonable, but the length is a problem, as I did before. 
How long would you like?

-- 
regards
Yang, Sheng


Re: Current KVM head crashes on startup

2009-02-18 Thread Avi Kivity

Amit Shah wrote:
I don't recall, it probably depends on whether frame pointers are used  
or not as well.



As far as I know, kvm-userspace build arguments have remained the same
for quite some time. Also, we still pass the -g flag for userspace
compilations.

  


Some distros change CFLAGS.


 -- maintaining
specific code can introduce such regressions.
  
  
That's a problem with assembly.  x86 and x86_64 are different  
instruction sets.



But the code in question isn't different on the two architectures. Just
a cpuid call that hasn't changed.

  


The amount of available registers is different, as is the specification 
of which registers may be clobbered.
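
One way to sidestep hand-written constraints on both architectures is sketched below. This is not the code from helper.c: it takes the same (function, count) inputs as host_cpuid() but leaves register allocation to the compiler by using the __cpuid_count macro from the GCC/Clang <cpuid.h> header. The name host_cpuid_sketch and the regs[] layout are assumptions made for the example, and on non-x86 hosts the function simply reports that cpuid is unavailable.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>              /* GCC/Clang helper header */
#define HAVE_HOST_CPUID 1
#else
#define HAVE_HOST_CPUID 0
#endif

/*
 * Fills regs[0..3] with eax, ebx, ecx, edx for the given leaf/subleaf.
 * Returns 1 on x86 hosts, 0 (with regs zeroed) elsewhere.
 */
static int host_cpuid_sketch(uint32_t function, uint32_t count,
                             uint32_t regs[4])
{
    memset(regs, 0, 4 * sizeof(regs[0]));
#if HAVE_HOST_CPUID
    unsigned int a, b, c, d;

    /* The header macro supplies the asm and its constraints. */
    __cpuid_count(function, count, a, b, c, d);
    regs[0] = a;
    regs[1] = b;
    regs[2] = c;
    regs[3] = d;
    return 1;
#else
    (void)function;
    (void)count;
    return 0;
#endif
}
```

This relies on a compiler-provided header, so it does not help with compilers that lack it; it only illustrates that the i386 and x86_64 cases need not be written out by hand.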



OK, given a patch to have just one version of the cpuid call, would you
be willing to take the risk of finding out which users it breaks for?
I'll send patches to revert and restore correct behaviour for 32-bit if
that does happen.
  


This is in upstream qemu so it's not my call but I wouldn't recommend 
breaking the build when a trivial one liner is possible.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH 3/3] KVM: Enable MSI-X for KVM assigned device

2009-02-18 Thread Avi Kivity

Sheng Yang wrote:



  struct kvm_assigned_msix_nr {
__u32 assigned_dev_id;
__u16 entry_nr;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4010802..d3acb37 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -280,13 +280,33 @@ static void kvm_free_assigned_irq(struct kvm *kvm,
 	 * now, the kvm state is still legal for probably we also have to wait
 	 * interrupt_work done.
 	 */
-	disable_irq_nosync(assigned_dev->host_irq);
-	cancel_work_sync(&assigned_dev->interrupt_work);
+	if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_MSIX) {
+		int i;
+		for (i = 0; i < assigned_dev->entries_nr; i++)
+			disable_irq_nosync(assigned_dev->
+					   host_msix_entries[i].vector);
+
+		cancel_work_sync(&assigned_dev->interrupt_work);
+
+		for (i = 0; i < assigned_dev->entries_nr; i++)
+			free_irq(assigned_dev->host_msix_entries[i].vector,
+				 (void *)assigned_dev);
+
+		assigned_dev->entries_nr = 0;
+		kfree(assigned_dev->host_msix_entries);
+		kfree(assigned_dev->guest_msix_entries);
+		pci_disable_msix(assigned_dev->dev);
+	} else {
+		/* Deal with MSI and INTx */
+		disable_irq_nosync(assigned_dev->host_irq);
+		cancel_work_sync(&assigned_dev->interrupt_work);
  
  
How about always have an array?  That will also allow us to deal with  
INTx where x=B,C,D.


Currently for MSI and INTx the array will hold just one active element.



So array, or bitmap? I remember I changed it to a bitmap according to your
first comment...
  


Which bitmap?  I'm confused.

I'm talking about unifying the existing array 
(assigned_dev->host_msix_entries[]) with ->host_irq.  Also since we need 
an array for INTx when a function uses INT[BCD].


So we'll have assigned_dev->host_irqs[], each entry can be INTx or MSI 
or MSIx.



OK. I think array is reasonable, but the length is a problem, as I did before. 
How long would you like?
  


MAX(4, KVM_MAX_MSIX_ENTRIES), no?
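
A minimal sketch of what such a unified array could look like, as a userspace model. Everything here is hypothetical: the slot layout, the enum, and the value 512 for KVM_MAX_MSIX_ENTRIES are assumptions for illustration, and release() stands in for whatever disable/free calls the real teardown needs.

```c
#include <assert.h>
#include <stddef.h>

#define KVM_MAX_MSIX_ENTRIES 512            /* assumed value */
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define MAX_HOST_IRQS MAX(4, KVM_MAX_MSIX_ENTRIES)

enum irq_kind { IRQ_UNUSED = 0, IRQ_INTX, IRQ_MSI, IRQ_MSIX };

/* Hypothetical element of host_irqs[]. */
struct host_irq_slot {
    enum irq_kind kind;
    int irq;                                /* host interrupt number */
};

struct assigned_dev_sketch {
    struct host_irq_slot host_irqs[MAX_HOST_IRQS];
    size_t nr_irqs;     /* active entries; 1 for MSI or a single INTx */
};

/* One teardown loop for every interrupt type; returns the number released. */
static size_t release_all_irqs(struct assigned_dev_sketch *dev,
                               void (*release)(int irq))
{
    size_t released = 0;

    assert(dev->nr_irqs <= MAX_HOST_IRQS);
    for (size_t i = 0; i < dev->nr_irqs; i++) {
        if (dev->host_irqs[i].kind == IRQ_UNUSED)
            continue;
        if (release)
            release(dev->host_irqs[i].irq);
        dev->host_irqs[i].kind = IRQ_UNUSED;
        released++;
    }
    dev->nr_irqs = 0;
    return released;
}
```

The point of the single array is that the MSI-X branch and the MSI/INTx branch of the teardown collapse into the same loop.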

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH 7/7] [V3] kvm: qemu: fix hot remove assigned device with iommu

2009-02-18 Thread Avi Kivity

Han, Weidong wrote:

Device assignment hotplug doesn't work on the current tree. I had a quick glance 
and found that device assignment wasn't considered: qemu_system_hot_assign_device 
is not used at all. I will fix it.
  


I noticed that during the merge, but wasn't familiar enough with the code to 
fix it myself.  Thanks for taking care of it.


We should work to merge the code in upstream qemu so that this doesn't 
happen again.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: Recent kvm and vmware server comparisons?

2009-02-18 Thread Thomas Fjellstrom
On Tuesday 10 February 2009, Thomas Fjellstrom wrote:
 I've temporarily got vmware server running on my new server, and intend
 to migrate over to kvm as soon as possible, if it provides enough incentive
 (extra performance, features). Currently I'm waiting for full iommu support
 in the kernel, modules and userspace, and didn't plan to migrate till I had
 hardware that could do iommu, kvm fully supported iommu + DMA for devices
 passed through, could also pass through more than one device per guest (I
 saw hints that the intel iommu implementation can only do one device per
 guest? please tell me I'm wrong, it seems like an odd design choice to
 make), and full migration.

 But if I can get enough performance over vmware server 2 with plain old kvm
 + virtio, I'd happily migrate.

 I saw a message late last year comparing the two, but I know how quickly
 things change in the OSS world, and I also intend to use raw devices
 (possibly AoE) for guest disks (not qcow or anything like it), and virtio
 for networking.

 So has anyone tested the two lately? Got any experiences you'd like to
 share?

I suppose no-one has any?

-- 
Thomas Fjellstrom
tfjellst...@shaw.ca


Re: [PATCH 3/3] KVM: Enable MSI-X for KVM assigned device

2009-02-18 Thread Sheng Yang
On Wednesday 18 February 2009 20:36:10 Avi Kivity wrote:
 Sheng Yang wrote:
struct kvm_assigned_msix_nr {
__u32 assigned_dev_id;
__u16 entry_nr;
  diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
  index 4010802..d3acb37 100644
  --- a/virt/kvm/kvm_main.c
  +++ b/virt/kvm/kvm_main.c
  @@ -280,13 +280,33 @@ static void kvm_free_assigned_irq(struct kvm *kvm,
   	 * now, the kvm state is still legal for probably we also have to wait
   	 * interrupt_work done.
   	 */
  -	disable_irq_nosync(assigned_dev->host_irq);
  -	cancel_work_sync(&assigned_dev->interrupt_work);
  +	if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_MSIX) {
  +		int i;
  +		for (i = 0; i < assigned_dev->entries_nr; i++)
  +			disable_irq_nosync(assigned_dev->
  +					   host_msix_entries[i].vector);
  +
  +		cancel_work_sync(&assigned_dev->interrupt_work);
  +
  +		for (i = 0; i < assigned_dev->entries_nr; i++)
  +			free_irq(assigned_dev->host_msix_entries[i].vector,
  +				 (void *)assigned_dev);
  +
  +		assigned_dev->entries_nr = 0;
  +		kfree(assigned_dev->host_msix_entries);
  +		kfree(assigned_dev->guest_msix_entries);
  +		pci_disable_msix(assigned_dev->dev);
  +	} else {
  +		/* Deal with MSI and INTx */
  +		disable_irq_nosync(assigned_dev->host_irq);
  +		cancel_work_sync(&assigned_dev->interrupt_work);
 
  How about always have an array?  That will also allow us to deal with
  INTx where x=B,C,D.
 
  Currently for MSI and INTx the array will hold just one active element.
 
  So array, or bitmap? I remember I changed it to a bitmap according to your
  first comment...

 Which bitmap?  I'm confused.

 I'm talking about unifying the existing array
 (assigned_dev->host_msix_entries[]) with ->host_irq.  Also since we need
 an array for INTx when a function uses INT[BCD].

 So we'll have assigned_dev->host_irqs[], each entry can be INTx or MSI
 or MSIx.

  OK. I think array is reasonable, but the length is a problem, as I did
  before. How long would you like?

 MAX(4, KVM_MAX_MSIX_ENTRIES), no?

Oh, yeah, I misunderstood it(wrong context)...

Need more adjustment on the type, for host_msix_entries is used with 
pci_enable_msix. So I'd like to put it a bit later.

-- 
regards
Yang, Sheng


[PATCH 0/3] KVM SoftMMU fixes

2009-02-18 Thread Joerg Roedel
Hi Avi, Marcelo,

this small patch series fixes two issues and includes one cleanup I ran
into while hacking on the KVM SoftMMU code. Please consider applying.

Joerg

diffstat:

 arch/x86/kvm/mmu.c  |   10 +++---
 virt/kvm/kvm_main.c |6 --
 2 files changed, 7 insertions(+), 9 deletions(-)





[PATCH 2/3] kvm mmu: remove redundant check in mmu_set_spte

2009-02-18 Thread Joerg Roedel
The following code flow is unnecessary:

	if (largepage)
		was_rmapped = is_large_pte(*shadow_pte);
	else
		was_rmapped = 1;

The is_large_pte() function will always evaluate to one here because the
(largepage && !is_large_pte) case is already handled in the first
if-clause. So we can remove this check and set was_rmapped to one always
here.
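
The argument can be checked exhaustively with a small model. The sketch below is a simplification, not the kernel function: the three booleans stand for largepage, is_large_pte(*shadow_pte) and "pfn != spte_to_pfn(*shadow_pte)", and -1 marks the paths on which the final else branch is not reached. All names are made up for the example.

```c
#include <assert.h>   /* for the usage check, e.g. assert(variants_agree()) */
#include <stdbool.h>

/* Branch structure before the patch, reduced to its conditions. */
static int was_rmapped_old(bool largepage, bool large_pte, bool pfn_changed)
{
    if (largepage && !large_pte)
        return -1;                      /* first if-clause */
    else if (pfn_changed)
        return -1;                      /* rmap_remove() path */
    else
        return largepage ? large_pte : 1;
}

/* Branch structure after the patch: the final else always yields 1. */
static int was_rmapped_new(bool largepage, bool large_pte, bool pfn_changed)
{
    if (largepage && !large_pte)
        return -1;
    else if (pfn_changed)
        return -1;
    else
        return 1;
}

/* Compare both variants over all eight input combinations. */
static bool variants_agree(void)
{
    for (int bits = 0; bits < 8; bits++) {
        bool largepage = bits & 1;
        bool large_pte = bits & 2;
        bool pfn_changed = bits & 4;

        if (was_rmapped_old(largepage, large_pte, pfn_changed) !=
            was_rmapped_new(largepage, large_pte, pfn_changed))
            return false;
    }
    return true;
}
```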

Signed-off-by: Joerg Roedel joerg.roe...@amd.com
---
 arch/x86/kvm/mmu.c |8 ++--
 1 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index ef060ec..c90b4b2 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1791,12 +1791,8 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 			pgprintk("hfn old %lx new %lx\n",
 				 spte_to_pfn(*shadow_pte), pfn);
 			rmap_remove(vcpu->kvm, shadow_pte);
-		} else {
-			if (largepage)
-				was_rmapped = is_large_pte(*shadow_pte);
-			else
-				was_rmapped = 1;
-		}
+		} else
+			was_rmapped = 1;
 	}
 	if (set_spte(vcpu, shadow_pte, pte_access, user_fault, write_fault,
 		      dirty, largepage, global, gfn, pfn, speculative, true)) {
-- 
1.5.6.4




[PATCH 1/3] kvm mmu: handle compound pages in kvm_is_mmio_pfn

2009-02-18 Thread Joerg Roedel
The function kvm_is_mmio_pfn is called before put_page is called on a
page by KVM. This is a problem when this function is called on a
struct page which is part of a compound page. It does not test the
reserved flag of the compound page but that of the struct page within the
compound page. This matters when KVM works with hugepages allocated
at boot time. These pages have the reserved bit set in all tail pages.
Only the flag in the compound head is cleared. KVM would not put such a
page, which results in a memory leak.
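
The effect can be shown with a small userspace model. This is only a sketch: struct page_model and the helper names are hypothetical and merely mimic the head/tail relationship, they are not the kernel's struct page or compound_head().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page. */
struct page_model {
    bool reserved;
    struct page_model *head;    /* NULL for a head or non-compound page */
};

/* Mimics compound_head(): tail pages resolve to their head page. */
static struct page_model *compound_head_model(struct page_model *page)
{
    assert(page != NULL);
    return page->head ? page->head : page;
}

/* Old check: looks at the flag of whichever page was passed in. */
static bool is_mmio_old(struct page_model *page)
{
    return page->reserved;
}

/* New check: always looks at the flag of the compound head. */
static bool is_mmio_new(struct page_model *page)
{
    return compound_head_model(page)->reserved;
}
```

For a tail page of a boot-time hugepage (reserved set on the tail, cleared on the head), the old check reports MMIO and the page is never put, while the new check reports ordinary memory.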

Signed-off-by: Joerg Roedel joerg.roe...@amd.com
---
 virt/kvm/kvm_main.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 266bdaf..0ed662d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -535,8 +535,10 @@ static inline int valid_vcpu(int n)
 
 inline int kvm_is_mmio_pfn(pfn_t pfn)
 {
-   if (pfn_valid(pfn))
-   return PageReserved(pfn_to_page(pfn));
+   if (pfn_valid(pfn)) {
+   struct page *page = compound_head(pfn_to_page(pfn));
+   return PageReserved(page);
+   }
 
return true;
 }
-- 
1.5.6.4




[PATCH 3/3] kvm mmu: alloc shadow pages with __GFP_ZERO

2009-02-18 Thread Joerg Roedel
Not using __GFP_ZERO when allocating shadow pages triggers the
assertion in the kvm_mmu_alloc_page() when MMU debugging is enabled.

Signed-off-by: Joerg Roedel joerg.roe...@amd.com
---
 arch/x86/kvm/mmu.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c90b4b2..d93ecec 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -301,7 +301,7 @@ static int mmu_topup_memory_cache_page(struct kvm_mmu_memory_cache *cache,
 	if (cache->nobjs >= min)
 		return 0;
 	while (cache->nobjs < ARRAY_SIZE(cache->objects)) {
-		page = alloc_page(GFP_KERNEL);
+		page = alloc_page(GFP_KERNEL | __GFP_ZERO);
 		if (!page)
 			return -ENOMEM;
 		set_page_private(page, 0);
-- 
1.5.6.4




Re: [PATCH 3/3] kvm mmu: alloc shadow pages with __GFP_ZERO

2009-02-18 Thread Avi Kivity

Joerg Roedel wrote:

Not using __GFP_ZERO when allocating shadow pages triggers the
assertion in the kvm_mmu_alloc_page() when MMU debugging is enabled.

Signed-off-by: Joerg Roedel joerg.roe...@amd.com
---
 arch/x86/kvm/mmu.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c90b4b2..d93ecec 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -301,7 +301,7 @@ static int mmu_topup_memory_cache_page(struct kvm_mmu_memory_cache *cache,
 	if (cache->nobjs >= min)
 		return 0;
 	while (cache->nobjs < ARRAY_SIZE(cache->objects)) {
-		page = alloc_page(GFP_KERNEL);
+		page = alloc_page(GFP_KERNEL | __GFP_ZERO);
 		if (!page)
 			return -ENOMEM;
 		set_page_private(page, 0);
  


What is the warning?

Adding __GFP_ZERO here will cause us to clear the page twice, which is 
wasteful.



--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH 3/3] kvm mmu: alloc shadow pages with __GFP_ZERO

2009-02-18 Thread Avi Kivity

Joerg Roedel wrote:

The assertion which the attached patch removes fails sometimes. Removing
this assertion is the alternative solution to this problem ;-)

From ca45f3a2e45cd7e76ca624bb1098329db8ff83ab Mon Sep 17 00:00:00 2001
From: Joerg Roedel joerg.roe...@amd.com
Date: Wed, 18 Feb 2009 14:51:13 +0100
Subject: [PATCH] kvm mmu: remove assertion in kvm_mmu_alloc_page

Signed-off-by: Joerg Roedel joerg.roe...@amd.com
---
 arch/x86/kvm/mmu.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index d93ecec..b226973 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -802,7 +802,6 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
 	list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
 	INIT_LIST_HEAD(&sp->oos_link);
-	ASSERT(is_empty_shadow_page(sp->spt));
 	bitmap_zero(sp->slot_bitmap, KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS);
 	sp->multimapped = 0;
 	sp->parent_pte = parent_pte;
  


sp->spt is allocated using mmu_memory_cache_alloc(), which zeros the 
page.  How can the assertion fail?



--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH 3/3] kvm mmu: alloc shadow pages with __GFP_ZERO

2009-02-18 Thread Joerg Roedel
On Wed, Feb 18, 2009 at 01:47:04PM +, Avi Kivity wrote:
 Joerg Roedel wrote:
 Not using __GFP_ZERO when allocating shadow pages triggers the
 assertion in the kvm_mmu_alloc_page() when MMU debugging is enabled.
 
 Signed-off-by: Joerg Roedel joerg.roe...@amd.com
 ---
  arch/x86/kvm/mmu.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)
 
 diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
 index c90b4b2..d93ecec 100644
 --- a/arch/x86/kvm/mmu.c
 +++ b/arch/x86/kvm/mmu.c
 @@ -301,7 +301,7 @@ static int mmu_topup_memory_cache_page(struct kvm_mmu_memory_cache *cache,
  	if (cache->nobjs >= min)
  		return 0;
  	while (cache->nobjs < ARRAY_SIZE(cache->objects)) {
 -		page = alloc_page(GFP_KERNEL);
 +		page = alloc_page(GFP_KERNEL | __GFP_ZERO);
  		if (!page)
  			return -ENOMEM;
  		set_page_private(page, 0);
   
 
 What is the warning?
 
 Adding __GFP_ZERO here will cause us to clear the page twice, which is 
 wasteful.

The assertion which the attached patch removes fails sometimes. Removing
this assertion is the alternative solution to this problem ;-)

From ca45f3a2e45cd7e76ca624bb1098329db8ff83ab Mon Sep 17 00:00:00 2001
From: Joerg Roedel joerg.roe...@amd.com
Date: Wed, 18 Feb 2009 14:51:13 +0100
Subject: [PATCH] kvm mmu: remove assertion in kvm_mmu_alloc_page

Signed-off-by: Joerg Roedel joerg.roe...@amd.com
---
 arch/x86/kvm/mmu.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index d93ecec..b226973 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -802,7 +802,6 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
 	list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
 	INIT_LIST_HEAD(&sp->oos_link);
-	ASSERT(is_empty_shadow_page(sp->spt));
 	bitmap_zero(sp->slot_bitmap, KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS);
 	sp->multimapped = 0;
 	sp->parent_pte = parent_pte;
-- 
1.5.6.4

-- 
   | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Jochen Polster, Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632



Re: [PATCH 3/3] kvm mmu: alloc shadow pages with __GFP_ZERO

2009-02-18 Thread Joerg Roedel
On Wed, Feb 18, 2009 at 02:03:34PM +, Avi Kivity wrote:
 Joerg Roedel wrote:
 The assertion which the attached patch removes fails sometimes. Removing
 this assertion is the alternative solution to this problem ;-)
 
 From ca45f3a2e45cd7e76ca624bb1098329db8ff83ab Mon Sep 17 00:00:00 2001
 From: Joerg Roedel joerg.roe...@amd.com
 Date: Wed, 18 Feb 2009 14:51:13 +0100
 Subject: [PATCH] kvm mmu: remove assertion in kvm_mmu_alloc_page
 
 Signed-off-by: Joerg Roedel joerg.roe...@amd.com
 ---
  arch/x86/kvm/mmu.c |1 -
  1 files changed, 0 insertions(+), 1 deletions(-)
 
 diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
 index d93ecec..b226973 100644
 --- a/arch/x86/kvm/mmu.c
 +++ b/arch/x86/kvm/mmu.c
 @@ -802,7 +802,6 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
  	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
  	list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
  	INIT_LIST_HEAD(&sp->oos_link);
 -	ASSERT(is_empty_shadow_page(sp->spt));
  	bitmap_zero(sp->slot_bitmap, KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS);
  	sp->multimapped = 0;
  	sp->parent_pte = parent_pte;
   
 
 sp->spt is allocated using mmu_memory_cache_alloc(), which zeros the page.  
 How can the assertion fail?

In the code I see (current kvm-git) mmu_memory_cache_alloc() does not zero
anything. It takes the page from the preallocated pool and returns it.
The pool itself is filled by mmu_topup_memory_caches(), which calls
mmu_topup_memory_cache_page() to fill the mmu_page_cache (from which the
sp->spt page is allocated later). And the mmu_topup_memory_cache_page()
function calls alloc_page() and does not zero the result. This lets the
assertion trigger.
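
The flow described above can be modeled in userspace as follows. This is a sketch under stated assumptions, not the kernel code: the cache layout and all names are hypothetical, malloc() stands in for alloc_page(), and the non-zeroing case fills pages with a poison byte to mimic stale contents deterministically.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_BYTES 4096
#define CACHE_OBJS 4

/* Hypothetical stand-in for the preallocated page pool. */
struct page_cache_model {
    int nobjs;
    unsigned char *objects[CACHE_OBJS];
};

/* Fill the pool; 'zero' plays the role of __GFP_ZERO. */
static int topup(struct page_cache_model *cache, bool zero)
{
    while (cache->nobjs < CACHE_OBJS) {
        unsigned char *page = malloc(PAGE_BYTES);

        if (!page)
            return -1;
        /* 0xAA simulates whatever an unzeroed page happened to hold. */
        memset(page, zero ? 0 : 0xAA, PAGE_BYTES);
        cache->objects[cache->nobjs++] = page;
    }
    return 0;
}

/* Hands out a pooled page as-is; nothing is cleared here. */
static unsigned char *cache_alloc(struct page_cache_model *cache)
{
    assert(cache->nobjs > 0);
    return cache->objects[--cache->nobjs];
}

/* Counterpart of the emptiness check behind the assertion. */
static bool is_empty_page(const unsigned char *page)
{
    for (size_t i = 0; i < PAGE_BYTES; i++)
        if (page[i])
            return false;
    return true;
}

/* Release whatever is still in the pool. */
static void drain(struct page_cache_model *cache)
{
    while (cache->nobjs > 0)
        free(cache->objects[--cache->nobjs]);
}
```

Since the allocation step only pops from the pool, the emptiness check can hold only if the top-up step zeroed the page.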

Joerg

-- 
   | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Jochen Polster, Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632



Re: [PATCH 3/3] kvm mmu: alloc shadow pages with __GFP_ZERO

2009-02-18 Thread Avi Kivity

Joerg Roedel wrote:

sp->spt is allocated using mmu_memory_cache_alloc(), which zeros the page.  How 
can the assertion fail?



In the code I see (current kvm-git) mmu_memory_cache_alloc() does not zero
anything. It takes the page from the preallocated pool and returns it.
The pool itself is filled by mmu_topup_memory_caches(), which calls
mmu_topup_memory_cache_page() to fill the mmu_page_cache (from which the
sp->spt page is allocated later). And the mmu_topup_memory_cache_page()
function calls alloc_page() and does not zero the result. This lets the
assertion trigger.
  


Right, I was looking at the 2.6.29 tree.  The patch is correct (and the 
others look good as well).  As usual, I'd like Marcelo to take a look as 
well.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



[PATCH 0/3 v4] MSI-X enabling

2009-02-18 Thread Sheng Yang



[PATCH 2/4] KVM: Ioctls for init MSI-X entry

2009-02-18 Thread Sheng Yang
Introduce two ioctls, KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY.

These two ioctls are used by userspace to specify the guest device's MSI-X entry
number and to correlate each MSI-X entry with a GSI during the initialization stage.

MSI-X should be fully initialized before enabling.

Changing the MSI-X entry number is not supported for now.

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 include/linux/kvm.h  |   18 
 include/linux/kvm_host.h |   10 
 virt/kvm/kvm_main.c  |  104 ++
 3 files changed, 132 insertions(+), 0 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index d742cbf..8e14629 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -475,6 +475,10 @@ struct kvm_irq_routing {
 #define KVM_ASSIGN_IRQ _IOW(KVMIO, 0x70, \
struct kvm_assigned_irq)
 #define KVM_REINJECT_CONTROL  _IO(KVMIO, 0x71)
+#define KVM_ASSIGN_SET_MSIX_NR \
+   _IOW(KVMIO, 0x72, struct kvm_assigned_msix_nr)
+#define KVM_ASSIGN_SET_MSIX_ENTRY \
+   _IOW(KVMIO, 0x73, struct kvm_assigned_msix_entry)
 
 /*
  * ioctls for vcpu fds
@@ -595,4 +599,18 @@ struct kvm_assigned_irq {
 #define KVM_DEV_IRQ_ASSIGN_MSI_ACTION	KVM_DEV_IRQ_ASSIGN_ENABLE_MSI
 #define KVM_DEV_IRQ_ASSIGN_ENABLE_MSI	(1 << 0)
 
+struct kvm_assigned_msix_nr {
+   __u32 assigned_dev_id;
+   __u16 entry_nr;
+   __u16 padding;
+};
+
+#define KVM_MAX_MSIX_PER_DEV   512
+struct kvm_assigned_msix_entry {
+   __u32 assigned_dev_id;
+   __u32 gsi;
+   __u16 entry; /* The index of entry in the MSI-X table */
+   __u16 padding[3];
+};
+
 #endif
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7c7096d..33ed9f8 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -319,6 +319,12 @@ struct kvm_irq_ack_notifier {
void (*irq_acked)(struct kvm_irq_ack_notifier *kian);
 };
 
+struct kvm_guest_msix_entry {
+   u32 vector;
+   u16 entry;
+   u16 flags;
+};
+
 struct kvm_assigned_dev_kernel {
struct kvm_irq_ack_notifier ack_notifier;
struct work_struct interrupt_work;
@@ -326,13 +332,17 @@ struct kvm_assigned_dev_kernel {
int assigned_dev_id;
int host_busnr;
int host_devfn;
+   unsigned int entries_nr;
int host_irq;
bool host_irq_disabled;
+   struct msix_entry *host_msix_entries;
int guest_irq;
+   struct kvm_guest_msix_entry *guest_msix_entries;
 #define KVM_ASSIGNED_DEV_GUEST_INTX	(1 << 0)
 #define KVM_ASSIGNED_DEV_GUEST_MSI	(1 << 1)
 #define KVM_ASSIGNED_DEV_HOST_INTX	(1 << 8)
 #define KVM_ASSIGNED_DEV_HOST_MSI	(1 << 9)
+#define KVM_ASSIGNED_DEV_MSIX		((1 << 2) | (1 << 10))
unsigned long irq_requested_type;
int irq_source_id;
int flags;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 266bdaf..b373466 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1593,6 +1593,88 @@ static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset)
 	return 0;
 }
 
+#ifdef __KVM_HAVE_MSIX
+static int kvm_vm_ioctl_set_msix_nr(struct kvm *kvm,
+				    struct kvm_assigned_msix_nr *entry_nr)
+{
+	int r = 0;
+	struct kvm_assigned_dev_kernel *adev;
+
+	mutex_lock(&kvm->lock);
+
+	adev = kvm_find_assigned_dev(&kvm->arch.assigned_dev_head,
+				     entry_nr->assigned_dev_id);
+	if (!adev) {
+		r = -EINVAL;
+		goto msix_nr_out;
+	}
+
+	if (adev->entries_nr == 0) {
+		adev->entries_nr = entry_nr->entry_nr;
+		if (adev->entries_nr == 0 ||
+		    adev->entries_nr >= KVM_MAX_MSIX_PER_DEV) {
+			r = -EINVAL;
+			goto msix_nr_out;
+		}
+
+		adev->host_msix_entries = kzalloc(sizeof(struct msix_entry) *
+						entry_nr->entry_nr,
+						GFP_KERNEL);
+		if (!adev->host_msix_entries) {
+			r = -ENOMEM;
+			goto msix_nr_out;
+		}
+		adev->guest_msix_entries = kzalloc(
+				sizeof(struct kvm_guest_msix_entry) *
+				entry_nr->entry_nr, GFP_KERNEL);
+		if (!adev->guest_msix_entries) {
+			kfree(adev->host_msix_entries);
+			r = -ENOMEM;
+			goto msix_nr_out;
+		}
+	} else /* Not allowed set MSI-X number twice */
+		r = -EINVAL;
+msix_nr_out:
+	mutex_unlock(&kvm->lock);
+	return r;
+}
+
+static int kvm_vm_ioctl_set_msix_entry(struct kvm *kvm,
+  struct kvm_assigned_msix_entry *entry)
+{
+   int r = 0, i;
+   struct kvm_assigned_dev_kernel *adev;
+
+   

[PATCH 3/4] KVM: Add MSI-X interrupt injection logic

2009-02-18 Thread Sheng Yang
We have to handle more than one interrupt with one handler for MSI-X. Avi
suggested using a flag to indicate which entries are pending, so here it is.

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 include/linux/kvm_host.h |1 +
 virt/kvm/kvm_main.c  |   66 +-
 2 files changed, 60 insertions(+), 7 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 33ed9f8..5aad46a 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -319,6 +319,7 @@ struct kvm_irq_ack_notifier {
void (*irq_acked)(struct kvm_irq_ack_notifier *kian);
 };
 
+#define KVM_ASSIGNED_MSIX_PENDING  0x1
 struct kvm_guest_msix_entry {
u32 vector;
u16 entry;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b373466..1e80b6e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -95,25 +95,69 @@ static struct kvm_assigned_dev_kernel *kvm_find_assigned_dev(struct list_head *h
 	return NULL;
 }
 
+static int find_index_from_host_irq(struct kvm_assigned_dev_kernel
+				      *assigned_dev, int irq)
+{
+	int i, index;
+	struct msix_entry *host_msix_entries;
+
+	host_msix_entries = assigned_dev->host_msix_entries;
+
+	index = -1;
+	for (i = 0; i < assigned_dev->entries_nr; i++)
+		if (irq == host_msix_entries[i].vector) {
+			index = i;
+			break;
+		}
+	if (index < 0) {
+		printk(KERN_WARNING "Fail to find correlated MSI-X entry!\n");
+		return 0;
+	}
+
+	return index;
+}
+
 static void kvm_assigned_dev_interrupt_work_handler(struct work_struct *work)
 {
 	struct kvm_assigned_dev_kernel *assigned_dev;
+	struct kvm *kvm;
+	int irq, i;
 
 	assigned_dev = container_of(work, struct kvm_assigned_dev_kernel,
 				    interrupt_work);
+	kvm = assigned_dev->kvm;
 
 	/* This is taken to safely inject irq inside the guest. When
 	 * the interrupt injection (or the ioapic code) uses a
 	 * finer-grained lock, update this
 	 */
-	mutex_lock(&assigned_dev->kvm->lock);
-	kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id,
-		    assigned_dev->guest_irq, 1);
-
-	if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_GUEST_MSI) {
-		enable_irq(assigned_dev->host_irq);
-		assigned_dev->host_irq_disabled = false;
+	mutex_lock(&kvm->lock);
+	if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_MSIX) {
+		struct kvm_guest_msix_entry *guest_entries =
+			assigned_dev->guest_msix_entries;
+		for (i = 0; i < assigned_dev->entries_nr; i++) {
+			if (!(guest_entries[i].flags &
+					KVM_ASSIGNED_MSIX_PENDING))
+				continue;
+			guest_entries[i].flags &= ~KVM_ASSIGNED_MSIX_PENDING;
+			kvm_set_irq(assigned_dev->kvm,
+				    assigned_dev->irq_source_id,
+				    guest_entries[i].vector, 1);
+			irq = assigned_dev->host_msix_entries[i].vector;
+			if (irq != 0)
+				enable_irq(irq);
+			assigned_dev->host_irq_disabled = false;
+		}
+	} else {
+		kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id,
+			    assigned_dev->guest_irq, 1);
+		if (assigned_dev->irq_requested_type &
+				KVM_ASSIGNED_DEV_GUEST_MSI) {
+			enable_irq(assigned_dev->host_irq);
+			assigned_dev->host_irq_disabled = false;
+		}
 	}
+
 	mutex_unlock(&assigned_dev->kvm->lock);
 }
 
@@ -122,6 +166,14 @@ static irqreturn_t kvm_assigned_dev_intr(int irq, void *dev_id)
 	struct kvm_assigned_dev_kernel *assigned_dev =
 		(struct kvm_assigned_dev_kernel *) dev_id;
 
+	if (assigned_dev->irq_requested_type == KVM_ASSIGNED_DEV_MSIX) {
+		int index = find_index_from_host_irq(assigned_dev, irq);
+		if (index < 0)
+			return IRQ_HANDLED;
+		assigned_dev->guest_msix_entries[index].flags |=
+			KVM_ASSIGNED_MSIX_PENDING;
+	}
+
 	schedule_work(&assigned_dev->interrupt_work);
 
 	disable_irq_nosync(irq);
-- 
1.5.4.5
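
The pending-flag scheme in this patch can be sketched in user space roughly as
follows. This is only an illustration, not kernel code: demo_entry,
mark_pending(), drain_pending() and DEMO_PENDING are hypothetical stand-ins for
kvm_guest_msix_entry, the interrupt handler, the work handler and
KVM_ASSIGNED_MSIX_PENDING, and locking and IRQ masking are left out.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PENDING 0x1 /* stand-in for KVM_ASSIGNED_MSIX_PENDING */

/* Hypothetical user-space model of one MSI-X entry. */
struct demo_entry {
    int host_vector;      /* host IRQ number for this entry */
    int guest_vector;     /* what would be injected into the guest */
    unsigned short flags; /* DEMO_PENDING while an interrupt is waiting */
};

/* "Interrupt handler" side: find the entry for a host IRQ and flag it.
 * Returns the entry index, or -1 if the IRQ matches no entry. */
int mark_pending(struct demo_entry *e, size_t n, int host_irq)
{
    assert(e != NULL);
    for (size_t i = 0; i < n; i++) {
        if (e[i].host_vector == host_irq) {
            e[i].flags |= DEMO_PENDING;
            return (int)i;
        }
    }
    return -1;
}

/* "Work handler" side: deliver every pending entry once and clear its flag.
 * Delivered guest vectors are written to out[], which must have room for
 * n ints. Returns how many entries were delivered. */
size_t drain_pending(struct demo_entry *e, size_t n, int *out)
{
    size_t delivered = 0;

    assert(e != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!(e[i].flags & DEMO_PENDING))
            continue;
        e[i].flags &= ~DEMO_PENDING;
        out[delivered++] = e[i].guest_vector;
    }
    return delivered;
}
```

One difference worth noting: the sketch returns -1 for an unknown IRQ. As far
as the posted diff shows, find_index_from_host_irq() logs a warning and returns
0 in that case, so the "index < 0" check in its caller would not trigger.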



[PATCH 4/4] KVM: Enable MSI-X for KVM assigned device

2009-02-18 Thread Sheng Yang
This patch finally enables MSI-X.

What we need for MSI-X:
1. Intercept one page in the MMIO region of the device, so that we can get the
guest's desired MSI-X table and set up the real one. This is now done by the
guest and transferred to the kernel using the ioctls KVM_SET_MSIX_NR and
KVM_SET_MSIX_ENTRY.

2. Information about the incoming interrupt. One device can now have more than
one interrupt, and they are all handled by one workqueue structure, so we need
to identify them. The previous patch enables gsi_msg_pending_bitmap to get this
done.

3. Mapping from host IRQ to guest gsi, as well as from guest gsi to the real
MSI/MSI-X message address/data. We use the same entry number for the host and
the guest here, so that it is easy to find the correlated guest gsi.

What we lack for now:
1. The PCI spec says nothing can exist in the same MMIO page as the MSI-X
table, except the pending bits. The patch ignores the pending bits as a first
step (so they are always 0 - no pending).

2. The PCI spec allows the MSI-X table to be changed dynamically. That means the
OS can enable MSI-X, then mask one MSI-X entry, modify it, and unmask it. The
patch doesn't support this, and Linux doesn't work this way either.

3. The patch doesn't implement MSI-X mask-all or masking of a single entry. I
will implement the former in drivers/pci/msi.c later. For a single entry,
userspace should be responsible for handling it.

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 arch/x86/include/asm/kvm.h |1 +
 include/linux/kvm.h|8 
 virt/kvm/kvm_main.c|   98 +---
 3 files changed, 101 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
index dc3f6cf..125be8b 100644
--- a/arch/x86/include/asm/kvm.h
+++ b/arch/x86/include/asm/kvm.h
@@ -16,6 +16,7 @@
 #define __KVM_HAVE_MSI
 #define __KVM_HAVE_USER_NMI
 #define __KVM_HAVE_GUEST_DEBUG
+#define __KVM_HAVE_MSIX
 
 /* Architectural interrupt line count. */
 #define KVM_NR_INTERRUPTS 256
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 8e14629..470a43c 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -402,6 +402,9 @@ struct kvm_trace_rec {
 #ifdef __KVM_HAVE_IOAPIC
 #define KVM_CAP_IRQ_ROUTING 25
 #endif
+#ifdef __KVM_HAVE_MSIX
+#define KVM_CAP_DEVICE_MSIX 26
+#endif
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -599,6 +602,11 @@ struct kvm_assigned_irq {
 #define KVM_DEV_IRQ_ASSIGN_MSI_ACTION  KVM_DEV_IRQ_ASSIGN_ENABLE_MSI
 #define KVM_DEV_IRQ_ASSIGN_ENABLE_MSI	(1 << 0)
 
+#define KVM_DEV_IRQ_ASSIGN_MSIX_ACTION  (KVM_DEV_IRQ_ASSIGN_ENABLE_MSIX |\
+					KVM_DEV_IRQ_ASSIGN_MASK_MSIX)
+#define KVM_DEV_IRQ_ASSIGN_ENABLE_MSIX  (1 << 1)
+#define KVM_DEV_IRQ_ASSIGN_MASK_MSIX    (1 << 2)
+
 struct kvm_assigned_msix_nr {
__u32 assigned_dev_id;
__u16 entry_nr;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1e80b6e..b1f2399 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -236,13 +236,33 @@ static void kvm_free_assigned_irq(struct kvm *kvm,
 * now, the kvm state is still legal for probably we also have to wait
 * interrupt_work done.
 */
-	disable_irq_nosync(assigned_dev->host_irq);
-	cancel_work_sync(&assigned_dev->interrupt_work);
+	if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_MSIX) {
+		int i;
+		for (i = 0; i < assigned_dev->entries_nr; i++)
+			disable_irq_nosync(assigned_dev->
+					   host_msix_entries[i].vector);
+
+		cancel_work_sync(&assigned_dev->interrupt_work);
 
-	free_irq(assigned_dev->host_irq, (void *)assigned_dev);
+		for (i = 0; i < assigned_dev->entries_nr; i++)
+			free_irq(assigned_dev->host_msix_entries[i].vector,
+				 (void *)assigned_dev);
 
-	if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_HOST_MSI)
-		pci_disable_msi(assigned_dev->dev);
+		assigned_dev->entries_nr = 0;
+		kfree(assigned_dev->host_msix_entries);
+		kfree(assigned_dev->guest_msix_entries);
+		pci_disable_msix(assigned_dev->dev);
+	} else {
+		/* Deal with MSI and INTx */
+		disable_irq_nosync(assigned_dev->host_irq);
+		cancel_work_sync(&assigned_dev->interrupt_work);
+
+		free_irq(assigned_dev->host_irq, (void *)assigned_dev);
+
+		if (assigned_dev->irq_requested_type &
+				KVM_ASSIGNED_DEV_HOST_MSI)
+			pci_disable_msi(assigned_dev->dev);
+	}
 
 	assigned_dev->irq_requested_type = 0;
 }
@@ -373,6 +393,60 @@ static int assigned_device_update_msi(struct kvm *kvm,
 }
 #endif
 
+#ifdef __KVM_HAVE_MSIX
+static int assigned_device_update_msix(struct kvm *kvm,
+   struct kvm_assigned_dev_kernel *adev,
+   struct kvm_assigned_irq *airq)
+{
+   

[PATCH 1/4] KVM: Fix wrong usage of _IOR in assigned device interface

2009-02-18 Thread Sheng Yang
_IOR for copy_to_user and _IOW for copy_from_user...

Noticed by Avi.

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 include/linux/kvm.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 2163b3d..d742cbf 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -469,10 +469,10 @@ struct kvm_irq_routing {
_IOW(KVMIO,  0x67, struct kvm_coalesced_mmio_zone)
 #define KVM_UNREGISTER_COALESCED_MMIO \
_IOW(KVMIO,  0x68, struct kvm_coalesced_mmio_zone)
-#define KVM_ASSIGN_PCI_DEVICE _IOR(KVMIO, 0x69, \
+#define KVM_ASSIGN_PCI_DEVICE _IOW(KVMIO, 0x69, \
   struct kvm_assigned_pci_dev)
 #define KVM_SET_GSI_ROUTING   _IOW(KVMIO, 0x6a, struct kvm_irq_routing)
-#define KVM_ASSIGN_IRQ _IOR(KVMIO, 0x70, \
+#define KVM_ASSIGN_IRQ _IOW(KVMIO, 0x70, \
struct kvm_assigned_irq)
 #define KVM_REINJECT_CONTROL  _IO(KVMIO, 0x71)
 
-- 
1.5.4.5



Re: [PATCH 1/4] KVM: Fix wrong usage of _IOR in assigned device interface

2009-02-18 Thread Avi Kivity

Sheng Yang wrote:

_IOR for copy_to_user and _IOW for copy_from_user...

Noticed by Avi.

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 include/linux/kvm.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 2163b3d..d742cbf 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -469,10 +469,10 @@ struct kvm_irq_routing {
_IOW(KVMIO,  0x67, struct kvm_coalesced_mmio_zone)
 #define KVM_UNREGISTER_COALESCED_MMIO \
_IOW(KVMIO,  0x68, struct kvm_coalesced_mmio_zone)
-#define KVM_ASSIGN_PCI_DEVICE _IOR(KVMIO, 0x69, \
+#define KVM_ASSIGN_PCI_DEVICE _IOW(KVMIO, 0x69, \
   struct kvm_assigned_pci_dev)
 #define KVM_SET_GSI_ROUTING   _IOW(KVMIO, 0x6a, struct kvm_irq_routing)
-#define KVM_ASSIGN_IRQ _IOR(KVMIO, 0x70, \
+#define KVM_ASSIGN_IRQ _IOW(KVMIO, 0x70, \
struct kvm_assigned_irq)
 #define KVM_REINJECT_CONTROL  _IO(KVMIO, 0x71)
 
  


KVM_ASSIGN_PCI_DEVICE was introduced in 2.6.28.  We can't fix it since 
it's part of the ABI.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
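
To illustrate why the macro matters for the ABI: the direction is encoded in
the ioctl request number itself, so switching _IOR to _IOW changes the number
userspace must pass. The sketch below assumes Linux's <linux/ioctl.h>; the
DEMO_ names and struct demo_assigned_dev are hypothetical (the real
struct kvm_assigned_pci_dev has a different layout), and 0xAE is KVM's ioctl
type byte.

```c
#include <assert.h>
#include <linux/ioctl.h>

#define DEMO_KVMIO 0xAE /* KVM's ioctl type byte, under a demo name */

/* Hypothetical payload; the real struct kvm_assigned_pci_dev differs. */
struct demo_assigned_dev {
    unsigned int assigned_dev_id;
    unsigned int flags;
};

/* The same command number (0x69) encoded with each direction. */
#define DEMO_ASSIGN_AS_IOR _IOR(DEMO_KVMIO, 0x69, struct demo_assigned_dev)
#define DEMO_ASSIGN_AS_IOW _IOW(DEMO_KVMIO, 0x69, struct demo_assigned_dev)

/* The two encodings are different request numbers at compile time. */
static_assert(DEMO_ASSIGN_AS_IOR != DEMO_ASSIGN_AS_IOW,
              "direction bits are part of the request number");

/* Returns 1 if userspace passes data to the kernel for this request
 * (the _IOW direction, i.e. the kernel does copy_from_user), else 0. */
int ioctl_writes_to_kernel(unsigned long request)
{
    return (_IOC_DIR(request) & _IOC_WRITE) != 0;
}

/* Returns 1 if two request codes differ only in their direction bits. */
int differ_only_in_direction(unsigned long a, unsigned long b)
{
    return a != b &&
           _IOC_TYPE(a) == _IOC_TYPE(b) &&
           _IOC_NR(a) == _IOC_NR(b) &&
           _IOC_SIZE(a) == _IOC_SIZE(b);
}
```

A binary built against the old header would keep issuing the _IOR-encoded
number, which a kernel expecting only the _IOW-encoded one would not recognise.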



Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Tomasz Chmielewski

Is using cpufreq (i.e. with ondemand governor) on KVM host safe for guests?

I enabled cpufreq on the host, it scaled down the host CPU (Dual-Core 
AMD Opteron(tm) Processor 2212) to 1 GHz from 2 GHz.


Guest (using 1 CPU) was still showing that it has a 2 GHz CPU in 
/proc/cpuinfo (I guess this value is read only once, when booting).


After about 2 hours I started date on the guest - it showed that it's 
year *1953*, after which I couldn't start any other command (the guest 
was technically alive - SSH connection to it didn't die - but I couldn't 
do anything).


# date
Wed Feb 18 13:07:17 CET 2009

[let's wait ~2 hours]


# date
Fri May 15 10:13:14 CET 1953
# date
^C^Z
[could not interrupt]


Is it expected behaviour? Is it correct behaviour?


--
Tomasz Chmielewski
http://wpkg.org


Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Anthony Liguori

Tomasz Chmielewski wrote:
> Is using cpufreq (i.e. with ondemand governor) on KVM host safe for
> guests?
>
> I enabled cpufreq on the host, it scaled down the host CPU (Dual-Core
> AMD Opteron(tm) Processor 2212) to 1 GHz from 2 GHz.

Not with your processor.  Intel processors should be fine and any AMD 
processor that's Barcelona/Phenom or newer.

Regards,

Anthony Liguori

> Guest (using 1 CPU) was still showing that it has a 2 GHz CPU in
> /proc/cpuinfo (I guess this value is read only once, when booting).
>
> After about 2 hours I started date on the guest - it showed that
> it's year *1953*, after which I couldn't start any other command (the
> guest was technically alive - SSH connection to it didn't die - but I
> couldn't do anything).
>
> # date
> Wed Feb 18 13:07:17 CET 2009
>
> [let's wait ~2 hours]
>
> # date
> Fri May 15 10:13:14 CET 1953
> # date
> ^C^Z
> [could not interrupt]
>
> Is it expected behaviour? Is it correct behaviour?






Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Tomasz Chmielewski

Anthony Liguori wrote:

Tomasz Chmielewski wrote:
Is using cpufreq (i.e. with ondemand governor) on KVM host safe for 
guests?


I enabled cpufreq on the host, it scaled down the host CPU (Dual-Core 
AMD Opteron(tm) Processor 2212) to 1 GHz from 2 GHz.


Not with your processor.  Intel processors should be fine and any AMD 
processor that's Barcelona/Phenom or newer.


Looks I'm a bad, bad, anti-environment CO2 contributor then.

From a technical perspective, what are the problems with my CPU that it 
scales down on the host just fine, but makes the guests return to the 
past?



--
Tomasz Chmielewski
http://wpkg.org


Re: copyless virtio net thoughts?

2009-02-18 Thread Arnd Bergmann
On Wednesday 18 February 2009, Rusty Russell wrote:

 2) Direct NIC attachment
 This is particularly interesting with SR-IOV or other multiqueue nics,
 but for boutique cases or benchmarks, could be for normal NICs.  So
 far I have some very sketched-out patches: for the attached nic 
 dev_alloc_skb() gets an skb from the guest (which supplies them via
 some kind of AIO interface), and a branch in netif_receive_skb()
 which returned it to the guest.  This bypasses all firewalling in
 the host though; we're basically having the guest process drive
 the NIC directly.   

If this is not passing the PCI device directly to the guest, but
uses your concept, wouldn't it still be possible to use the firewalling
in the host? You can always inspect the headers, drop the frame, etc
without copying the whole frame at any point.

When it gets to the point of actually giving the (real pf or sr-iov vf)
to one guest, you really get to the point where you can't do local
firewalling any more.

 3) Direct interguest networking
 Anthony has been thinking here: vmsplice has already been mentioned.
 The idea of passing directly from one guest to another is an
 interesting one: using dma engines might be possible too.  Again,
 host can't firewall this traffic.  Simplest as a dedicated internal
 lan NIC, but we could theoretically do a fast-path for certain MAC
 addresses on a general guest NIC. 

Another option would be to use an SR-IOV adapter from multiple guests,
with a virtual ethernet bridge in the adapter. This moves the overhead
from the CPU to the bus and/or adapter, so it may or may not be a real
benefit depending on the workload.

Arnd 


Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Avi Kivity

Tomasz Chmielewski wrote:

Looks I'm a bad, bad, anti-environment CO2 contributor then.

From a technical perspective, what are the problems with my CPU that 
it scales down on the host just fine, but makes the guests return to 
the past?


What kvm version are you using?  kvm-84 should fix this.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Tomasz Chmielewski

Avi Kivity wrote:

Tomasz Chmielewski wrote:

Looks I'm a bad, bad, anti-environment CO2 contributor then.

From a technical perspective, what are the problems with my CPU that 
it scales down on the host just fine, but makes the guests return to 
the past?


What kvm version are you using?  kvm-84 should fix this.


I still have kvm-83.

Thanks for the hint, I'll see how it works.


--
Tomasz Chmielewski
http://wpkg.org




Re: Current KVM head crashes on startup

2009-02-18 Thread Avi Kivity

Brian Kress wrote:
When I try to run KVM built off the current head, it crashes with a 
Segmentation fault.  KVM-84 does

not.  Seems to be dealing with the CPUID changes:


   0x081a5c70 in host_cpuid ()
   at /home/kressb/kvm/src/qemu/target-i386/helper.c:1426
   1426asm volatile(pusha \n\t


I've pushed a fix for this.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH 3/8] kvm: qemu: fix hot assign device

2009-02-18 Thread Marcelo Tosatti


Acked-by: Marcelo Tosatti mtosa...@redhat.com

On Wed, Feb 18, 2009 at 03:12:31PM +0800, Han, Weidong wrote:
 Last qemu merge broke device assignment hotplug. Call
 qemu_pci_hot_assign_device in pci_device_hot_add for
 hot assign device, and add the command for it.
 for example hot assign 01:00.0, can use following command:
   pci_add pci_addr=auto host host=01:00.0
 
 Signed-off-by: Weidong Han weidong@intel.com
 ---
  qemu/hw/device-hotplug.c |   37 -
  qemu/hw/pci-hotplug.c|   35 +++
  qemu/monitor.c   |2 +-
  3 files changed, 36 insertions(+), 38 deletions(-)


Re: [PATCH 7/8] kvm: qemu: deassign device from guest

2009-02-18 Thread Marcelo Tosatti
Weidong,

Does this set fix

http://sourceforge.net/tracker2/?func=detail&aid=2432316&group_id=180599&atid=893831


On Wed, Feb 18, 2009 at 03:13:05PM +0800, Han, Weidong wrote:
 free_assigned_device just frees device from qemu, it should also
 deassign the device from guest when guest exits or hot remove
 assigned device.
 
 Acked-by: Mark McLoughlin mar...@redhat.com
 Signed-off-by: Weidong Han weidong@intel.com
 ---
  qemu/hw/device-assignment.c |   28 ++--
  qemu/hw/device-assignment.h |1 +
  2 files changed, 27 insertions(+), 2 deletions(-)
 


Re: [PATCH 1/3] kvm mmu: handle compound pages in kvm_is_mmio_pfn

2009-02-18 Thread Marcelo Tosatti


BTW some page bits are erroneously transferred to the struct page's
within the compound page. We've got away with that so far because
these bits (such as dirty and accessed) are not used by the limited
hugetlb/hugetlbfs implementation ATM.

Acked-by: Marcelo Tosatti mtosa...@redhat.com

On Wed, Feb 18, 2009 at 02:08:58PM +0100, Joerg Roedel wrote:
 The function kvm_is_mmio_pfn is called before put_page is called on a
 page by KVM. This is a problem when this function is called on some
 struct page which is part of a compound page. It does not test the
 reserved flag of the compound page but of the struct page within the
 compound page. This is a problem when KVM works with hugepages allocated
 at boot time. These pages have the reserved bit set in all tail pages.
 Only the flag in the compound head is cleared. KVM would not put such a
 page which results in a memory leak.
 
 Signed-off-by: Joerg Roedel joerg.roe...@amd.com
 ---
  virt/kvm/kvm_main.c |6 --
  1 files changed, 4 insertions(+), 2 deletions(-)
 
 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index 266bdaf..0ed662d 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -535,8 +535,10 @@ static inline int valid_vcpu(int n)
  
  inline int kvm_is_mmio_pfn(pfn_t pfn)
  {
 - if (pfn_valid(pfn))
 - return PageReserved(pfn_to_page(pfn));
 + if (pfn_valid(pfn)) {
 + struct page *page = compound_head(pfn_to_page(pfn));
 + return PageReserved(page);
 + }
  
   return true;
  }
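
The effect of the change can be modelled in user space along these lines. This
is a sketch only: struct demo_page and the demo_/is_mmio_ functions are
hypothetical simplifications, not the kernel's struct page, compound_head() or
PageReserved().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the kernel's struct page. */
struct demo_page {
    bool reserved;          /* models the reserved bit */
    struct demo_page *head; /* NULL for a head page or an ordinary page */
};

/* Models compound_head(): a tail page resolves to its head page. */
struct demo_page *demo_compound_head(struct demo_page *page)
{
    assert(page != NULL);
    return page->head ? page->head : page;
}

/* Old check: tests the flag of whichever struct page the pfn maps to. */
bool is_mmio_old(struct demo_page *page)
{
    assert(page != NULL);
    return page->reserved;
}

/* Patched check: tests the flag of the compound head instead. */
bool is_mmio_new(struct demo_page *page)
{
    return demo_compound_head(page)->reserved;
}
```

With a boot-time hugepage modelled as a head with the bit cleared and a tail
with the bit set, the old check classifies the tail as MMIO (so the page would
never be put), while the patched check does not.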



Re: [PATCH 2/3] kvm mmu: remove redundant check in mmu_set_spte

2009-02-18 Thread Marcelo Tosatti

The following code flow is unnecessary:

	if (largepage)
		was_rmapped = is_large_pte(*shadow_pte);
	else
		was_rmapped = 1;

The is_large_pte() function will always evaluate to one here because the
(largepage && !is_large_pte) case is already handled in the first
if-clause. So we can remove this check and set was_rmapped to one always
here.

Signed-off-by: Joerg Roedel joerg.roe...@amd.com
Acked-by: Marcelo Tosatti mtosa...@redhat.com

---
 arch/x86/kvm/mmu.c |8 ++--
 1 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index ef060ec..c90b4b2 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1791,12 +1791,8 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 			pgprintk("hfn old %lx new %lx\n",
 				 spte_to_pfn(*shadow_pte), pfn);
 			rmap_remove(vcpu->kvm, shadow_pte);
-   } else {
-   if (largepage)
-   was_rmapped = is_large_pte(*shadow_pte);
-   else
-   was_rmapped = 1;
-   }
+   } else
+   was_rmapped = 1;
}
if (set_spte(vcpu, shadow_pte, pte_access, user_fault, write_fault,
  dirty, largepage, global, gfn, pfn, speculative, true)) {
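
The reasoning in the commit message can be checked by brute force over the four
possible inputs. The sketch below is a user-space model, not the kernel code:
the function names are hypothetical, and only the first if-clause named in the
commit message is modelled as the guard for reaching the removed branch.

```c
#include <assert.h>
#include <stdbool.h>

/* was_rmapped as computed by the removed else-branch. */
int was_rmapped_old(bool largepage, bool is_large_pte)
{
    if (largepage)
        return is_large_pte;
    return 1;
}

/* Returns true if the removed branch can be reached for this input, i.e.
 * the earlier (largepage && !is_large_pte) clause did not already take it. */
bool reaches_else_branch(bool largepage, bool is_large_pte)
{
    return !(largepage && !is_large_pte);
}

/* Walks all four input combinations and asserts that the old computation
 * yields 1 on every reachable one. Returns the number of reachable
 * combinations that were checked. */
int check_simplification(void)
{
    int checked = 0;

    for (int lp = 0; lp <= 1; lp++) {
        for (int large = 0; large <= 1; large++) {
            if (!reaches_else_branch(lp, large))
                continue;
            assert(was_rmapped_old(lp, large) == 1);
            checked++;
        }
    }
    return checked;
}
```

The only input for which the old computation gives 0 is largepage set with a
non-large pte, which is exactly the case filtered out earlier.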


Re: [PATCH 3/3] kvm mmu: alloc shadow pages with __GFP_ZERO

2009-02-18 Thread Marcelo Tosatti
On Wed, Feb 18, 2009 at 02:54:37PM +0100, Joerg Roedel wrote:
  Adding __GFP_ZERO here will cause us to clear the page twice, which is 
  wasteful.
 
 The assertion which the attached patch removes fails sometimes. Removing
 this assertion is the alternative solution to this problem ;-)
 

From: Joerg Roedel joerg.roe...@amd.com
Date: Wed, 18 Feb 2009 14:51:13 +0100
Subject: [PATCH] kvm mmu: remove assertion in kvm_mmu_alloc_page

Signed-off-by: Joerg Roedel joerg.roe...@amd.com
Acked-by: Marcelo Tosatti mtosa...@redhat.com

---
 arch/x86/kvm/mmu.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)
 
 diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
 index d93ecec..b226973 100644
 --- a/arch/x86/kvm/mmu.c
 +++ b/arch/x86/kvm/mmu.c
 @@ -802,7 +802,6 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
 	list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
 	INIT_LIST_HEAD(&sp->oos_link);
 -	ASSERT(is_empty_shadow_page(sp->spt));
 	bitmap_zero(sp->slot_bitmap, KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS);
 	sp->multimapped = 0;
 	sp->parent_pte = parent_pte;


Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Marcelo Tosatti
On Wed, Feb 18, 2009 at 03:51:22PM +0100, Tomasz Chmielewski wrote:
 Is using cpufreq (i.e. with ondemand governor) on KVM host safe for guests?

 I enabled cpufreq on the host, it scaled down the host CPU (Dual-Core  
 AMD Opteron(tm) Processor 2212) to 1 GHz from 2 GHz.

 Guest (using 1 CPU) was still showing that it has a 2 GHz CPU in  
 /proc/cpuinfo (I guess this value is read only once, when booting).

 After about 2 hours I started date on the guest - it showed that it's  
 year *1953*, after which I couldn't start any other command (the guest  
 was technically alive - SSH connection to it didn't die - but I couldn't  
 do anything).

 # date
 Wed Feb 18 13:07:17 CET 2009

 [let's wait ~2 hours]


 # date
 Fri May 15 10:13:14 CET 1953
 # date
 ^C^Z
 [could not interrupt]


 Is it expected behaviour? Is it correct behaviour?

What's the output of /proc/cpuinfo on the host? Does it contain the
constant_tsc flag?

What's the output of
/sys/devices/system/clocksource/clocksource0/current_clocksource
on the guest?
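
For anyone who wants to script the first check, a minimal sketch: given the
text of a "flags" line from /proc/cpuinfo, test whether a flag is present as a
whole word (so that "tsc" does not match inside "rdtscp" or "constant_tsc").
has_cpu_flag() is a hypothetical helper written for this note, not an existing
API; reading the file itself is left out.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if `flag` appears as a whole whitespace-separated word in
 * `flags` (the text after "flags :" in /proc/cpuinfo), 0 otherwise. */
int has_cpu_flag(const char *flags, const char *flag)
{
    assert(flags != NULL && flag != NULL);

    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return 0;

    while ((p = strstr(p, flag)) != NULL) {
        /* A real match must start and end on a word boundary. */
        int starts_word = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        int ends_word = p[len] == '\0' || p[len] == ' ' ||
                        p[len] == '\t' || p[len] == '\n';
        if (starts_word && ends_word)
            return 1;
        p += len; /* keep searching after this partial match */
    }
    return 0;
}
```

On a shell the same question is usually answered with
grep -w constant_tsc /proc/cpuinfo.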




Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Tomasz Chmielewski

Marcelo Tosatti wrote:

On Wed, Feb 18, 2009 at 03:51:22PM +0100, Tomasz Chmielewski wrote:

Is using cpufreq (i.e. with ondemand governor) on KVM host safe for guests?

I enabled cpufreq on the host, it scaled down the host CPU (Dual-Core  
AMD Opteron(tm) Processor 2212) to 1 GHz from 2 GHz.


Guest (using 1 CPU) was still showing that it has a 2 GHz CPU in  
/proc/cpuinfo (I guess this value is read only once, when booting).


After about 2 hours I started date on the guest - it showed that it's  
year *1953*, after which I couldn't start any other command (the guest  
was technically alive - SSH connection to it didn't die - but I couldn't  
do anything).


# date
Wed Feb 18 13:07:17 CET 2009

[let's wait ~2 hours]


# date
Fri May 15 10:13:14 CET 1953
# date
^C^Z
[could not interrupt]


Is it expected behaviour? Is it correct behaviour?


Whats the output of /proc/cpuinfo on the host? Does it contain the
constant_tsc flag?


It doesn't contain this flag.
/proc/cpuinfo output - below.



Whats the output of
/sys/devices/system/clocksource/clocksource0/current_clocksource
on the guest?


#  cat /sys/devices/system/clocksource/clocksource0/*
hpet acpi_pm jiffies tsc - available
hpet - current



# cat /proc/cpuinfo   
processor   : 0  
vendor_id   : AuthenticAMD   
cpu family  : 15 
model   : 65 
model name  : Dual-Core AMD Opteron(tm) Processor 2212   
stepping: 2  
cpu MHz : 2000.000   
cache size  : 1024 KB
physical id : 0  
siblings: 2  
core id : 0  
cpu cores   : 2  
fpu : yes
fpu_exception   : yes
cpuid level : 1  
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips: 3993.20
TLB size: 1024 4K pages  
clflush size: 64 
cache_alignment : 64 
address sizes   : 40 bits physical, 48 bits virtual  
power management: ts fid vid ttp tm stc  


processor   : 1
vendor_id   : AuthenticAMD
cpu family  : 15  
model   : 65  
model name  : Dual-Core AMD Opteron(tm) Processor 2212
stepping: 2   
cpu MHz : 2000.000
cache size  : 1024 KB 
physical id : 0   
siblings: 2   
core id : 1   
cpu cores   : 2   
fpu : yes 
fpu_exception   : yes 
cpuid level : 1   
wp  : yes 
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips: 3993.20
TLB size: 1024 4K pages  
clflush size: 64 
cache_alignment : 64 
address sizes   : 40 bits physical, 48 bits virtual

Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Marcelo Tosatti
On Wed, Feb 18, 2009 at 07:53:11PM +0100, Tomasz Chmielewski wrote:
 processor   : 2
 vendor_id   : AuthenticAMD
 cpu family  : 15
 model   : 65
 model name  : Dual-Core AMD Opteron(tm) Processor 2212
 stepping: 2
 cpu MHz : 2000.000
 cache size  : 1024 KB
 physical id : 1
 siblings: 2
 core id : 0
 cpu cores   : 2
 fpu : yes
 fpu_exception   : yes
 cpuid level : 1
 wp  : yes
 flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
 cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp 
 lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
 bogomips: 3993.20
 TLB size: 1024 4K pages
 clflush size: 64
 cache_alignment : 64
 address sizes   : 40 bits physical, 48 bits virtual
 power management: ts fid vid ttp tm stc

kvm-84 as mentioned. Sorry. 


Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Tomasz Chmielewski

Marcelo Tosatti wrote:


flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 
3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips: 3993.20
TLB size: 1024 4K pages
clflush size: 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc


kvm-84 as mentioned. Sorry. 


It's OK as long as it will work.

- will Windows guests work?

- what CPU frequency will the guests show? Current host frequency? Host 
frequency from the moment the guest booted (i.e. right now the guest 
will show 1GHz even if the host is running at 2GHz, or the way around)?



--
Tomasz Chmielewski
http://wpkg.org



Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Marcelo Tosatti
On Wed, Feb 18, 2009 at 08:07:48PM +0100, Tomasz Chmielewski wrote:
 Marcelo Tosatti wrote:

 flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
 cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt 
 rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic 
 cr8_legacy
 bogomips: 3993.20
 TLB size: 1024 4K pages
 clflush size: 64
 cache_alignment : 64
 address sizes   : 40 bits physical, 48 bits virtual
 power management: ts fid vid ttp tm stc

 kvm-84 as mentioned. Sorry. 

 It's OK as long as it will work.

 - will Windows guests work?

They should. If they don't, please report.

 - what CPU frequency will the guests show? Current host frequency? Host  
 frequency from the moment the guest booted (i.e. right now the guest  
 will show 1GHz even if the host is running at 2GHz, or the way around)?

Host frequency from the moment the guest booted, since the guest does
not receive frequency change notifications.
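
The answer above can be illustrated with a small sketch. It assumes that the guest calibrates its CPU frequency once at boot by counting TSC ticks over a known interval, and that the TSC rate follows the host core clock (no constant TSC). The helper names are hypothetical and this is not KVM or kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: estimate the CPU frequency in kHz from a TSC delta
 * measured over a reference interval given in microseconds. */
static uint64_t tsc_khz_from_sample(uint64_t tsc_start, uint64_t tsc_end,
                                    uint64_t interval_us)
{
    assert(interval_us > 0 && tsc_end >= tsc_start);
    /* ticks per microsecond * 1000 = ticks per millisecond = kHz */
    return (tsc_end - tsc_start) * 1000 / interval_us;
}

/* Hypothetical helper: elapsed microseconds the guest derives from a TSC
 * delta, using the frequency it calibrated at boot. */
static uint64_t guest_elapsed_us(uint64_t tsc_delta, uint64_t boot_khz)
{
    assert(boot_khz > 0);
    return tsc_delta * 1000 / boot_khz;
}
```

If the guest calibrated at 1 GHz and the host later runs at 2 GHz, 2,000,000 ticks pass in one real millisecond, but guest_elapsed_us(2000000, 1000000) reports 2000 microseconds. Without a notification the guest keeps using the stale boot-time value.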



Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Tomasz Chmielewski

Marcelo Tosatti wrote:

- what CPU frequency will the guests show? Current host frequency? Host  
frequency from the moment the guest booted (i.e. right now the guest  
will show 1GHz even if the host is running at 2GHz, or the way around)?


Host frequency from the moment the guest booted, since the guest does
not receive frequency change notifications.


Is it possible (or is it planned) to pass frequency to the guest (the 
one which is displayed in /proc/cpuinfo)?


Someone may feel disappointed to see his/her brand new virtual guest has 
a CPU with so few MHz advertised in /proc/cpuinfo.



--
Tomasz Chmielewski
http://wpkg.org



Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Marcelo Tosatti
On Wed, Feb 18, 2009 at 08:18:50PM +0100, Tomasz Chmielewski wrote:
 Marcelo Tosatti wrote:

 - what CPU frequency will the guests show? Current host frequency? 
 Host  frequency from the moment the guest booted (i.e. right now the 
 guest  will show 1GHz even if the host is running at 2GHz, or the way 
 around)?

 Host frequency from the moment the guest booted, since the guest does
 not receive frequency change notifications.

 Is it possible (or is it planned) to pass frequency to the guest (the  
 one which is displayed in /proc/cpuinfo)?

Possible, not planned AFAIK.

 Someone may feel disappointed to see his/her brand new virtual guest has  
 a CPU with so few MHz advertised in /proc/cpuinfo.

That's a point.



Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Tomasz Chmielewski

Marcelo Tosatti wrote:

On Wed, Feb 18, 2009 at 08:18:50PM +0100, Tomasz Chmielewski wrote:

Marcelo Tosatti wrote:

- what CPU frequency will the guests show? Current host frequency? 
Host  frequency from the moment the guest booted (i.e. right now the 
guest  will show 1GHz even if the host is running at 2GHz, or the way 
around)?

Host frequency from the moment the guest booted, since the guest does
not receive frequency change notifications.
Is it possible (or is it planned) to pass frequency to the guest (the  
one which is displayed in /proc/cpuinfo)?


Possible, not planned AFAIK.


Possible, right now? How?


--
Tomasz Chmielewski
http://wpkg.org


Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Marcelo Tosatti
On Wed, Feb 18, 2009 at 09:02:31PM +0100, Tomasz Chmielewski wrote:
 Marcelo Tosatti wrote:
 On Wed, Feb 18, 2009 at 08:18:50PM +0100, Tomasz Chmielewski wrote:
 Marcelo Tosatti wrote:

 - what CPU frequency will the guests show? Current host 
 frequency? Host  frequency from the moment the guest booted (i.e. 
 right now the guest  will show 1GHz even if the host is running 
 at 2GHz, or the way around)?
 Host frequency from the moment the guest booted, since the guest does
 not receive frequency change notifications.
 Is it possible (or is it planned) to pass frequency to the guest (the 
  one which is displayed in /proc/cpuinfo)?

 Possible, not planned AFAIK.

 Possible, right now? How?

Write a paravirt notification scheme.



Re: Houston, we have May 15, 1953 (says guest when host uses cpufreq, and dies)

2009-02-18 Thread Anthony Liguori

Avi Kivity wrote:

Tomasz Chmielewski wrote:

Looks I'm a bad, bad, anti-environment CO2 contributor then.

From a technical perspective, what are the problems with my CPU that 
it scales down on the host just fine, but makes the guests return to 
the past?


What kvm version are you using?  kvm-84 should fix this.


Are you suggesting that one should use cpufreq on a CPU without a 
constant tsc?  Isn't this just asking for trouble?


Regards,

Anthony Liguori



[ kvm-Bugs-2556746 ] FreeBSD/PC-BSD text screen corruption

2009-02-18 Thread SourceForge.net
Bugs item #2556746, was opened at 2009-02-02 13:19
Message generated for change (Comment added) made by aurel32
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2556746&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: intel
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Tim Knowles (knowlet)
Assigned to: Nobody/Anonymous (nobody)
Summary: FreeBSD/PC-BSD text screen corruption 

Initial Comment:
Using either kvm-83, kvm-82 or kvm-81 I am unable to install FreeBSD or PC-BSD
due to screen corruption (screenshot attached).  The initial boot menu is shown
and is legible.  Once you have selected the boot option and the boot process
continues, the screen becomes corrupted.  I initially discovered the problem
when setting up an LVM-backed guest in virt-manager, but I have attached a
minimal command line below that allows you to trigger it.

1) It would appear that this problem was introduced in kvm-81 (kvm-80 does not 
exhibit the problem with FBSD or PCBSD but I have not tested any other versions 
of kvm)
2) If I use the -no-kvm switch with KVM-83 this problem does not occur.

Details:
Host: 1 x Intel Core i7 920, Fedora 10 64bit. 6GB memory (Dell Studio XPS 435)
kvm-83: self compiled - gcc version 4.3.2 20081105 (Red Hat 4.3.2-7)
cmd line:  /usr/local/bin/qemu-system-x86_64 -m 512 -cdrom 
7.1-RELEASE-amd64-dvd1.iso

Guests:
FreeBSD 7.1
PC-BSD 7.0.2

PS: I'd also like to add my thanks for creating KVM, it's a fabulous tool. Many
thanks

--

Comment By: Aurelien Jarno (aurel32)
Date: 2009-02-18 22:38

Message:
This is fixed in revision 6628 of QEMU, so the fix will probably land in KVM
soon. Using any of the workarounds suggested below is a bad idea, as the screen
is probably not the only thing affected by this bug, which means that some
data can be corrupted.

--

Comment By: Radek Hladik (kedarius)
Date: 2009-02-04 20:06

Message:
Confirming the problem too. 
kvm-83-2.fc11.x86_64
libvirt-0.6.0-1.fc11.x86_64
virt-manager-0.6.1-1.fc11.x86_64
qemu-0.9.1-12.fc11.x86_64

For libvirt and virt-manager users, here is how to use the workaround
mentioned by toxxic:
Press 6 in the boot menu, type
set console=comconsole
then open the serial console from the View menu and type
boot
(choose xterm as the terminal type)



--

Comment By: Jeff (toxxic)
Date: 2009-02-04 08:57

Message:

I can confirm this happens, when using VNC for the console.

Here's a workaround:

Start kvm with a -serial flag.  You're going to use it as a serial
console.
qemu-system-x86_64 -serial telnet::2226,server,nowait -cdrom
7.1-RELEASE-amd64-disc1.iso [...]

Then connect to port 2226:
telnet localhost 2226

Then boot the FreeBSD CD, and when the (legible) boot loader comes up,
choose 6. Escape to loader prompt

At the OK prompt, type:  set console=comconsole

The OK prompt will now appear in your telnet session.  Type boot and hit
return.  Continue with legible FreeBSD install via your telnet session.

You may want to set up a serial console on the FreeBSD system that you
installed, as well.

--



Re: [PATCH 01/02] ia64: Move the macro definitions related to MSI to one header file.

2009-02-18 Thread Marcelo Tosatti

Looks good, should go through Tony's tree I believe.

On Wed, Feb 18, 2009 at 10:17:56AM +0800, Zhang, Xiantao wrote:
 Thanks, Tony! It should not break anything due to no changes about code 
 logic. :)
 
 Avi, 
   Could you help to commit the patches with Tony's Ack ? Thanks!
 Xiantao
 
 
 Luck, Tony wrote:
  To support kvm's MSI, we have to move some macros
  out of ia64_msi.c to avoid duplicating them. In addition,
  to keep them consistent with x86's, I also changed some
  macros' names.  What do you think of the patch?  If you
  agree to the changes, could you add your Signed-off-by to
  the patch, so Avi can check it in to kvm.git first to
  fix an urgent build issue for kvm/ia64.   Thanks!
  
  Looks OK to me (I didn't test it, or even build it ... so I hope you
  did!). 
  
  Acked-by: Tony Luck tony.l...@intel.com
 


Re: Recent kvm and vmware server comparisons?

2009-02-18 Thread Thomas Fjellstrom
On Wednesday 18 February 2009, Martin Maurer wrote:
  I suppose no-one has any?

 VMware includes in its EULA (End User License Agreement) a prohibition for
 any licensee to publish benchmark results without VMware's approval. (see
 https://www.vmware.com/tryvmware/eula.php)

 Maybe this is a reason why all published VMware benchmarks look quite
 similar :-)

 I would love to see a comparison, but due to these restrictions it's hard to
 get independent results.

 Br, Martin


I hardly think it stops people from casually talking about their day-to-day
experiences with VMware and how kvm matches up to it. And even if it did, it
doesn't sound like something that's actually legally binding. Otherwise I could
start putting things like "YOU MUST NEVER TALK AGAIN" in my EULAs.

-- 
Thomas Fjellstrom
tfjellst...@shaw.ca


Re: copyless virtio net thoughts?

2009-02-18 Thread Simon Horman
On Wed, Feb 18, 2009 at 10:08:00PM +1030, Rusty Russell wrote:
 
 2) Direct NIC attachment This is particularly interesting with SR-IOV or
 other multiqueue nics, but for boutique cases or benchmarks, could be for
 normal NICs.  So far I have some very sketched-out patches: for the
 attached nic dev_alloc_skb() gets an skb from the guest (which supplies
 them via some kind of AIO interface), and a branch in netif_receive_skb()
 which returned it to the guest.  This bypasses all firewalling in the
 host though; we're basically having the guest process drive the NIC
 directly.

Hi Rusty,

Can I clarify that the idea with utilising SR-IOV would be to assign
virtual functions to guests? That is, something conceptually similar to
PCI pass-through in Xen (although I'm not sure that anyone has virtual
function pass-through working yet). If so, wouldn't this also be useful
on machines that have multiple NICs?

-- 
Simon Horman
  VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
  H: www.vergenet.net/~horms/ W: www.valinux.co.jp/en



RE: copyless virtio net thoughts?

2009-02-18 Thread Dong, Eddie
Simon Horman wrote:
 On Wed, Feb 18, 2009 at 10:08:00PM +1030, Rusty Russell
 wrote: 
 
 2) Direct NIC attachment This is particularly
 interesting with SR-IOV or other multiqueue nics, but
 for boutique cases or benchmarks, could be for normal
 NICs.  So far I have some very sketched-out patches: for
 the attached nic dev_alloc_skb() gets an skb from the
 guest (which supplies them via some kind of AIO
 interface), and a branch in netif_receive_skb() which
 returned it to the guest.  This bypasses all firewalling
 in the host though; we're basically having the guest
 process drive the NIC directly.  
 
 Hi Rusty,
 
 Can I clarify that the idea with utilising SR-IOV would
 be to assign virtual functions to guests? That is,
 something conceptually similar to PCI pass-through in Xen
 (although I'm not sure that anyone has virtual function
 pass-through working yet). If so, wouldn't this also be
 useful on machines that have multiple NICs? 
 
Yes, and we have successfully run it with a VF assigned to a guest in both Xen &
KVM, but we are still working on pushing those patches out since they need
Linux PCI subsystem support & driver support.

Thx, eddie


Re: Running KVM on a Laptop

2009-02-18 Thread TJ
On Wed, 2009-02-18 at 08:45 +0100, Louis-David Mitterrand wrote:
  Is it not as simple as checking for the svm or vt flags?
 
 No, one must also check that the bios allows enabling virtualization
 support. My sony laptop has the right processor but no bios option.
 Check that first!
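
The flag check discussed in the quoted text can be sketched as below. On Linux the /proc/cpuinfo flag names are vmx (Intel VT-x) and svm (AMD-V). The function names are hypothetical, and as the quoted reply points out, a positive result only shows that the CPU has the feature, not that the BIOS leaves it enabled.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if `flag` appears as a whole whitespace-separated word in `line`. */
static bool has_cpu_flag(const char *line, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = line;

    if (n == 0)
        return false;
    while ((p = strstr(p, flag)) != NULL) {
        bool starts = (p == line) || p[-1] == ' ' || p[-1] == '\t' || p[-1] == ':';
        bool ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';
        if (starts && ends)
            return true;
        p += n;   /* keep searching after a partial match such as "svmx" */
    }
    return false;
}

/* Hypothetical wrapper: does a "flags" line advertise hardware virtualization? */
static bool cpu_advertises_virt(const char *flags_line)
{
    return has_cpu_flag(flags_line, "vmx") || has_cpu_flag(flags_line, "svm");
}
```

A caller would read /proc/cpuinfo line by line and pass each line that starts with "flags" to cpu_advertises_virt().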

Louis-David, I just noticed your comment in passing and thought I'd let
you (and others with a Sony Vaio) know that it is possible to enable the
VT option in NVRAM even though the BIOS set-up menu doesn't support it.

I did it with this Vaio VGN-FE41Z with a T7200 CPU back in mid 2007 and
have not had to redo it since. I sometimes run several instances of KVM on
it.

The Phoenix BIOS does support storing the VT-enable setting in NVRAM and
setting the VT-enable bits through MSR 0x3A at boot time.
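
For reference, MSR 0x3A is IA32_FEATURE_CONTROL. A minimal sketch of decoding its value follows, assuming the bit layout documented by Intel (bit 0 lock, bit 1 VMX inside SMX, bit 2 VMX outside SMX). The helper name is hypothetical and nothing here is specific to the Phoenix BIOS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout of IA32_FEATURE_CONTROL (MSR 0x3A) as documented by Intel. */
#define FEATURE_CONTROL_LOCKED        (1ULL << 0)
#define FEATURE_CONTROL_VMXON_IN_SMX  (1ULL << 1)
#define FEATURE_CONTROL_VMXON_OUT_SMX (1ULL << 2)

/* Hypothetical helper: VT-x is unusable until the next reset when the
 * firmware locked the MSR while leaving "VMX outside SMX" clear. */
static bool vmx_disabled_by_bios(uint64_t msr)
{
    return (msr & FEATURE_CONTROL_LOCKED) &&
           !(msr & FEATURE_CONTROL_VMXON_OUT_SMX);
}
```

On Linux the value can be read as root through the msr driver with an 8-byte read at offset 0x3A from /dev/cpu/0/msr. A value of 0x5 means locked with VT-x enabled, while 0x1 means locked with VT-x disabled.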

I was hoping to create a Linux tool to make the NVRAM change, but a
one-shot solution proved impractical because:

 * each BIOS version uses a different token number to store the
   VT-enable BIOS setting
 * to identify the token number you have to examine the BIOS executable
   code
 * currently the only 'safe' way to set the token in NVRAM is to use the
   DOS symcmos.exe utility (from Phoenix)

So it is a case of doing it on a per-BIOS-version basis.

If you want to email me off-list with the precise Sony model-number and
BIOS revision I should be able to help you enable the VT bit.

For some highly technical background see:

http://tjworld.net/wiki/Sony/Vaio/FE41Z/HackingBiosNvram



RE: [PATCH 7/8] kvm: qemu: deassign device from guest

2009-02-18 Thread Han, Weidong
Marcelo Tosatti wrote:
 Weidong,
 
 Does this set fix
 
 http://sourceforge.net/tracker2/?func=detail&aid=2432316&group_id=180599&atid=893831
 

I found that the above bug was already gone even without my patch. I guess it was fixed by
Mark:

commit: 02874f4272b6787ff94ee7256ef083257b9d1eb1
Author: Mark McLoughlin mar...@redhat.com
Date:   Fri Nov 28 17:10:47 2008 +

kvm: qemu: device-assignment: free device if hotplug fails

Signed-off-by: Mark McLoughlin mar...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com


Actually, my patch just moves free_assigned_device into init_assigned_device, 
no functional change. But I updated the patch to also call free_assigned_device 
when pci_register_device fails in init_assigned_device, because adev is 
allocated by qemu_mallocz in add_assigned_device.
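
The change described above is the usual single-exit cleanup idiom in C. A minimal standalone sketch follows. The structure, the function names and the fail_step parameter are hypothetical stand-ins for illustration and are not the QEMU code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for AssignedDevInfo; not the QEMU structure. */
struct dev_info {
    char *name;
    int *regions;
};

static int dev_info_frees;   /* counts cleanups so the behaviour is observable */

static void free_dev_info(struct dev_info *info)
{
    free(info->regions);
    free(info->name);
    free(info);
    dev_info_frees++;
}

/*
 * Takes ownership of info. Returns info on success; on failure frees it
 * and returns NULL, so callers must not free it again.
 * fail_step (1 or 2) simulates a failing step; 0 means no simulated failure.
 */
static struct dev_info *init_dev_info(struct dev_info *info, int fail_step)
{
    info->name = malloc(8);
    if (info->name == NULL || fail_step == 1)
        goto out;
    strcpy(info->name, "dev0");

    info->regions = calloc(4, sizeof(*info->regions));
    if (info->regions == NULL || fail_step == 2)
        goto out;

    return info;

out:
    free_dev_info(info);   /* single cleanup point for every error path */
    return NULL;
}
```

The caller allocates info zero-initialized (for example with calloc), which is what makes it safe for the cleanup function to free fields that were never set. Moving the cleanup into the init function means every caller gets it, which is why the hot-plug caller in the patch can drop its own call.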

From ce48b0d6c636d8f49bc5977d1d144fa047273846 Mon Sep 17 00:00:00 2001
From: Weidong Han weidong@intel.com
Date: Thu, 19 Feb 2009 10:49:30 +0800
Subject: [PATCH] kvm: qemu: free device on error in init_assigned_device

Make init_assigned_device call free_assigned_device on error,
and make free_assigned_device static because it's only
invoked in device-assignment.c.

Acked-by: Mark McLoughlin mar...@redhat.com
Signed-off-by: Weidong Han weidong@intel.com
---
 qemu/hw/device-assignment.c |   14 +-
 qemu/hw/device-assignment.h |1 -
 qemu/hw/pci-hotplug.c   |1 -
 3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index e6d2352..0b96ee4 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -443,7 +443,7 @@ again:
 
 static LIST_HEAD(, AssignedDevInfo) adev_head;
 
-void free_assigned_device(AssignedDevInfo *adev)
+static void free_assigned_device(AssignedDevInfo *adev)
 {
     AssignedDevice *dev = adev->assigned_dev;
 
@@ -550,7 +550,7 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus)
     if (NULL == dev) {
         fprintf(stderr, "%s: Error: Couldn't register real device %s\n",
                 __func__, adev->name);
-        return NULL;
+        goto out;
     }
 
     adev->assigned_dev = dev;
@@ -558,14 +558,14 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus)
     if (get_real_device(dev, adev->bus, adev->dev, adev->func)) {
         fprintf(stderr, "%s: Error: Couldn't get real device (%s)!\n",
                 __func__, adev->name);
-        return NULL;
+        goto out;
     }
 
     /* handle real device's MMIO/PIO BARs */
     if (assigned_dev_register_regions(dev->real_device.regions,
                                       dev->real_device.region_number,
                                       dev))
-        return NULL;
+        goto out;
 
     /* handle interrupt routing */
     e_device = (dev->dev.devfn >> 3) & 0x1f;
@@ -595,10 +595,14 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus)
     if (r < 0) {
         fprintf(stderr, "Failed to assign device \"%s\" : %s\n",
                 adev->name, strerror(-r));
-        return NULL;
+        goto out;
     }
 
     return &dev->dev;
+
+out:
+    free_assigned_device(adev);
+    return NULL;
 }
 
 /*
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index f216bb0..6a9b9fa 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -94,7 +94,6 @@ struct AssignedDevInfo {
     int disable_iommu;
 };
 
-void free_assigned_device(AssignedDevInfo *adev);
 PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus);
 AssignedDevInfo *add_assigned_device(const char *arg);
 void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices);
diff --git a/qemu/hw/pci-hotplug.c b/qemu/hw/pci-hotplug.c
index 8c76453..65fafd1 100644
--- a/qemu/hw/pci-hotplug.c
+++ b/qemu/hw/pci-hotplug.c
@@ -143,7 +143,6 @@ static PCIDevice *qemu_pci_hot_assign_device(PCIBus *pci_bus, const char *opts)
     ret = init_assigned_device(adev, pci_bus);
     if (ret == NULL) {
         term_printf("Failed to assign device\n");
-        free_assigned_device(adev);
         return NULL;
     }
 
-- 
1.6.0.4
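
The interrupt-routing context line in the patch takes the device (slot) number out of devfn. A minimal sketch of that encoding follows, assuming the standard PCI layout of a 5-bit device number above a 3-bit function number. The helper names are hypothetical and are not taken from QEMU.

```c
#include <stdint.h>

/* PCI devfn packs a 5-bit device (slot) number and a 3-bit function number. */
static inline uint8_t pci_slot(uint8_t devfn)
{
    return (devfn >> 3) & 0x1f;
}

static inline uint8_t pci_func(uint8_t devfn)
{
    return devfn & 0x07;
}

static inline uint8_t pci_devfn(uint8_t slot, uint8_t func)
{
    return (uint8_t)(((slot & 0x1f) << 3) | (func & 0x07));
}
```

For example, devfn 0x1a decodes to slot 3, function 2.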



 
 On Wed, Feb 18, 2009 at 03:13:05PM +0800, Han, Weidong wrote:
 free_assigned_device just frees device from qemu, it should also
 deassign the device from guest when guest exits or hot remove
 assigned device. 
 
 Acked-by: Mark McLoughlin mar...@redhat.com
 Signed-off-by: Weidong Han weidong@intel.com
 ---
  qemu/hw/device-assignment.c |   28 ++--
  qemu/hw/device-assignment.h |1 +
  2 files changed, 27 insertions(+), 2 deletions(-)



0003-kvm-qemu-free-device-on-error-in-init_assigned_dev-v2.patch
Description: 0003-kvm-qemu-free-device-on-error-in-init_assigned_dev-v2.patch


Re: With -vnc option, can I still use ctrl+alt + n?

2009-02-18 Thread Tomasz Chmielewski

Neo Jia wrote:

hi,

I am trying kvm-84 and with -vnc option I can't use ctrl + alt + n
key to get the qemu system console. Is there anyway to make this work?


Use the QEMU/KVM monitor and its sendkey command.

For example:

sendkey alt-f3


--
Tomasz Chmielewski
http://wpkg.org


Re: kvm on G4 processors?

2009-02-18 Thread Jimi Xenidis


On Feb 18, 2009, at 3:21 AM, Alexander Graf wrote:



On 17.02.2009, at 09:32, Liu Yu-B13201 yu@freescale.com wrote:





-Original Message-
From: kvm-ppc-ow...@vger.kernel.org
[mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Roberto  
Innocenti

Sent: Tuesday, February 17, 2009 4:26 PM
To: kvm-ppc@vger.kernel.org
Subject: kvm on G4 processors?

I have tried to compile kernel 2.6.27 with kvm support on my PowerBook
G4, but the kvm option is not visible because the kernel menu config only
permits compiling the kvm kernel module if you have a PowerPC 440
architecture and not a G4.
Does kvm really not work on G4 processors? Is the architecture so different?
If kvm does work, how do I compile the kernel module on my G4?



I'm afraid that KVM does not support G4 at present.
440 belongs to the BookE architecture, which is very different from G4.


We are at the beginning of porting kvm to 970(fx) atm, which is a
lot closer to a G4 than any BookE.


Alex, which deployment of the 970 are you targeting:
1) IBM JS21/22 blades, that actually have a hypervisor already present
2) Apple G5, Bare metal, but has most hypervisor features physically  
disabled
3) Any non-Book3E, which we call classic like 604, 750... and/or  
Book3S, G3, G4, G5, P3, P4


If you choose (1), then your work would be harder but it should apply  
to any IBM PPC64 or pSeries product
If you choose (2), then your work could be much easier, but it would  
apply to G5s only.

If you choose (3), then it's about the same as (2).

Another question is, when you do create your virtual machine, do you  
intend for it to look exactly like a G5 machine (and support an  
unmodified MacOS), a pSeries Machine (and emulate the pSeries  
Hypervisor), or some new Machine that will require further  
modifications to the OSes you will support?


BTW: I do not intend to discourage, and would be thrilled to see _any_  
of the above explored.

-JX





For now, there's no usable code yet though.

Alex



--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: kvm on G4 processors?

2009-02-18 Thread Alexander Graf


On 18.02.2009, at 13:19, Jimi Xenidis wrote:



On Feb 18, 2009, at 3:21 AM, Alexander Graf wrote:



On 17.02.2009, at 09:32, Liu Yu-B13201 yu@freescale.com  
wrote:






-Original Message-
From: kvm-ppc-ow...@vger.kernel.org
[mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Roberto  
Innocenti

Sent: Tuesday, February 17, 2009 4:26 PM
To: kvm-ppc@vger.kernel.org
Subject: kvm on G4 processors?

I have tried to compile kernel 2.6.27 with kvm support on my PowerBook
G4, but the kvm option is not visible because the kernel menu config only
permits compiling the kvm kernel module if you have a PowerPC 440
architecture and not a G4.
Does kvm really not work on G4 processors? Is the architecture so different?
If kvm does work, how do I compile the kernel module on my G4?



I'm afraid that KVM does not support G4 at present.
440 belongs to the BookE architecture, which is very different from G4.


We are at the beginning of porting kvm to 970(fx) atm, which is a
lot closer to a G4 than any BookE.


Alex, which deployment of the 970 are you targeting:
1) IBM JS21/22 blades, that actually have a hypervisor already present
2) Apple G5, Bare metal, but has most hypervisor features physically  
disabled
3) Any non-Book3E, which we call classic like 604, 750... and/or  
Book3S, G3, G4, G5, P3, P4


If you choose (1), then your work would be harder but it should  
apply to any IBM PPC64 or pSeries product
If you choose (2), then your work could be much easier, but it would  
apply to G5s only.

If you choose (3), then it's about the same as (2).


Right now we're targeting the PS3, as that's the platform we have most  
free machines of here ;-). But the code as is should work for any bare  
metal 970.
I haven't really looked into the hypervisor bits yet, but targeting  
iSeries is definitely on the list. AFAIK we only need to take a deeper  
look at that when we get to implement the MMU bits.


Another question is, when you do create your virtual machine, do you  
intend for it to look exactly like a G5 machine (and support an  
unmodified MacOS), a pSeries Machine (and emulate the pSeries  
Hypervisor), or some new Machine that will require further  
modifications to the OSes you will support?


I thought pSeries were the ones without Hypervisor? Basically the idea  
is to expose a random bare-metal CPU to the userspace, with qemu  
implementing the rest. One thing I was thinking of was even to go as  
far as implementing a G3 guest on a POWER4+ host, but for now the plan  
is 970 on 970.


Alex

BTW: I do not intend to discourage, and would be thrilled to see  
_any_ of the above explored.

-JX





Re: kvm on G4 processors?

2009-02-18 Thread Alexander Graf


On 18.02.2009, at 14:10, Jimi Xenidis wrote:



On Feb 18, 2009, at 6:53 AM, Alexander Graf wrote:



On 18.02.2009, at 13:19, Jimi Xenidis wrote:



On Feb 18, 2009, at 3:21 AM, Alexander Graf wrote:



On 17.02.2009, at 09:32, Liu Yu-B13201 yu@freescale.com  
wrote:






-Original Message-
From: kvm-ppc-ow...@vger.kernel.org
[mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Roberto  
Innocenti

Sent: Tuesday, February 17, 2009 4:26 PM
To: kvm-ppc@vger.kernel.org
Subject: kvm on G4 processors?

I have tried to compile kernel 2.6.27 with kvm support on my PowerBook
G4, but the kvm option is not visible because the kernel menu config only
permits compiling the kvm kernel module if you have a PowerPC 440
architecture and not a G4.
Does kvm really not work on G4 processors? Is the architecture so different?
If kvm does work, how do I compile the kernel module on my G4?



I'm afraid that KVM does not support G4 at present.
440 belongs to the BookE architecture, which is very different from G4.


We are at the beginning of porting kvm to 970(fx) atm, which is a
lot closer to a G4 than any BookE.


Alex, which deployment of the 970 are you targeting:
1) IBM JS21/22 blades, that actually have a hypervisor already  
present
2) Apple G5, Bare metal, but has most hypervisor features  
physically disabled
3) Any non-Book3E, which we call classic like 604, 750... and/or  
Book3S, G3, G4, G5, P3, P4


If you choose (1), then your work would be harder but it should  
apply to any IBM PPC64 or pSeries product
If you choose (2), then your work could be much easier, but it  
would apply to G5s only.

If you choose (3), then it's about the same as (2).


Right now we're targeting the PS3, as that's the platform we have  
most free machines of here ;-).


Do you mean Cell blade, or a PS3?


Currently PS3. Though I did test stuff on a 970 PowerStation and a  
QS22 in parallel.



But the code as is should work for any bare metal 970.


PS3s come with Sony's Hypervisor, which is different from the pSeries
Hypervisor and far different from a bare-metal 970; only Apple
G5s qualify for that name.


If your intention is to work entirely above the PPC abstracted Linux  
environment then that should be interesting.


I don't really see how we need to work around anything. Basically the  
guest in these hypervisors still sees things as if they were bare  
metal, no?


I haven't really looked into the hypervisor bits yet, but targeting  
iSeries is definitely on the list.


This has little to do with iSeries LPAR and more to do with the
Hypervisor introduced to all pSeries products on IBM 970 and P5 and
beyond.


Hm - no idea on that one. I haven't really looked into all possible  
combinations yet. But so far our code doesn't do too much different  
from a real OS supervisor-unprivileged context switch.


AFAIK we only need to take a deeper look at that when we get to  
implement the MMU bits.


I expect exception handlers to be your first big worry.


Yes. We're at that right now. Actually hijacking the host's handlers  
does work for most cases already, jumping into the guest worked too  
and jumping out is what we're at atm.


Alex


The MMU will _indeed_ be interesting.




Another question is, when you do create your virtual machine, do  
you intend for it to look exactly like a G5 machine (and support  
an unmodified MacOS), a pSeries Machine (and emulate the pSeries  
Hypervisor), or some new Machine that will require further  
modifications to the OSes you will support?


I thought pSeries were the ones without Hypervisor?


As of 970 and P5, _everything_ produced has a hypervisor present,
regardless of whether it supports multiple LPARs or not.

This is also the case with Sony's PS/3.

Basically the idea is to expose a random bare-metal CPU to the  
userspace, with qemu implementing the rest. One thing I was  
thinking of was even to go as far as implementing a G3 guest on a  
POWER4+ host, but for now the plan is 970 on 970.




970 on 970 should work nicely, and if you restrict yourself to the
basic architecture then what you do should work well on anything.


-JX



Alex

BTW: I do not intend to discourage, and would be thrilled to see  
_any_ of the above explored.

-JX






