Re: Convert KVM to VirtualBox

2010-02-04 Thread satimis

Hi Alex,

Thanks for your advice.


> Just qemu-img convert the image to raw and use that as disk image in
> vbox. Make sure you configure your vm to the same hw as on kvm (apic,
> piix3, etc)


Could you please explain in more detail what you mean by "configure your
vm to the same hw as on kvm (apic, piix3, etc)"?  The VM's image file is
*.qcow2.  I'll run VirtualBox on KVM if possible.  The former is
desktop virtualization.  A further thought: can the VM be converted
across PCs?  That is, VirtualBox would be running on another PC.




> But seriously, why would anyone want to go this direction?


I can't get sound to work in the VM, so I want to try VirtualBox to
check whether it solves this problem.  I'll run VirtualBox on
KVM if possible.  The former is desktop virtualization.



B.R.
Stephen



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Convert KVM to VirtualBox

2010-02-04 Thread Alexander Graf
sati...@pacific.net.hk wrote:
> Hi Alex,
>
> Thanks for your advice.
>
> > Just qemu-img convert the image to raw and use that as disk image in
> > vbox. Make sure you configure your vm to the same hw as on kvm (apic,
> > piix3, etc)
>
> Could you please explain in more detail ... configure your vm to the
> same hw as on kvm (apic, piix3, etc)?  The image file of vm is
> *.qcow2.  I'll run VirtualBox on KVM if possible.  The former is
> desktop virtualization.  Ah a further thought can convert vm across
> PCs?  That is VirtualBox is running on another PC.

VBox doesn't work with KVM. They're basically rival products, though
I'd love to see VBox adapt to make use of the KVM kernel module.

I'm sorry I can't help you with the configuration bits any further. The
only time I ever used VBox was to do a patch that allows VBox and KVM to
coexist. Just go through the virtual machine configuration settings and
try changing them until it works.


> > But seriously, why would anyone want to go this direction?
>
> I can't get sound work on vm.  Therefore I expect trying VirtualBox to
> check whether it can sort out this problem.  I'll run VirtualBox on
> KVM if possible.  The former is desktop virtualization.

Nope, not possible atm.

Alex


Re: Convert KVM to VirtualBox

2010-02-04 Thread Gildas Le Nadan

Alexander Graf wrote:

> But seriously, why would anyone want to go this direction?
>
> Alex


Hi Alex

Last time I checked, the advantages of VirtualBox over KVM were (on the
technical side):

- SATA support
- USB 2 support
- HD audio support
- RDP / RDP+USB support
- a somewhat simpler network configuration method

and, icing on the cake, they have a CLI/API to ease
administration/integration.


(Yes, most of those are in the PUEL version, but still, it works and is
free as in free beer)


Cheers,
Gildas


Re: [PATCH 4/4] KVM: Rework of guest debug state writing

2010-02-04 Thread Marcelo Tosatti
On Thu, Feb 04, 2010 at 01:33:50AM +0100, Jan Kiszka wrote:
> Marcelo Tosatti wrote:
> > On Wed, Feb 03, 2010 at 10:29:45PM +0100, Jan Kiszka wrote:
> > > So far we synchronized any dirty VCPU state back into the kernel before
> > > updating the guest debug state. This was a tribute to a deficit in x86
> > > kernels before 2.6.33. But as this is an arch-dependent issue, it is
> > > better handled in the x86 part of KVM and remove the writeback point for
> > > generic code.
> >
> > Jan,
> >
> > This patch breaks migration.
>
> Can you elaborate what you did? I can't reproduce, and I do not see any
> conceptual issue (given that guest debugging conflicts with migration
> anyway).

kvm-autotest fails (migration only, install is ok, both Linux and Win
guests). Not sure why, perhaps the unconditional KVM_SET_GUEST_DEBUG
corrupts state somehow? 

Tested with io thread enabled.



Re: [PATCH 3/8] KVM: Activate fpu on clts

2010-02-04 Thread Avi Kivity

On 02/02/2010 10:16 AM, Paolo Bonzini wrote:

> On 01/21/2010 02:31 PM, Avi Kivity wrote:
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index feca59f..09207ba 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -3266,6 +3266,7 @@ int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address)
> >
> >  int emulate_clts(struct kvm_vcpu *vcpu)
> >  {
> >      kvm_x86_ops->set_cr0(vcpu, kvm_read_cr0_bits(vcpu, ~X86_CR0_TS));
> > +    kvm_x86_ops->fpu_activate(vcpu);
> >      return X86EMUL_CONTINUE;
> >  }
>
> Can this code be reached if CLTS is executed in real mode?  That would
> cause a NULL-pointer access on VMX.


How would this cause a null pointer access?

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH 3/8] KVM: Activate fpu on clts

2010-02-04 Thread Gleb Natapov
On Thu, Feb 04, 2010 at 03:05:17PM +0200, Avi Kivity wrote:
> On 02/02/2010 10:16 AM, Paolo Bonzini wrote:
> > On 01/21/2010 02:31 PM, Avi Kivity wrote:
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index feca59f..09207ba 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -3266,6 +3266,7 @@ int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address)
> > >  int emulate_clts(struct kvm_vcpu *vcpu)
> > >  {
> > >      kvm_x86_ops->set_cr0(vcpu, kvm_read_cr0_bits(vcpu, ~X86_CR0_TS));
> > > +    kvm_x86_ops->fpu_activate(vcpu);
> > >      return X86EMUL_CONTINUE;
> > >  }
> >
> > Can this code be reached if CLTS is executed in real mode?  That
> > would cause a NULL-pointer access on VMX.
>
> How would this cause a null pointer access?
>
vmx.c doesn't initialize kvm_x86_ops->fpu_activate as far as I see.

--
Gleb.
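
The point about an uninitialized ops member can be illustrated with a small stand-alone sketch. It is not the kernel code: `struct demo_ops`, `demo_vmx_ops` and `demo_emulate_clts` are hypothetical names, and the only fact relied on is that a member left out of a designated initializer is NULL in C, so an unguarded call through it would fault.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an ops table such as kvm_x86_ops. */
struct demo_ops {
    void (*set_cr0)(int *cr0, int value);     /* mandatory hook */
    void (*fpu_activate)(int *fpu_active);    /* hook a backend may omit */
};

static void demo_set_cr0(int *cr0, int value)
{
    *cr0 = value;
}

/* fpu_activate is not listed here, so C initializes it to NULL. */
static const struct demo_ops demo_vmx_ops = {
    .set_cr0 = demo_set_cr0,
};

/* Returns 1 if the optional hook ran, 0 if the backend did not provide it. */
static int demo_emulate_clts(const struct demo_ops *ops, int *cr0,
                             int *fpu_active)
{
    assert(ops->set_cr0 != NULL);
    ops->set_cr0(cr0, *cr0 & ~0x8);           /* clear CR0.TS (bit 3) */
    if (ops->fpu_activate == NULL)
        return 0;                             /* an unguarded call would crash here */
    ops->fpu_activate(fpu_active);
    return 1;
}
```

Whether the real fix is a guard like this or an initialized callback in vmx.c is not settled in this message; the sketch only shows why the missing initializer matters.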


Re: [PATCH] emulate accessed bit for EPT

2010-02-04 Thread Rik van Riel

On 02/03/2010 11:12 PM, Balbir Singh wrote:

> * Rik van Riel <r...@redhat.com> [2010-02-03 16:11:03]:
>
> > Currently KVM pretends that pages with EPT mappings never got
> > accessed.  This has some side effects in the VM, like swapping
> > out actively used guest pages and needlessly breaking up actively
> > used hugepages.
> >
> > We can avoid those very costly side effects by emulating the
> > accessed bit for EPT PTEs, which should only be slightly costly
> > because pages pass through page_referenced infrequently.
>
> Quite a clever implementation, one side effect is that one would see a
> larger number of minor faults with EPT enabled and an increase in
> allocation/frees of rmap entries, but that can be easily explained.


I suspect it won't be very many. I have been monitoring
/proc/meminfo on my system while testing this patch, and
it is quite typical that the size of the inactive anon
list does not change for minutes at a time.

In other words, no pages are moved onto or off of the
inactive anon list for several minutes. That corresponds
to a very small number of minor faults introduced by my
patch.

Of course, when the system is swapping, we will have more
minor faults.  However, minor faults should be less of a
performance issue than major faults :)

--
All rights reversed.
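
The kind of monitoring described above could be done with a small helper like the one below. This is only a sketch: the function names are made up, and it assumes the usual `Name:   value kB` layout of /proc/meminfo and that the field of interest is `Inactive(anon)`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Look up one field (e.g. "Inactive(anon)") in /proc/meminfo-style text.
 * Returns the value in kB, or -1 if the field is not present. */
static long meminfo_field_kb(const char *text, const char *key)
{
    size_t keylen = strlen(key);
    const char *line = text;

    assert(keylen > 0);
    while (line && *line) {
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            long kb;
            if (sscanf(line + keylen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');
        if (line)
            line++;                 /* step past the newline */
    }
    return -1;
}

/* Read the live value; returns -1 if /proc/meminfo is unavailable. */
static long read_inactive_anon_kb(void)
{
    char buf[8192];
    size_t n;
    FILE *f = fopen("/proc/meminfo", "r");

    if (!f)
        return -1;
    n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    return meminfo_field_kb(buf, "Inactive(anon)");
}
```

Calling `read_inactive_anon_kb()` at intervals and comparing successive values would show whether the inactive anon list is changing size.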


Re: Convert KVM to VirtualBox

2010-02-04 Thread Yann Hamon

> Hi folks,

Hello Satimis,

> I need converting KVM (qcow2) to VirtualBox (vdi) from one PC to
> another PC.

I had to do this recently to run a KVM virtual machine on a Windows PC with
VirtualBox.

> Do you have any idea/suggestion where can I find relevant
> documentation?

Although you can convert to raw and then to VDI with VBoxManage, I found it
easier to just use VMware's VMDK format, which is well supported by both VMware
and VirtualBox, and you can use qemu-img to convert from one to the other.

Good luck


--
Files attached to this email may be in ISO 26300 format (OASIS Open Document 
Format). If you have difficulty opening them, please visit http://iso26300.info 
for more information.



Seabios incompatible with Linux 2.6.26 host?

2010-02-04 Thread Pierre Riteau
Hello,
I'm having trouble running the latest qemu-kvm code on Debian Lenny (Linux 
2.6.26).
qemu-kvm dies with an error like this one:
exception 13 (0)
rax 0010 rbx 8c00 rcx 6ebe rdx 000c8c00
rsi e201 rdi 000c rsp 6eb4 rbp e201
r8   r9   r10  r11
r12  r13  r14  r15
rip 000fdeb0 rflags 00033002
cs  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ds  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
es  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ss  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
fs  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
gs  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
tr  (feffd000/2088 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
ldt  (/ p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0)
gdt f7a20/37
idt f8aa0/0
cr0 10 cr2 0 cr3 0 cr4 0 cr8 0 efer 0
code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 -- 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

I think I traced the issue back to the switch from Bochs BIOS to SeaBIOS.
With Bochs BIOS forced through the -bios option,
5f08bb45861f54be478b25075b90d2406a0f8bb3 works, while it dies without the
-bios override. Unfortunately, newer versions don't seem to work with
Bochs BIOS.

Upgrading the host kernel to 2.6.32 (Debian Squeeze) solves the issue. There
is no problem on Fedora 12 either.

-- 
Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
http://perso.univ-rennes1.fr/pierre.riteau/



Re: [PATCH 4/4] KVM: Rework of guest debug state writing

2010-02-04 Thread Jan Kiszka
Marcelo Tosatti wrote:
> On Thu, Feb 04, 2010 at 01:33:50AM +0100, Jan Kiszka wrote:
> > Marcelo Tosatti wrote:
> > > On Wed, Feb 03, 2010 at 10:29:45PM +0100, Jan Kiszka wrote:
> > > > So far we synchronized any dirty VCPU state back into the kernel before
> > > > updating the guest debug state. This was a tribute to a deficit in x86
> > > > kernels before 2.6.33. But as this is an arch-dependent issue, it is
> > > > better handled in the x86 part of KVM and remove the writeback point for
> > > > generic code.
> > > Jan,
> > >
> > > This patch breaks migration.
> > Can you elaborate what you did? I can't reproduce, and I do not see any
> > conceptual issue (given that guest debugging conflicts with migration
> > anyway).
>
> kvm-autotest fails (migration only, install is ok, both Linux and Win
> guests). Not sure why, perhaps the unconditional KVM_SET_GUEST_DEBUG
> corrupts state somehow?
>
> Tested with io thread enabled.

That's this default-off thing, so... OK, confirmed, investigating.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


Re: [PATCH] emulate accessed bit for EPT

2010-02-04 Thread Balbir Singh
* Rik van Riel <r...@redhat.com> [2010-02-04 08:40:43]:

> On 02/03/2010 11:12 PM, Balbir Singh wrote:
> > * Rik van Riel <r...@redhat.com> [2010-02-03 16:11:03]:
> >
> > > Currently KVM pretends that pages with EPT mappings never got
> > > accessed.  This has some side effects in the VM, like swapping
> > > out actively used guest pages and needlessly breaking up actively
> > > used hugepages.
> > >
> > > We can avoid those very costly side effects by emulating the
> > > accessed bit for EPT PTEs, which should only be slightly costly
> > > because pages pass through page_referenced infrequently.
> >
> > Quite a clever implementation, one side effect is that one would see a
> > larger number of minor faults with EPT enabled and an increase in
> > allocation/frees of rmap entries, but that can be easily explained.
>
> I suspect it won't be very many. I have been monitoring
> /proc/meminfo on my system while testing this patch, and
> it is quite typical that the size of the inactive anon
> list does not change for minutes at a time.
>
> In other words, no pages are moved onto or off of the
> inactive anon list for several minutes. That corresponds
> to a very small number of minor faults introduced by my
> patch.
>
> Of course, when the system is swapping, we will have more
> minor faults.  However, minor faults should be less of a
> performance issue than major faults :)


I do agree with you. 

-- 
Balbir


[PATCH 00/20] qemu-kvm: vhost net port

2010-02-04 Thread Michael S. Tsirkin
This is a port to qemu-kvm of the vhost v1 patch set I posted previously,
for those who want to get good performance out of it :)
It includes irqchip support and a merge fixup on top of the upstream patch.

Michael S. Tsirkin (20):
  exec: memory notifiers
  kvm: move kvm_set_phys_mem around
  kvm: move kvm to use memory notifiers
  qemu-kvm: fixup after merging memory notifiers
  kvm: add API to set ioeventfd
  notifier: event notifier implementation
  virtio: add notifier support
  virtio: add APIs for queue fields
  virtio: add status change callback
  virtio: move typedef to qemu-common
  virtio-pci: fill in notifier support
  tap: add interface to get device fd
  vhost: vhost net support
  tap: add vhost/vhostfd options
  tap: add API to retrieve vhost net header
  virtio-net: vhost net support
  qemu-kvm: add vhost.h header
  kvm: irqfd support
  msix: add mask/unmask notifiers
  virtio-pci: irqfd support

 Makefile.target           |    2 +
 cpu-common.h              |   19 ++
 exec.c                    |  111 -
 hw/msix.c                 |   36 +++-
 hw/msix.h                 |    1 +
 hw/notifier.c             |   50
 hw/notifier.h             |   16 ++
 hw/pci.h                  |    6 +
 hw/s390-virtio-bus.c      |    3 +
 hw/syborg_virtio.c        |    2 +
 hw/vhost.c                |  603 +
 hw/vhost.h                |   44
 hw/vhost_net.c            |  147 +++
 hw/vhost_net.h            |   20 ++
 hw/virtio-net.c           |   67 +-
 hw/virtio-pci.c           |   95 +++
 hw/virtio.c               |   52 -
 hw/virtio.h               |   15 +-
 kvm-all.c                 |  353 ---
 kvm.h                     |   34 ++-
 kvm/include/linux/vhost.h |  130 ++
 net.c                     |    8 +
 net/tap.c                 |   43
 net/tap.h                 |    5 +
 qemu-common.h             |    2 +
 qemu-kvm.c                |    1 +
 qemu-options.hx           |    4 +-
 27 files changed, 1704 insertions(+), 165 deletions(-)
 create mode 100644 hw/notifier.c
 create mode 100644 hw/notifier.h
 create mode 100644 hw/vhost.c
 create mode 100644 hw/vhost.h
 create mode 100644 hw/vhost_net.c
 create mode 100644 hw/vhost_net.h
 create mode 100644 kvm/include/linux/vhost.h


[PATCH 04/20] qemu-kvm: fixup after merging memory notifiers

2010-02-04 Thread Michael S. Tsirkin
qemu-kvm.c must register notifier as well

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 kvm-all.c  |4 
 qemu-kvm.c |1 +
 2 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index f31585e..51273e4 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -536,6 +536,8 @@ void kvm_set_phys_mem(target_phys_addr_t start_addr,
 }
 }
 
+#endif
+
 static void kvm_client_set_memory(struct CPUPhysMemoryClient *client,
  target_phys_addr_t start_addr,
  ram_addr_t size,
@@ -563,6 +565,8 @@ static CPUPhysMemoryClient kvm_cpu_phys_memory_client = {
.migration_log = kvm_client_migration_log,
 };
 
+#ifdef KVM_UPSTREAM
+
 int kvm_init(int smp_cpus)
 {
 static const char upgrade_note[] =
diff --git a/qemu-kvm.c b/qemu-kvm.c
index a305907..f7b2dda 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -406,6 +406,7 @@ int kvm_init(int smp_cpus)
         for (i = gsi_count; i < gsi_bits; i++)
             set_gsi(kvm_context, i);
     }
+    cpu_register_phys_memory_client(&kvm_cpu_phys_memory_client);
 
     pthread_mutex_lock(&qemu_mutex);
     return kvm_create_context();
-- 
1.6.6.144.g5c3af



[PATCH 05/20] kvm: add API to set ioeventfd

2010-02-04 Thread Michael S. Tsirkin
This adds an API to set an ioeventfd in kvm,
as well as stubs for the non-eventfd case,
making it possible for users to use this API
without ifdefs.

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 kvm-all.c |   20 
 kvm.h |   16 
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 51273e4..6dbe480 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1134,4 +1134,24 @@ void kvm_remove_all_breakpoints(CPUState *current_env)
 }
 #endif /* !KVM_CAP_SET_GUEST_DEBUG */
 
+#ifdef KVM_IOEVENTFD
+int kvm_set_ioeventfd(uint16_t addr, uint16_t data, int fd, bool assigned)
+{
+    struct kvm_ioeventfd kick = {
+        .datamatch = data,
+        .addr = addr,
+        .len = 2,
+        .flags = KVM_IOEVENTFD_FLAG_DATAMATCH | KVM_IOEVENTFD_FLAG_PIO,
+        .fd = fd,
+    };
+    int r;
+    if (!assigned)
+        kick.flags |= KVM_IOEVENTFD_FLAG_DEASSIGN;
+    r = kvm_vm_ioctl(kvm_state, KVM_IOEVENTFD, &kick);
+    if (r < 0)
+        return r;
+    return 0;
+}
+#endif
+
 #include "qemu-kvm.c"
diff --git a/kvm.h b/kvm.h
index 7b33177..365d8b1 100644
--- a/kvm.h
+++ b/kvm.h
@@ -14,6 +14,8 @@
 #ifndef QEMU_KVM_H
 #define QEMU_KVM_H
 
+#include <stdbool.h>
+#include <errno.h>
 #include "config.h"
 #include "qemu-queue.h"
 #include "qemu-kvm.h"
@@ -21,6 +23,10 @@
 #ifdef KVM_UPSTREAM
 
 #ifdef CONFIG_KVM
+#include <linux/kvm.h>
+#endif
+
+#ifdef CONFIG_KVM
 extern int kvm_allowed;
 
 #define kvm_enabled() (kvm_allowed)
@@ -143,4 +149,14 @@ static inline void cpu_synchronize_state(CPUState *env)
 
 #endif
 
+#if defined(KVM_IOEVENTFD) && defined(CONFIG_KVM)
+int kvm_set_ioeventfd(uint16_t addr, uint16_t data, int fd, bool assigned);
+#else
+static inline
+int kvm_set_ioeventfd(uint16_t data, uint16_t addr, int fd, bool assigned)
+{
+    return -ENOSYS;
+}
+#endif
+
 #endif
-- 
1.6.6.144.g5c3af
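
The "stubs so callers need no ifdefs" idea in this patch can be shown in isolation. The sketch below is not the qemu code: `DEMO_HAVE_IOEVENTFD`, `demo_set_ioeventfd` and `demo_try_fast_notify` are hypothetical names standing in for the KVM_IOEVENTFD/CONFIG_KVM checks and the real API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* DEMO_HAVE_IOEVENTFD is a hypothetical feature macro standing in for
 * the KVM_IOEVENTFD / CONFIG_KVM checks in the patch. */
#ifdef DEMO_HAVE_IOEVENTFD
int demo_set_ioeventfd(uint16_t addr, uint16_t data, int fd, bool assigned);
#else
/* Stub: same signature, always reports "not implemented". */
static inline int demo_set_ioeventfd(uint16_t addr, uint16_t data,
                                     int fd, bool assigned)
{
    (void)addr; (void)data; (void)fd; (void)assigned;
    return -ENOSYS;
}
#endif

/* A caller needs no #ifdef: it checks the return value instead. */
static bool demo_try_fast_notify(uint16_t addr, uint16_t queue, int fd)
{
    int r = demo_set_ioeventfd(addr, queue, fd, true);

    assert(r <= 0);             /* 0 on success, negative errno on failure */
    return r == 0;
}
```

With the macro undefined, `demo_try_fast_notify()` simply returns false and the caller can fall back to a slower path, which is presumably the intent of returning -ENOSYS from the stub.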



[PATCH 06/20] notifier: event notifier implementation

2010-02-04 Thread Michael S. Tsirkin
Event notifiers are slightly generalized eventfd descriptors. The current
implementation depends on eventfd because vhost is the only user, and
vhost depends on eventfd anyway, but a stub is provided for the
non-eventfd case.

We'll be able to generalize this further when another user comes along
and we see how best to do it.

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 Makefile.target |1 +
 hw/notifier.c   |   50 ++
 hw/notifier.h   |   16 
 qemu-common.h   |1 +
 4 files changed, 68 insertions(+), 0 deletions(-)
 create mode 100644 hw/notifier.c
 create mode 100644 hw/notifier.h

diff --git a/Makefile.target b/Makefile.target
index 0ddd2a0..7df9535 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -183,6 +183,7 @@ obj-y = vl.o async.o monitor.o pci.o pci_host.o pcie_host.o machine.o gdbstub.o
 # virtio has to be here due to weird dependency between PCI and virtio-net.
 # need to fix this properly
 obj-y += virtio-blk.o virtio-balloon.o virtio-net.o virtio-pci.o virtio-serial-bus.o
+obj-y += notifier.o
 obj-$(CONFIG_KVM) += kvm.o kvm-all.o
 # MSI-X depends on kvm for interrupt injection,
 # so moved it from Makefile.objs to Makefile.target for now
diff --git a/hw/notifier.c b/hw/notifier.c
new file mode 100644
index 000..dff38de
--- /dev/null
+++ b/hw/notifier.c
@@ -0,0 +1,50 @@
+#include "hw.h"
+#include "notifier.h"
+#ifdef CONFIG_EVENTFD
+#include <sys/eventfd.h>
+#endif
+
+int event_notifier_init(EventNotifier *e, int active)
+{
+#ifdef CONFIG_EVENTFD
+    int fd = eventfd(!!active, EFD_NONBLOCK | EFD_CLOEXEC);
+    if (fd < 0)
+        return -errno;
+    e->fd = fd;
+    return 0;
+#else
+    return -ENOSYS;
+#endif
+}
+
+void event_notifier_cleanup(EventNotifier *e)
+{
+    close(e->fd);
+}
+
+int event_notifier_get_fd(EventNotifier *e)
+{
+    return e->fd;
+}
+
+int event_notifier_test_and_clear(EventNotifier *e)
+{
+    uint64_t value;
+    int r = read(e->fd, &value, sizeof value);
+    return r == sizeof value;
+}
+
+int event_notifier_test(EventNotifier *e)
+{
+    uint64_t value;
+    int r = read(e->fd, &value, sizeof value);
+    if (r == sizeof value) {
+        /* restore previous value. */
+        int s = write(e->fd, &value, sizeof value);
+        /* never blocks because we use EFD_SEMAPHORE.
+         * If we didn't we'd get EAGAIN on overflow
+         * and we'd have to write code to ignore it. */
+        assert(s == sizeof value);
+    }
+    return r == sizeof value;
+}
diff --git a/hw/notifier.h b/hw/notifier.h
new file mode 100644
index 000..24117ea
--- /dev/null
+++ b/hw/notifier.h
@@ -0,0 +1,16 @@
+#ifndef QEMU_EVENT_NOTIFIER_H
+#define QEMU_EVENT_NOTIFIER_H
+
+#include "qemu-common.h"
+
+struct EventNotifier {
+   int fd;
+};
+
+int event_notifier_init(EventNotifier *, int active);
+void event_notifier_cleanup(EventNotifier *);
+int event_notifier_get_fd(EventNotifier *);
+int event_notifier_test_and_clear(EventNotifier *);
+int event_notifier_test(EventNotifier *);
+
+#endif
diff --git a/qemu-common.h b/qemu-common.h
index bf14a22..5e935d1 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -224,6 +224,7 @@ typedef struct uWireSlave uWireSlave;
 typedef struct I2SCodec I2SCodec;
 typedef struct DeviceState DeviceState;
 typedef struct SSIBus SSIBus;
+typedef struct EventNotifier EventNotifier;
 
 /* CPU save/load.  */
 void cpu_save(QEMUFile *f, void *opaque);
-- 
1.6.6.144.g5c3af
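
To see the test-and-clear behaviour outside qemu, here is a minimal Linux-only sketch built directly on eventfd(2). The `demo_notifier_*` names are hypothetical, and `demo_notifier_set()` is an addition for illustration that the patch itself does not provide.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Hypothetical miniature of the notifier in the patch. */
struct demo_notifier {
    int fd;
};

static int demo_notifier_init(struct demo_notifier *n)
{
    int fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);

    if (fd < 0)
        return -errno;
    n->fd = fd;
    return 0;
}

/* Signal the notifier by adding 1 to the eventfd counter. */
static int demo_notifier_set(struct demo_notifier *n)
{
    uint64_t one = 1;

    return write(n->fd, &one, sizeof one) == (ssize_t)sizeof one ? 0 : -errno;
}

/* Returns 1 if the notifier was signalled, and resets the counter to 0.
 * On a non-blocking eventfd with a zero counter, read() fails with
 * EAGAIN, so this returns 0 without blocking. */
static int demo_notifier_test_and_clear(struct demo_notifier *n)
{
    uint64_t value;
    ssize_t r = read(n->fd, &value, sizeof value);

    return r == (ssize_t)sizeof value;
}

static void demo_notifier_cleanup(struct demo_notifier *n)
{
    assert(n->fd >= 0);
    close(n->fd);
    n->fd = -1;
}
```

Because the descriptor is created without EFD_SEMAPHORE here, a single read drains all pending signals at once, so two calls to `demo_notifier_set()` still yield only one positive test-and-clear.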



[PATCH 08/20] virtio: add APIs for queue fields

2010-02-04 Thread Michael S. Tsirkin
vhost needs physical addresses for ring and other queue fields,
so add APIs for these.

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 hw/virtio.c |   51 +++
 hw/virtio.h |   10 +-
 2 files changed, 56 insertions(+), 5 deletions(-)

diff --git a/hw/virtio.c b/hw/virtio.c
index b9411e9..65e59c1 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -73,6 +73,9 @@ struct VirtQueue
 int inuse;
 uint16_t vector;
 void (*handle_output)(VirtIODevice *vdev, VirtQueue *vq);
+VirtIODevice *vdev;
+EventNotifier guest_notifier;
+EventNotifier host_notifier;
 };
 
 /* virt queue functions */
@@ -592,10 +595,10 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
     return &vdev->vq[i];
 }
 
-void virtio_irq(VirtIODevice *vdev, VirtQueue *vq)
+void virtio_irq(VirtQueue *vq)
 {
-    vdev->isr |= 0x01;
-    virtio_notify_vector(vdev, vq->vector);
+    vq->vdev->isr |= 0x01;
+    virtio_notify_vector(vq->vdev, vq->vector);
 }
 
 void virtio_notify(VirtIODevice *vdev, VirtQueue *vq)
@@ -606,7 +609,8 @@ void virtio_notify(VirtIODevice *vdev, VirtQueue *vq)
          (vq->inuse || vring_avail_idx(vq) != vq->last_avail_idx)))
         return;
 
-    virtio_irq(vdev, vq);
+    vdev->isr |= 0x01;
+    virtio_notify_vector(vdev, vq->vector);
 }
 
 void virtio_notify_config(VirtIODevice *vdev)
@@ -740,3 +744,42 @@ void virtio_bind_device(VirtIODevice *vdev, const VirtIOBindings *binding,
     vdev->binding = binding;
     vdev->binding_opaque = opaque;
 }
+
+target_phys_addr_t virtio_queue_get_desc(VirtIODevice *vdev, int n)
+{
+    return vdev->vq[n].vring.desc;
+}
+
+target_phys_addr_t virtio_queue_get_avail(VirtIODevice *vdev, int n)
+{
+    return vdev->vq[n].vring.avail;
+}
+
+target_phys_addr_t virtio_queue_get_used(VirtIODevice *vdev, int n)
+{
+    return vdev->vq[n].vring.used;
+}
+
+uint16_t virtio_queue_last_avail_idx(VirtIODevice *vdev, int n)
+{
+    return vdev->vq[n].last_avail_idx;
+}
+
+void virtio_queue_set_last_avail_idx(VirtIODevice *vdev, int n, uint16_t idx)
+{
+    vdev->vq[n].last_avail_idx = idx;
+}
+
+VirtQueue *virtio_queue(VirtIODevice *vdev, int n)
+{
+    return vdev->vq + n;
+}
+
+EventNotifier *virtio_queue_guest_notifier(VirtQueue *vq)
+{
+    return &vq->guest_notifier;
+}
+EventNotifier *virtio_queue_host_notifier(VirtQueue *vq)
+{
+    return &vq->host_notifier;
+}
diff --git a/hw/virtio.h b/hw/virtio.h
index 2c298a8..92ad5d1 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -183,5 +183,13 @@ void virtio_net_exit(VirtIODevice *vdev);
     DEFINE_PROP_BIT("indirect_desc", _state, _field, \
                     VIRTIO_RING_F_INDIRECT_DESC, true)
 
-void virtio_irq(VirtIODevice *vdev, VirtQueue *vq);
+target_phys_addr_t virtio_queue_get_desc(VirtIODevice *vdev, int n);
+target_phys_addr_t virtio_queue_get_avail(VirtIODevice *vdev, int n);
+target_phys_addr_t virtio_queue_get_used(VirtIODevice *vdev, int n);
+uint16_t virtio_queue_last_avail_idx(VirtIODevice *vdev, int n);
+void virtio_queue_set_last_avail_idx(VirtIODevice *vdev, int n, uint16_t idx);
+VirtQueue *virtio_queue(VirtIODevice *vdev, int n);
+EventNotifier *virtio_queue_guest_notifier(VirtQueue *vq);
+EventNotifier *virtio_queue_host_notifier(VirtQueue *vq);
+void virtio_irq(VirtQueue *vq);
 #endif
-- 
1.6.6.144.g5c3af



[PATCH 09/20] virtio: add status change callback

2010-02-04 Thread Michael S. Tsirkin
vhost net backend needs to be notified when
frontend status changes. Add a callback.

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 hw/s390-virtio-bus.c |3 +++
 hw/syborg_virtio.c   |2 ++
 hw/virtio-pci.c  |6 ++
 hw/virtio.h  |1 +
 4 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/hw/s390-virtio-bus.c b/hw/s390-virtio-bus.c
index 6b6dafc..a4ce734 100644
--- a/hw/s390-virtio-bus.c
+++ b/hw/s390-virtio-bus.c
@@ -243,6 +243,9 @@ void s390_virtio_device_update_status(VirtIOS390Device *dev)
     uint32_t features;
 
     vdev->status = ldub_phys(dev->dev_offs + VIRTIO_DEV_OFFS_STATUS);
+    if (vdev->set_status) {
+        vdev->set_status(vdev);
+    }
 
     /* Update guest supported feature bitmap */
 
diff --git a/hw/syborg_virtio.c b/hw/syborg_virtio.c
index 65239a0..19f6473 100644
--- a/hw/syborg_virtio.c
+++ b/hw/syborg_virtio.c
@@ -152,6 +152,8 @@ static void syborg_virtio_writel(void *opaque, target_phys_addr_t offset,
         vdev->status = value & 0xFF;
         if (vdev->status == 0)
             virtio_reset(vdev);
+        if (vdev->set_status)
+            vdev->set_status(vdev);
         break;
     case SYBORG_VIRTIO_INT_ENABLE:
         s->int_enable = value;
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 709d13e..dbb0b16 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -210,6 +210,9 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
             virtio_reset(proxy->vdev);
             msix_unuse_all_vectors(&proxy->pci_dev);
         }
+        if (vdev->set_status) {
+            vdev->set_status(vdev);
+        }
         break;
     case VIRTIO_MSI_CONFIG_VECTOR:
         msix_vector_unuse(&proxy->pci_dev, vdev->config_vector);
@@ -377,6 +380,9 @@ static void virtio_write_config(PCIDevice *pci_dev, uint32_t address,
     if (PCI_COMMAND == address) {
         if (!(val & PCI_COMMAND_MASTER)) {
             proxy->vdev->status &= ~VIRTIO_CONFIG_S_DRIVER_OK;
+            if (proxy->vdev->set_status) {
+                proxy->vdev->set_status(proxy->vdev);
+            }
         }
     }
 
diff --git a/hw/virtio.h b/hw/virtio.h
index 92ad5d1..235e7c4 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -114,6 +114,7 @@ struct VirtIODevice
 void (*get_config)(VirtIODevice *vdev, uint8_t *config);
 void (*set_config)(VirtIODevice *vdev, const uint8_t *config);
 void (*reset)(VirtIODevice *vdev);
+void (*set_status)(VirtIODevice *vdev);
 VirtQueue *vq;
 const VirtIOBindings *binding;
 void *binding_opaque;
-- 
1.6.6.144.g5c3af
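
The pattern this patch adds to every transport (store the new status, then call the hook if one is installed) can be reduced to the following sketch. The `demo_*` names and the `backend_running` field are hypothetical; what a real vhost backend does in its callback is an assumption here, not something this patch shows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_S_DRIVER_OK 4   /* same value as VIRTIO_CONFIG_S_DRIVER_OK */

struct demo_device {
    uint8_t status;
    void (*set_status)(struct demo_device *dev);  /* optional, may be NULL */
    int backend_running;                          /* hypothetical backend state */
};

/* Hypothetical backend hook: run only while the driver is ready. */
static void demo_backend_set_status(struct demo_device *dev)
{
    dev->backend_running = (dev->status & DEMO_S_DRIVER_OK) != 0;
}

/* Transport-side write of the status register: store, then notify
 * the device if it registered a callback. */
static void demo_write_status(struct demo_device *dev, uint8_t value)
{
    assert(dev != NULL);
    dev->status = value;
    if (dev->set_status)
        dev->set_status(dev);
}
```

Devices that leave `set_status` unset keep working unchanged, which is why each call site in the patch tests the pointer first.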



[PATCH 13/20] vhost: vhost net support

2010-02-04 Thread Michael S. Tsirkin
This adds vhost net support in qemu. It will be tied to the tap device and
virtio by the following patches.  The raw backend is currently missing; it
will be worked on and submitted separately.

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 Makefile.target |1 +
 hw/vhost.c  |  603 +++
 hw/vhost.h  |   44 
 hw/vhost_net.c  |  147 ++
 hw/vhost_net.h  |   20 ++
 5 files changed, 815 insertions(+), 0 deletions(-)
 create mode 100644 hw/vhost.c
 create mode 100644 hw/vhost.h
 create mode 100644 hw/vhost_net.c
 create mode 100644 hw/vhost_net.h

diff --git a/Makefile.target b/Makefile.target
index 7df9535..09ae105 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -184,6 +184,7 @@ obj-y = vl.o async.o monitor.o pci.o pci_host.o pcie_host.o machine.o gdbstub.o
 # need to fix this properly
 obj-y += virtio-blk.o virtio-balloon.o virtio-net.o virtio-pci.o virtio-serial-bus.o
 obj-y += notifier.o
+obj-y += vhost_net.o vhost.o
 obj-$(CONFIG_KVM) += kvm.o kvm-all.o
 # MSI-X depends on kvm for interrupt injection,
 # so moved it from Makefile.objs to Makefile.target for now
diff --git a/hw/vhost.c b/hw/vhost.c
new file mode 100644
index 000..e5c1ead
--- /dev/null
+++ b/hw/vhost.c
@@ -0,0 +1,603 @@
+#include <linux/vhost.h>
+#include <sys/ioctl.h>
+#include <sys/eventfd.h>
+#include "vhost.h"
+#include "hw/hw.h"
+/* For range_get_last */
+#include "pci.h"
+
+static void vhost_dev_sync_region(struct vhost_dev *dev,
+                                  uint64_t mfirst, uint64_t mlast,
+                                  uint64_t rfirst, uint64_t rlast)
+{
+    uint64_t start = MAX(mfirst, rfirst);
+    uint64_t end = MIN(mlast, rlast);
+    vhost_log_chunk_t *from = dev->log + start / VHOST_LOG_CHUNK;
+    vhost_log_chunk_t *to = dev->log + end / VHOST_LOG_CHUNK + 1;
+    uint64_t addr = (start / VHOST_LOG_CHUNK) * VHOST_LOG_CHUNK;
+
+    assert(end / VHOST_LOG_CHUNK < dev->log_size);
+    assert(start / VHOST_LOG_CHUNK < dev->log_size);
+    if (end < start) {
+        return;
+    }
+    for (;from < to; ++from) {
+        vhost_log_chunk_t log;
+        int bit;
+        /* We first check with non-atomic: much cheaper,
+         * and we expect non-dirty to be the common case. */
+        if (!*from) {
+            continue;
+        }
+        /* Data must be read atomically. We don't really
+         * need the barrier semantics of __sync
+         * builtins, but it's easier to use them than
+         * roll our own. */
+        log = __sync_fetch_and_and(from, 0);
+        while ((bit = sizeof(log) > sizeof(int) ?
+                ffsll(log) : ffs(log))) {
+            bit -= 1;
+            cpu_physical_memory_set_dirty(addr + bit * VHOST_LOG_PAGE);
+            log &= ~(0x1ull << bit);
+        }
+        addr += VHOST_LOG_CHUNK;
+    }
+}
+
+static int vhost_client_sync_dirty_bitmap(struct CPUPhysMemoryClient *client,
+                                          target_phys_addr_t start_addr,
+                                          target_phys_addr_t end_addr)
+{
+    struct vhost_dev *dev = container_of(client, struct vhost_dev, client);
+    int i;
+    if (!dev->log_enabled || !dev->started) {
+        return 0;
+    }
+    for (i = 0; i < dev->mem->nregions; ++i) {
+        struct vhost_memory_region *reg = dev->mem->regions + i;
+        vhost_dev_sync_region(dev, start_addr, end_addr,
+                              reg->guest_phys_addr,
+                              range_get_last(reg->guest_phys_addr,
+                                             reg->memory_size));
+    }
+    for (i = 0; i < dev->nvqs; ++i) {
+        struct vhost_virtqueue *vq = dev->vqs + i;
+        unsigned size = sizeof(struct vring_used_elem) * vq->num;
+        vhost_dev_sync_region(dev, start_addr, end_addr, vq->used_phys,
+                              range_get_last(vq->used_phys, size));
+    }
+    return 0;
+}
+
+/* Assign/unassign. Keep an unsorted array of non-overlapping
+ * memory regions in dev->mem. */
+static void vhost_dev_unassign_memory(struct vhost_dev *dev,
+ uint64_t start_addr,
+ uint64_t size)
+{
+   int from, to, n = dev->mem->nregions;
+   /* Track overlapping/split regions for sanity checking. */
+   int overlap_start = 0, overlap_end = 0, overlap_middle = 0, split = 0;
+
+   for (from = 0, to = 0; from < n; ++from, ++to) {
+   struct vhost_memory_region *reg = dev->mem->regions + to;
+   uint64_t reglast;
+   uint64_t memlast;
+   uint64_t change;
+
+   /* clone old region */
+   if (to != from) {
+   memcpy(reg, 
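
The dirty-log scan in vhost_dev_sync_region can be illustrated in isolation. The following is a minimal sketch, not the patch's code: LOG_PAGE, LOG_CHUNK and scan_dirty_log are hypothetical names, the chunk type is assumed to be 64 bits wide, and __sync_fetch_and_and and __builtin_ctzll are GCC/Clang builtins. Each chunk is checked cheaply first, then fetched and cleared atomically, and every set bit is turned into a page address. The address is advanced in the loop header so that clean chunks still move it forward.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for VHOST_LOG_PAGE and VHOST_LOG_CHUNK. */
#define LOG_PAGE  0x1000ull
#define LOG_BITS  (8 * sizeof(uint64_t))
#define LOG_CHUNK (LOG_PAGE * LOG_BITS)

/* Scan log[0..nchunks), clear it, and store the address of every dirty
 * page in out[]. Returns the number of dirty pages found. The caller
 * must provide room for all of them (checked with assert). */
static size_t scan_dirty_log(uint64_t *log, size_t nchunks, uint64_t base,
                             uint64_t *out, size_t max)
{
    size_t found = 0;
    uint64_t addr = base;

    for (size_t i = 0; i < nchunks; ++i, addr += LOG_CHUNK) {
        uint64_t chunk;

        /* Cheap non-atomic check: clean chunks are the common case. */
        if (!log[i]) {
            continue;
        }
        /* Fetch and clear atomically so concurrent writers lose nothing. */
        chunk = __sync_fetch_and_and(&log[i], 0);
        while (chunk) {
            int bit = __builtin_ctzll(chunk);   /* lowest set bit */

            assert(found < max);
            out[found++] = addr + (uint64_t)bit * LOG_PAGE;
            chunk &= chunk - 1;                 /* clear that bit */
        }
    }
    return found;
}
```

A caller would pass the log array, the guest physical address that bit 0 of chunk 0 corresponds to, and an output buffer large enough for the worst case.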

[PATCH 14/20] tap: add vhost/vhostfd options

2010-02-04 Thread Michael S. Tsirkin
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 net.c   |8 
 net/tap.c   |   29 +
 qemu-options.hx |4 +++-
 3 files changed, 40 insertions(+), 1 deletions(-)

diff --git a/net.c b/net.c
index 6ef93e6..b942d03 100644
--- a/net.c
+++ b/net.c
@@ -976,6 +976,14 @@ static struct {
 .name = "vnet_hdr",
 .type = QEMU_OPT_BOOL,
 .help = "enable the IFF_VNET_HDR flag on the tap interface"
+}, {
+.name = "vhost",
+.type = QEMU_OPT_BOOL,
+.help = "enable vhost-net network accelerator",
+}, {
+.name = "vhostfd",
+.type = QEMU_OPT_STRING,
+.help = "file descriptor of an already opened vhost net device",
 },
 #endif /* _WIN32 */
 { /* end of list */ }
diff --git a/net/tap.c b/net/tap.c
index 7e9ca79..d9f2e41 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -41,6 +41,8 @@
 
 #include "net/tap-linux.h"
 
+#include "hw/vhost_net.h"
+
 /* Maximum GSO packet size (64k) plus plenty of room for
  * the ethernet and virtio_net headers
  */
@@ -57,6 +59,7 @@ typedef struct TAPState {
 unsigned int has_vnet_hdr : 1;
 unsigned int using_vnet_hdr : 1;
 unsigned int has_ufo: 1;
+struct vhost_net *vhost_net;
 } TAPState;
 
 static int launch_script(const char *setup_script, const char *ifname, int fd);
@@ -252,6 +255,10 @@ static void tap_cleanup(VLANClientState *nc)
 {
 TAPState *s = DO_UPCAST(TAPState, nc, nc);
 
+if (s->vhost_net) {
+vhost_net_cleanup(s->vhost_net);
+}
+
 qemu_purge_queued_packets(nc);
 
 if (s->down_script[0])
@@ -307,6 +314,7 @@ static TAPState *net_tap_fd_init(VLANState *vlan,
 s->has_ufo = tap_probe_has_ufo(s->fd);
 tap_set_offload(&s->nc, 0, 0, 0, 0, 0);
 tap_read_poll(s, 1);
+s->vhost_net = NULL;
 return s;
 }
 
@@ -456,6 +464,27 @@ int net_init_tap(QemuOpts *opts, Monitor *mon, const char *name, VLANState *vlan
 }
 }
 
+if (qemu_opt_get_bool(opts, "vhost", 0)) {
+int vhostfd, r;
+if (qemu_opt_get(opts, "vhostfd")) {
+r = net_handle_fd_param(mon, qemu_opt_get(opts, "vhostfd"));
+if (r == -1) {
+return -1;
+}
+vhostfd = r;
+} else {
+vhostfd = -1;
+}
+s->vhost_net = vhost_net_init(&s->nc, vhostfd);
+if (!s->vhost_net) {
+qemu_error("vhost-net requested but could not be initialized\n");
+return -1;
+}
+} else if (qemu_opt_get(opts, "vhostfd")) {
+qemu_error("vhostfd= is not valid without vhost\n");
+return -1;
+}
+
 if (vlan) {
 vlan->nb_host_devs++;
 }
diff --git a/qemu-options.hx b/qemu-options.hx
index 5c1c398..4925461 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -832,7 +832,7 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
 "-net tap[,vlan=n][,name=str],ifname=name\n"
 "connect the host TAP network interface to VLAN 'n'\n"
 #else
-"-net tap[,vlan=n][,name=str][,fd=h][,ifname=name][,script=file][,downscript=dfile][,sndbuf=nbytes][,vnet_hdr=on|off]\n"
+"-net tap[,vlan=n][,name=str][,fd=h][,ifname=name][,script=file][,downscript=dfile][,sndbuf=nbytes][,vnet_hdr=on|off][,vhost=on|off][,vhostfd=h]\n"
 "connect the host TAP network interface to VLAN 'n' and use the\n"
 "network scripts 'file' (default=%s)\n"
 "and 'dfile' (default=%s)\n"
@@ -842,6 +842,8 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
 "default of 'sndbuf=1048576' can be disabled using 'sndbuf=0')\n"
 "use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap flag\n"
 "use vnet_hdr=on to make the lack of IFF_VNET_HDR support an error condition\n"
+"use vhost=on to enable experimental in kernel accelerator\n"
+"use 'vhostfd=h' to connect to an already opened vhost net device\n"
 #endif
 "-net socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"
 "connect the vlan 'n' to another VLAN using a socket connection\n"
-- 
1.6.6.144.g5c3af
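
The option handling added to net_init_tap above amounts to a small decision table over vhost= and vhostfd=. The sketch below restates it as a pure function. The enum and the function name are hypothetical and not part of QEMU; what vhost_net_init does with a descriptor of -1 is decided elsewhere in the series and is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of the two tap options. */
enum vhost_opt_result {
    VHOST_OPT_DISABLED,    /* vhost not requested: plain userspace tap */
    VHOST_OPT_DEFAULT_FD,  /* vhost=on, no vhostfd: backend gets fd -1 */
    VHOST_OPT_USE_FD,      /* vhost=on,vhostfd=h: use the passed descriptor */
    VHOST_OPT_INVALID      /* vhostfd given without vhost=on: error */
};

/* vhost is the boolean option; vhostfd is NULL when the option is absent. */
static enum vhost_opt_result classify_vhost_opts(bool vhost, const char *vhostfd)
{
    if (vhost) {
        return vhostfd ? VHOST_OPT_USE_FD : VHOST_OPT_DEFAULT_FD;
    }
    /* Same rule as the patch: vhostfd= is not valid without vhost. */
    return vhostfd ? VHOST_OPT_INVALID : VHOST_OPT_DISABLED;
}
```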



[PATCH 15/20] tap: add API to retrieve vhost net header

2010-02-04 Thread Michael S. Tsirkin
will be used by virtio-net for vhost net support

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 net/tap.c |7 +++
 net/tap.h |3 +++
 2 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/net/tap.c b/net/tap.c
index d9f2e41..166cf05 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -491,3 +491,10 @@ int net_init_tap(QemuOpts *opts, Monitor *mon, const char *name, VLANState *vlan
 
 return 0;
 }
+
+struct vhost_net *tap_get_vhost_net(VLANClientState *nc)
+{
+TAPState *s = DO_UPCAST(TAPState, nc, nc);
+assert(nc->info->type == NET_CLIENT_TYPE_TAP);
+return s->vhost_net;
+}
diff --git a/net/tap.h b/net/tap.h
index a244b28..b8cec83 100644
--- a/net/tap.h
+++ b/net/tap.h
@@ -50,4 +50,7 @@ void tap_fd_set_offload(int fd, int csum, int tso4, int tso6, int ecn, int ufo);
 
 int tap_get_fd(VLANClientState *vc);
 
+struct vhost_net;
+struct vhost_net *tap_get_vhost_net(VLANClientState *vc);
+
 #endif /* QEMU_NET_TAP_H */
-- 
1.6.6.144.g5c3af



[PATCH 16/20] virtio-net: vhost net support

2010-02-04 Thread Michael S. Tsirkin
This connects virtio-net to vhost net backend.
The code is structured in a way analogous to what we have with vnet
header capability in tap.  We start/stop backend on driver start/stop as
well as on save and vm start (for migration).

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/virtio-net.c |   67 +-
 1 files changed, 65 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 6e48997..f32c6fa 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -17,6 +17,7 @@
 #include "net/tap.h"
 #include "qemu-timer.h"
 #include "virtio-net.h"
+#include "vhost_net.h"
 
 #define VIRTIO_NET_VM_VERSION    11
 
@@ -47,6 +48,8 @@ typedef struct VirtIONet
 uint8_t nomulti;
 uint8_t nouni;
 uint8_t nobcast;
+uint8_t vhost_started;
+VMChangeStateEntry *vmstate;
 struct {
 int in_use;
 int first_multi;
@@ -114,6 +117,10 @@ static void virtio_net_reset(VirtIODevice *vdev)
 n->nomulti = 0;
 n->nouni = 0;
 n->nobcast = 0;
+if (n->vhost_started) {
+vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev);
+n->vhost_started = 0;
+}
 
 /* Flush any MAC and VLAN filter table state */
 n->mac_table.in_use = 0;
@@ -172,7 +179,10 @@ static uint32_t virtio_net_get_features(VirtIODevice *vdev, uint32_t features)
 features &= ~(0x1 << VIRTIO_NET_F_HOST_UFO);
 }
 
-return features;
+if (!tap_get_vhost_net(n->nic->nc.peer)) {
+return features;
+}
+return vhost_net_get_features(tap_get_vhost_net(n->nic->nc.peer), features);
 }
 
 static uint32_t virtio_net_bad_features(VirtIODevice *vdev)
@@ -690,6 +700,12 @@ static void virtio_net_save(QEMUFile *f, void *opaque)
 {
 VirtIONet *n = opaque;
 
+if (n->vhost_started) {
+   /* TODO: should we really stop the backend?
+* If we don't, it might keep writing to memory. */
+vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
+   n->vhost_started = 0;
+}
 virtio_save(&n->vdev, f);
 
 qemu_put_buffer(f, n->mac, ETH_ALEN);
@@ -802,7 +818,6 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
 qemu_mod_timer(n->tx_timer,
qemu_get_clock(vm_clock) + TX_TIMER_INTERVAL);
 }
-
 return 0;
 }
 
@@ -822,6 +837,47 @@ static NetClientInfo net_virtio_info = {
 .link_status_changed = virtio_net_set_link_status,
 };
 
+static void virtio_net_set_status(struct VirtIODevice *vdev)
+{
+VirtIONet *n = to_virtio_net(vdev);
+if (!n->nic->nc.peer) {
+return;
+}
+if (n->nic->nc.peer->info->type != NET_CLIENT_TYPE_TAP) {
+return;
+}
+
+if (!tap_get_vhost_net(n->nic->nc.peer)) {
+return;
+}
+if (!!n->vhost_started == !!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
+return;
+}
+if (vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) {
+int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), vdev);
+if (r < 0) {
+fprintf(stderr, "unable to start vhost net: %d: "
+"falling back on userspace virtio\n", -r);
+} else {
+n->vhost_started = 1;
+}
+} else {
+vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev);
+n->vhost_started = 0;
+}
+}
+
+static void virtio_net_vmstate_change(void *opaque, int running, int reason)
+{
+   VirtIONet *n = opaque;
+   if (!running) {
+   return;
+   }
+   /* This is called when vm is started, it will start vhost backend if
+* appropriate e.g. after migration. */
+   virtio_net_set_status(&n->vdev);
+}
+
 VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf)
 {
 VirtIONet *n;
@@ -837,6 +893,7 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf)
 n->vdev.set_features = virtio_net_set_features;
 n->vdev.bad_features = virtio_net_bad_features;
 n->vdev.reset = virtio_net_reset;
+n->vdev.set_status = virtio_net_set_status;
 n->rx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_rx);
 n->tx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_tx);
 n->ctrl_vq = virtio_add_queue(&n->vdev, 64, virtio_net_handle_ctrl);
@@ -859,6 +916,7 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf)
 
 register_savevm("virtio-net", virtio_net_id++, VIRTIO_NET_VM_VERSION,
 virtio_net_save, virtio_net_load, n);
+n->vmstate = qemu_add_vm_change_state_handler(virtio_net_vmstate_change, n);
 
 return &n->vdev;
 }
@@ -866,6 +924,11 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf)
 void virtio_net_exit(VirtIODevice *vdev)
 {
 VirtIONet *n = DO_UPCAST(VirtIONet, vdev, vdev);
+qemu_del_vm_change_state_handler(n->vmstate);
+
+if (n->vhost_started) {
+vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev);
+}
 
 qemu_purge_queued_packets(&n->nic->nc);
 
-- 
1.6.6.144.g5c3af
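
The start/stop logic in virtio_net_set_status above is idempotent: it compares the current backend state with the DRIVER_OK bit and acts only when they differ. A minimal sketch of that pattern follows, with a hypothetical struct backend that merely counts transitions in place of the real vhost_net_start/vhost_net_stop calls. The value 0x4 for DRIVER_OK matches VIRTIO_CONFIG_S_DRIVER_OK in the virtio headers.

```c
#include <assert.h>
#include <stdint.h>

#define DRIVER_OK 0x4   /* same value as VIRTIO_CONFIG_S_DRIVER_OK */

/* Hypothetical backend that only records what happened to it. */
struct backend {
    int started;     /* mirrors vhost_started */
    int starts;      /* number of successful starts */
    int stops;       /* number of stops */
    int fail_start;  /* non-zero: simulate a failing start */
};

/* Bring the backend in line with the driver status. Calling this twice
 * with the same status is harmless. */
static void sync_backend(struct backend *b, uint8_t status)
{
    int want = !!(status & DRIVER_OK);

    if (!!b->started == want) {
        return;                 /* already in the desired state */
    }
    if (want) {
        if (b->fail_start) {
            return;             /* stay stopped: userspace path keeps running */
        }
        b->starts++;
        b->started = 1;
    } else {
        b->stops++;
        b->started = 0;
    }
}
```

Because the function is safe to call repeatedly, the same entry point can serve both the status callback and the vm-state change handler.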


[PATCH 17/20] qemu-kvm: add vhost.h header

2010-02-04 Thread Michael S. Tsirkin
This makes it possible to build vhost support
on systems which do not have this header.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 kvm/include/linux/vhost.h |  130 +
 1 files changed, 130 insertions(+), 0 deletions(-)
 create mode 100644 kvm/include/linux/vhost.h

diff --git a/kvm/include/linux/vhost.h b/kvm/include/linux/vhost.h
new file mode 100644
index 000..165a484
--- /dev/null
+++ b/kvm/include/linux/vhost.h
@@ -0,0 +1,130 @@
+#ifndef _LINUX_VHOST_H
+#define _LINUX_VHOST_H
+/* Userspace interface for in-kernel virtio accelerators. */
+
+/* vhost is used to reduce the number of system calls involved in virtio.
+ *
+ * Existing virtio net code is used in the guest without modification.
+ *
+ * This header includes interface used by userspace hypervisor for
+ * device configuration.
+ */
+
+#include <linux/types.h>
+
+#include <linux/ioctl.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_ring.h>
+
+struct vhost_vring_state {
+   unsigned int index;
+   unsigned int num;
+};
+
+struct vhost_vring_file {
+   unsigned int index;
+   int fd; /* Pass -1 to unbind from file. */
+
+};
+
+struct vhost_vring_addr {
+   unsigned int index;
+   /* Option flags. */
+   unsigned int flags;
+   /* Flag values: */
+   /* Whether log address is valid. If set enables logging. */
+#define VHOST_VRING_F_LOG 0
+
+   /* Start of array of descriptors (virtually contiguous) */
+   __u64 desc_user_addr;
+   /* Used structure address. Must be 32 bit aligned */
+   __u64 used_user_addr;
+   /* Available structure address. Must be 16 bit aligned */
+   __u64 avail_user_addr;
+   /* Logging support. */
+   /* Log writes to used structure, at offset calculated from specified
+* address. Address must be 32 bit aligned. */
+   __u64 log_guest_addr;
+};
+
+struct vhost_memory_region {
+   __u64 guest_phys_addr;
+   __u64 memory_size; /* bytes */
+   __u64 userspace_addr;
+   __u64 flags_padding; /* No flags are currently specified. */
+};
+
+/* All region addresses and sizes must be 4K aligned. */
+#define VHOST_PAGE_SIZE 0x1000
+
+struct vhost_memory {
+   __u32 nregions;
+   __u32 padding;
+   struct vhost_memory_region regions[0];
+};
+
+/* ioctls */
+
+#define VHOST_VIRTIO 0xAF
+
+/* Features bitmask for forward compatibility.  Transport bits are used for
+ * vhost specific features. */
+#define VHOST_GET_FEATURES _IOR(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_FEATURES _IOW(VHOST_VIRTIO, 0x00, __u64)
+
+/* Set current process as the (exclusive) owner of this file descriptor.  This
+ * must be called before any other vhost command.  Further calls to
+ * VHOST_OWNER_SET fail until VHOST_OWNER_RESET is called. */
+#define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
+/* Give up ownership, and reset the device to default values.
+ * Allows subsequent call to VHOST_OWNER_SET to succeed. */
+#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
+
+/* Set up/modify memory layout */
+#define VHOST_SET_MEM_TABLE _IOW(VHOST_VIRTIO, 0x03, struct vhost_memory)
+
+/* Write logging setup. */
+/* Memory writes can optionally be logged by setting bit at an offset
+ * (calculated from the physical address) from specified log base.
+ * The bit is set using an atomic 32 bit operation. */
+/* Set base address for logging. */
+#define VHOST_SET_LOG_BASE _IOW(VHOST_VIRTIO, 0x04, __u64)
+/* Specify an eventfd file descriptor to signal on log write. */
+#define VHOST_SET_LOG_FD _IOW(VHOST_VIRTIO, 0x07, int)
+
+/* Ring setup. */
+/* Set number of descriptors in ring. This parameter can not
+ * be modified while ring is running (bound to a device). */
+#define VHOST_SET_VRING_NUM _IOW(VHOST_VIRTIO, 0x10, struct vhost_vring_state)
+/* Set addresses for the ring. */
+#define VHOST_SET_VRING_ADDR _IOW(VHOST_VIRTIO, 0x11, struct vhost_vring_addr)
+/* Base value where queue looks for available descriptors */
+#define VHOST_SET_VRING_BASE _IOW(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+/* Get accessor: reads index, writes value in num */
+#define VHOST_GET_VRING_BASE _IOWR(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+
+/* The following ioctls use eventfd file descriptors to signal and poll
+ * for events. */
+
+/* Set eventfd to poll for added buffers */
+#define VHOST_SET_VRING_KICK _IOW(VHOST_VIRTIO, 0x20, struct vhost_vring_file)
+/* Set eventfd to signal when buffers have been used */
+#define VHOST_SET_VRING_CALL _IOW(VHOST_VIRTIO, 0x21, struct vhost_vring_file)
+/* Set eventfd to signal an error */
+#define VHOST_SET_VRING_ERR _IOW(VHOST_VIRTIO, 0x22, struct vhost_vring_file)
+
+/* VHOST_NET specific defines */
+
+/* Attach virtio net ring to a raw socket, or tap device.
+ * The socket must be already bound to an ethernet device, this device will be
+ * used for transmit.  Pass fd -1 to unbind from the socket and the transmit
+ * device.  This can be used to stop the ring 
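
struct vhost_memory in the header above ends in a zero-length regions array, so a table for n regions has to be allocated as one block of header plus n entries. The sketch below shows that size calculation with hypothetical mirror types (mem_region, mem_table) that use uint64_t/uint32_t in place of __u64/__u32 and a C99 flexible array member in place of regions[0]; it does not include or depend on the real header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical mirrors of struct vhost_memory_region / struct vhost_memory. */
struct mem_region {
    uint64_t guest_phys_addr;
    uint64_t memory_size;     /* bytes */
    uint64_t userspace_addr;
    uint64_t flags_padding;
};

struct mem_table {
    uint32_t nregions;
    uint32_t padding;
    struct mem_region regions[];   /* C99 spelling of regions[0] */
};

/* Bytes needed for a table holding n regions. */
static size_t mem_table_size(uint32_t n)
{
    return offsetof(struct mem_table, regions) +
           (size_t)n * sizeof(struct mem_region);
}

/* Allocate a zeroed table for n regions; returns NULL on failure. */
static struct mem_table *mem_table_alloc(uint32_t n)
{
    struct mem_table *t = calloc(1, mem_table_size(n));

    if (t) {
        t->nregions = n;
    }
    return t;
}
```

The caller fills in regions[0..n) and releases the table with free().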

[PATCH 18/20] kvm: irqfd support

2010-02-04 Thread Michael S. Tsirkin
Add API to assign/deassign irqfd to kvm.
Add stub so that users do not have to use
ifdefs.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 kvm-all.c |   19 +++
 kvm.h |   10 ++
 2 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 6dbe480..3c692f6 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1154,4 +1154,23 @@ int kvm_set_ioeventfd(uint16_t addr, uint16_t data, int fd, bool assigned)
 }
 #endif
 
+#if defined(KVM_IRQFD)
+int kvm_set_irqfd(int gsi, int fd, bool assigned)
+{
+struct kvm_irqfd irqfd = {
+.fd = fd,
+.gsi = gsi,
+.flags = assigned ? 0 : KVM_IRQFD_FLAG_DEASSIGN,
+};
+int r;
+if (!kvm_irqchip_in_kernel())
+return -ENOSYS;
+
+r = kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd);
+if (r < 0)
+return r;
+return 0;
+}
+#endif
+
 #include "qemu-kvm.c"
diff --git a/kvm.h b/kvm.h
index 365d8b1..e762a5b 100644
--- a/kvm.h
+++ b/kvm.h
@@ -159,4 +159,14 @@ int kvm_set_ioeventfd(uint16_t data, uint16_t addr, int fd, bool assigned)
 }
 #endif
 
+#if defined(KVM_IRQFD) && defined(CONFIG_KVM)
+int kvm_set_irqfd(int gsi, int fd, bool assigned);
+#else
+static inline
+int kvm_set_irqfd(int gsi, int fd, bool assigned)
+{
+return -ENOSYS;
+}
+#endif
+
 #endif
-- 
1.6.6.144.g5c3af
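
The stub in kvm.h above lets callers use kvm_set_irqfd without ifdefs: when the feature is compiled out, an inline function with the same signature returns -ENOSYS. A minimal sketch of that pattern follows. HAVE_FAKE_IRQFD, fake_set_irqfd and attach_irq are hypothetical names chosen for illustration, and the feature macro is assumed to be undefined here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* HAVE_FAKE_IRQFD is a hypothetical feature macro, left undefined here. */
#ifdef HAVE_FAKE_IRQFD
int fake_set_irqfd(int gsi, int fd, bool assigned);
#else
/* Stub with the same signature: always reports "not supported". */
static inline int fake_set_irqfd(int gsi, int fd, bool assigned)
{
    (void)gsi;
    (void)fd;
    (void)assigned;
    return -ENOSYS;
}
#endif

/* Caller written once, with no #ifdef: -ENOSYS selects the fallback. */
static int attach_irq(int gsi, int fd, bool *in_kernel)
{
    int r = fake_set_irqfd(gsi, fd, true);

    if (r == -ENOSYS) {
        *in_kernel = false;     /* fall back to userspace delivery */
        return 0;
    }
    if (r < 0) {
        return r;               /* real error */
    }
    *in_kernel = true;
    return 0;
}
```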



[PATCH 19/20] msix: add mask/unmask notifiers

2010-02-04 Thread Michael S. Tsirkin
Support per-vector callbacks for msix mask/unmask.
Will be used for vhost net.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/msix.c |   36 +++-
 hw/msix.h |1 +
 hw/pci.h  |6 ++
 3 files changed, 42 insertions(+), 1 deletions(-)

diff --git a/hw/msix.c b/hw/msix.c
index 87f125b..31a61c6 100644
--- a/hw/msix.c
+++ b/hw/msix.c
@@ -318,6 +318,13 @@ static void msix_mmio_writel(void *opaque, target_phys_addr_t addr,
 if (kvm_enabled() && kvm_irqchip_in_kernel()) {
 kvm_msix_update(dev, vector, was_masked, msix_is_masked(dev, vector));
 }
+if (was_masked != msix_is_masked(dev, vector) &&
+dev->msix_mask_notifier && dev->msix_mask_notifier_opaque[vector]) {
+int r = dev->msix_mask_notifier(dev, vector,
+   dev->msix_mask_notifier_opaque[vector],
+   msix_is_masked(dev, vector));
+assert(r >= 0);
+}
 msix_handle_mask_update(dev, vector);
 }
 
@@ -356,10 +363,18 @@ void msix_mmio_map(PCIDevice *d, int region_num,
 
 static void msix_mask_all(struct PCIDevice *dev, unsigned nentries)
 {
-int vector;
+int vector, r;
 for (vector = 0; vector < nentries; ++vector) {
 unsigned offset = vector * MSIX_ENTRY_SIZE + MSIX_VECTOR_CTRL;
+int was_masked = msix_is_masked(dev, vector);
 dev->msix_table_page[offset] |= MSIX_VECTOR_MASK;
+if (was_masked != msix_is_masked(dev, vector) &&
+dev->msix_mask_notifier && dev->msix_mask_notifier_opaque[vector]) {
+r = dev->msix_mask_notifier(dev, vector,
+dev->msix_mask_notifier_opaque[vector],
+msix_is_masked(dev, vector));
+assert(r >= 0);
+}
 }
 }
 
@@ -382,6 +397,9 @@ int msix_init(struct PCIDevice *dev, unsigned short nentries,
 sizeof *dev->msix_irq_entries);
 }
 #endif
+dev->msix_mask_notifier_opaque =
+qemu_mallocz(nentries * sizeof *dev->msix_mask_notifier_opaque);
+dev->msix_mask_notifier = NULL;
 dev->msix_entry_used = qemu_mallocz(MSIX_MAX_ENTRIES *
 sizeof *dev->msix_entry_used);
 
@@ -444,6 +462,8 @@ int msix_uninit(PCIDevice *dev)
 dev->msix_entry_used = NULL;
 qemu_free(dev->msix_irq_entries);
 dev->msix_irq_entries = NULL;
+qemu_free(dev->msix_mask_notifier_opaque);
+dev->msix_mask_notifier_opaque = NULL;
 dev->cap_present &= ~QEMU_PCI_CAP_MSIX;
 return 0;
 }
@@ -587,3 +607,17 @@ void msix_unuse_all_vectors(PCIDevice *dev)
 return;
 msix_free_irq_entries(dev);
 }
+
+int msix_set_mask_notifier(PCIDevice *dev, unsigned vector, void *opaque)
+{
+int r = 0;
+if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
+return 0;
+
+if (dev->msix_mask_notifier)
+r = dev->msix_mask_notifier(dev, vector, opaque,
+msix_is_masked(dev, vector));
+if (r >= 0)
+dev->msix_mask_notifier_opaque[vector] = opaque;
+return r;
+}
diff --git a/hw/msix.h b/hw/msix.h
index a9f7993..f167231 100644
--- a/hw/msix.h
+++ b/hw/msix.h
@@ -33,4 +33,5 @@ void msix_reset(PCIDevice *dev);
 
 extern int msix_supported;
 
+int msix_set_mask_notifier(PCIDevice *dev, unsigned vector, void *opaque);
 #endif
diff --git a/hw/pci.h b/hw/pci.h
index c9e9d56..a4a0fe9 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -137,6 +137,9 @@ enum {
 #define PCI_CAPABILITY_CONFIG_MSI_LENGTH 0x10
 #define PCI_CAPABILITY_CONFIG_MSIX_LENGTH 0x10
 
+typedef int (*msix_mask_notifier_func)(PCIDevice *, unsigned vector,
+  void *opaque, int masked);
+
 struct PCIDevice {
 DeviceState qdev;
 /* PCI config space */
@@ -202,6 +205,9 @@ struct PCIDevice {
 
 struct kvm_irq_routing_entry *msix_irq_entries;
 
+void **msix_mask_notifier_opaque;
+msix_mask_notifier_func msix_mask_notifier;
+
 /* Device capability configuration space */
 struct {
 int supported;
-- 
1.6.6.144.g5c3af
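
The notifier scheme in this patch can be reduced to a small model: one callback per device, one opaque pointer per vector, and a call only when a vector's mask state actually changes and an opaque has been registered for it. The sketch below uses hypothetical types (vec_dev, mask_log) and a fixed vector count; it is an illustration of the pattern, not the QEMU msix code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NVEC 4   /* hypothetical fixed number of vectors */

struct vec_dev;
typedef int (*mask_notifier_fn)(struct vec_dev *dev, unsigned vector,
                                void *opaque, bool masked);

struct vec_dev {
    bool masked[NVEC];
    void *opaque[NVEC];          /* per-vector cookie, NULL = not registered */
    mask_notifier_fn notifier;   /* one callback per device */
};

/* Update a vector's mask bit; notify only on a real state change and
 * only for vectors that have an opaque registered. */
static int vec_set_masked(struct vec_dev *d, unsigned v, bool masked)
{
    bool was;

    assert(v < NVEC);
    was = d->masked[v];
    d->masked[v] = masked;
    if (was != masked && d->notifier && d->opaque[v]) {
        return d->notifier(d, v, d->opaque[v], masked);
    }
    return 0;
}

/* Example notifier: records how often it ran and the last state seen. */
struct mask_log {
    int calls;
    bool last_masked;
};

static int log_notifier(struct vec_dev *d, unsigned v, void *opaque, bool masked)
{
    struct mask_log *log = opaque;

    (void)d;
    (void)v;
    log->calls++;
    log->last_masked = masked;
    return 0;
}
```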



[PATCH 20/20] virtio-pci: irqfd support

2010-02-04 Thread Michael S. Tsirkin
Use irqfd when supported by kernel.
This uses msix mask notifiers: when vector is masked, we poll it from
userspace.  When it is unmasked, we poll it from kernel.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/virtio-pci.c |   31 +--
 1 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 02859a7..f3ed939 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -406,6 +406,27 @@ static void virtio_pci_guest_notifier_read(void *opaque)
 }
 }
 
+static int virtio_pci_mask_notifier(PCIDevice *dev, unsigned vector,
+void *opaque, int masked)
+{
+VirtQueue *vq = opaque;
+EventNotifier *notifier = virtio_queue_guest_notifier(vq);
+int r = kvm_set_irqfd(dev->msix_irq_entries[vector].gsi,
+  event_notifier_get_fd(notifier),
+  !masked);
+if (r < 0) {
+return (r == -ENOSYS) ? 0 : r;
+}
+if (masked) {
+qemu_set_fd_handler(event_notifier_get_fd(notifier),
+virtio_pci_guest_notifier_read, NULL, vq);
+} else {
+qemu_set_fd_handler(event_notifier_get_fd(notifier),
+NULL, NULL, vq);
+}
+return 0;
+}
+
 static int virtio_pci_guest_notifier(void *opaque, int n, bool assign)
 {
 VirtIOPCIProxy *proxy = opaque;
@@ -414,11 +435,15 @@ static int virtio_pci_guest_notifier(void *opaque, int n, bool assign)
 
 if (assign) {
 int r = event_notifier_init(notifier, 0);
-   if (r < 0)
-   return r;
+if (r < 0)
+return r;
 qemu_set_fd_handler(event_notifier_get_fd(notifier),
 virtio_pci_guest_notifier_read, NULL, vq);
+msix_set_mask_notifier(&proxy->pci_dev,
+  virtio_queue_vector(proxy->vdev, n), vq);
 } else {
+msix_set_mask_notifier(&proxy->pci_dev,
+  virtio_queue_vector(proxy->vdev, n), NULL);
 qemu_set_fd_handler(event_notifier_get_fd(notifier),
 NULL, NULL, vq);
 event_notifier_cleanup(notifier);
@@ -503,6 +528,8 @@ static void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev,
 
 proxy->pci_dev.config_write = virtio_write_config;
 
+proxy->pci_dev.msix_mask_notifier = virtio_pci_mask_notifier;
+
 size = VIRTIO_PCI_REGION_SIZE(&proxy->pci_dev) + vdev->config_len;
 if (size & (size-1))
 size = 1 << qemu_fls(size);
-- 
1.6.6.144.g5c3af
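
The rule stated in the commit message (masked: poll from userspace, unmasked: poll from kernel) can be sketched as a hand-over of ownership of the event descriptor. In the sketch below, queue_irq, on_mask_change and the two set_irqfd stand-ins are hypothetical; the function pointer takes the place of kvm_set_irqfd so that the -ENOSYS fallback can be exercised without KVM.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum poller { POLL_USERSPACE, POLL_KERNEL };

/* Hypothetical per-queue state. */
struct queue_irq {
    enum poller poller;
    int (*set_irqfd)(int gsi, int fd, bool assigned);  /* kvm_set_irqfd stand-in */
    int gsi;
    int fd;
};

/* Called when the vector's mask bit changes. */
static int on_mask_change(struct queue_irq *q, bool masked)
{
    int r = q->set_irqfd(q->gsi, q->fd, !masked);

    if (r < 0) {
        /* No irqfd support: keep polling in userspace, not an error. */
        return (r == -ENOSYS) ? 0 : r;
    }
    q->poller = masked ? POLL_USERSPACE : POLL_KERNEL;
    return 0;
}

/* Stand-in that behaves like a kernel with irqfd support. */
static int irqfd_ok(int gsi, int fd, bool assigned)
{
    (void)gsi;
    (void)fd;
    (void)assigned;
    return 0;
}

/* Stand-in that behaves like a build without irqfd support. */
static int irqfd_missing(int gsi, int fd, bool assigned)
{
    (void)gsi;
    (void)fd;
    (void)assigned;
    return -ENOSYS;
}
```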


Re: [PATCH] emulate accessed bit for EPT

2010-02-04 Thread Rik van Riel

Balbir Singh wrote:

* Rik van Riel r...@redhat.com [2010-02-04 08:40:43]:


On 02/03/2010 11:12 PM, Balbir Singh wrote:

* Rik van Riel r...@redhat.com [2010-02-03 16:11:03]:


Currently KVM pretends that pages with EPT mappings never got
accessed.  This has some side effects in the VM, like swapping
out actively used guest pages and needlessly breaking up actively
used hugepages.

We can avoid those very costly side effects by emulating the
accessed bit for EPT PTEs, which should only be slightly costly
because pages pass through page_referenced infrequently.

Quite a clever implementation, one side effect is that one would see a
larger number of minor faults with EPT enabled and an increase in
allocation/frees of rmap entries, but that can be easily explained.

I suspect it won't be very many. I have been monitoring
/proc/meminfo on my system while testing this patch, and
it is quite typical that the size of the inactive anon
list does not change for minutes at a time.

In other words, no pages are moved onto or off of the
inactive anon list for several minutes. That corresponds
to a very small number of minor faults introduced by my
patch.

Of course, when the system is swapping, we will have more
minor faults.  However, minor faults should be less of a
performance issue than major faults :)



I do agree with you. 


After 20 hours of uptime, it appears that this patch has
resolved the "KVM guests get swapped while buffer and page
cache stay in memory" problem my home system was experiencing.


[PATCH 08/18] KVM: PPC: Preload FPU when possible

2010-02-04 Thread Alexander Graf
There are some situations when we're pretty sure the guest will use the
FPU soon. So we can save the churn of going into the guest, finding out
it does want to use the FPU and going out again.

This patch adds preloading of the FPU when it's reasonable.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s.c |8 
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 6bdf7f2..07f8b42 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -137,6 +137,10 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u64 msr)
kvmppc_mmu_flush_segments(vcpu);
kvmppc_mmu_map_segment(vcpu, vcpu->arch.pc);
}
+
+   /* Preload FPU if it's enabled */
+   if (vcpu->arch.msr & MSR_FP)
+   kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
 }
 
 void kvmppc_inject_interrupt(struct kvm_vcpu *vcpu, int vec, u64 flags)
@@ -1194,6 +1198,10 @@ int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
/* XXX we get called with irq disabled - change that! */
local_irq_enable();
 
+   /* Preload FPU if it's enabled */
+   if (vcpu->arch.msr & MSR_FP)
+   kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
+
ret = __kvmppc_vcpu_entry(kvm_run, vcpu);
 
local_irq_disable();
-- 
1.6.0.2



[PATCH 05/18] KVM: PPC: Add hidden flag for paired singles

2010-02-04 Thread Alexander Graf
The Gekko implements an extension called paired singles. When the guest wants
to use that extension, we need to make sure we're not running the host FPU,
because all FPU instructions need to get emulated to accommodate the additional
operations that occur.

This patch adds an hflag to track if we're in paired single mode or not.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_asm.h |1 +
 arch/powerpc/kvm/book3s.c  |4 
 2 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_asm.h b/arch/powerpc/include/asm/kvm_asm.h
index aadf2dd..7238c04 100644
--- a/arch/powerpc/include/asm/kvm_asm.h
+++ b/arch/powerpc/include/asm/kvm_asm.h
@@ -88,6 +88,7 @@
 
 #define BOOK3S_HFLAG_DCBZ32 0x1
 #define BOOK3S_HFLAG_SLB   0x2
+#define BOOK3S_HFLAG_PAIRED_SINGLE 0x4
 
 #define RESUME_FLAG_NV  (1<<0)  /* Reload guest nonvolatile state? */
 #define RESUME_FLAG_HOST (1<<1)  /* Resume host? */
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 1e5e0fc..96f7be4 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -638,6 +638,10 @@ static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
u64 *thread_fpr = (u64*)t->fpr;
int i;
 
+   /* When we have paired singles, we emulate in software */
+   if (vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE)
+   return RESUME_GUEST;
+
if (!(vcpu->arch.msr & msr)) {
kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
return RESUME_GUEST;
-- 
1.6.0.2



[PATCH 01/18] KVM: PPC: Add QPR registers

2010-02-04 Thread Alexander Graf
The Gekko has GPRs, SPRs and FPRs like normal PowerPC cores, but
it also has QPRs, which are basically single-precision-only FPU registers
that get used when in paired single mode.

The following patches depend on them being around, so let's add the
definitions early.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 715aa6b..2ed954e 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -186,6 +186,11 @@ struct kvm_vcpu_arch {
u64 vsr[32];
 #endif
 
+#ifdef CONFIG_PPC_BOOK3S
+   /* For Gekko paired singles */
+   u32 qpr[32];
+#endif
+
ulong pc;
ulong ctr;
ulong lr;
-- 
1.6.0.2



[PATCH 09/18] KVM: PPC: Fix typo in book3s_32 debug code

2010-02-04 Thread Alexander Graf
There's a typo in the debug ifdef of the book3s_32 mmu emulation. While trying
to debug something I stumbled across that and wanted to save anyone after me
(or myself later) from having to debug that again.

So let's fix the ifdef.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_32_mmu.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
index faf99f2..1483a9b 100644
--- a/arch/powerpc/kvm/book3s_32_mmu.c
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -37,7 +37,7 @@
 #define dprintk(X...) do { } while(0)
 #endif
 
-#ifdef DEBUG_PTE
+#ifdef DEBUG_MMU_PTE
 #define dprintk_pte(X...) printk(KERN_INFO X)
 #else
 #define dprintk_pte(X...) do { } while(0)
-- 
1.6.0.2



[PATCH 12/18] KVM: PPC: Make ext giveup non-static

2010-02-04 Thread Alexander Graf
We need to call the ext giveup handlers from code outside of book3s.c.
So let's make it non-static.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |1 +
 arch/powerpc/kvm/book3s.c |3 +--
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 8463976..fd43210 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -120,6 +120,7 @@ extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, b
 extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int vec);
 extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat,
    bool upper, u32 val);
+extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
 
 extern u32 kvmppc_trampoline_lowmem;
 extern u32 kvmppc_trampoline_enter;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index e8dccc6..99e9e07 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -35,7 +35,6 @@
 /* #define EXIT_DEBUG_SIMPLE */
 /* #define DEBUG_EXT */
 
-static void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
 static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
 ulong msr);
 
@@ -597,7 +596,7 @@ static inline int get_fpr_index(int i)
 }
 
 /* Give up external provider (FPU, Altivec, VSX) */
-static void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr)
+void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr)
 {
struct thread_struct *t = &current->thread;
u64 *vcpu_fpr = vcpu->arch.fpr;
-- 
1.6.0.2



[PATCH 04/18] KVM: PPC: Add AGAIN type for emulation return

2010-02-04 Thread Alexander Graf
Emulation of an instruction can have different outcomes. It can succeed,
fail, require MMIO, do funky BookE stuff - or it can just realize something's
odd and will be fixed the next time around.

Exactly that is what EMULATE_AGAIN means. Using that flag we can now tell
the caller that nothing happened, but we still want to go back to the
guest and see what happens next time we come around.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_ppc.h |1 +
 arch/powerpc/kvm/book3s.c  |3 +++
 arch/powerpc/kvm/emulate.c |4 +++-
 3 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index a288dd2..0761218 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -37,6 +37,7 @@ enum emulation_result {
EMULATE_DO_MMIO,  /* kvm_run filled with MMIO request */
EMULATE_DO_DCR,   /* kvm_run filled with DCR request */
EMULATE_FAIL, /* can't emulate this instruction */
+   EMULATE_AGAIN,/* something went wrong. go again */
 };
 
 extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 9a271f0..1e5e0fc 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -788,6 +788,9 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu 
*vcpu,
case EMULATE_DONE:
r = RESUME_GUEST_NV;
break;
+   case EMULATE_AGAIN:
+   r = RESUME_GUEST;
+   break;
case EMULATE_FAIL:
printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
   __func__, vcpu->arch.pc, vcpu->arch.last_inst);
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index ef2ff59..c3bab7f 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -486,7 +486,9 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct 
kvm_vcpu *vcpu)
 
if (emulated == EMULATE_FAIL) {
emulated = kvmppc_core_emulate_op(run, vcpu, inst, &advance);
-   if (emulated == EMULATE_FAIL) {
+   if (emulated == EMULATE_AGAIN) {
+   advance = 0;
+   } else if (emulated == EMULATE_FAIL) {
advance = 0;
printk(KERN_ERR "Couldn't emulate instruction 0x%08x "
   "(op %d xop %d)\n", inst, get_op(inst), get_xop(inst));
-- 
1.6.0.2
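
To make the control flow a bit more tangible, here is a small standalone sketch of how the two call sites in the patch above treat EMULATE_AGAIN. The emulation_result values follow the patch; the resume codes and the helper names resume_for() and should_advance_pc() are simplified stand-ins made up for illustration, not the kernel's definitions.

```c
#include <stdbool.h>

/* Mirrors enum emulation_result as it looks after this patch. */
enum emulation_result {
	EMULATE_DONE,
	EMULATE_DO_MMIO,
	EMULATE_DO_DCR,
	EMULATE_FAIL,
	EMULATE_AGAIN,
};

/* Stand-in resume codes; the kernel's RESUME_* values are different. */
enum resume_action {
	RESUME_GUEST,		/* re-enter the guest, state may have changed */
	RESUME_GUEST_NV,	/* re-enter the guest, nothing to reload */
	RESUME_OTHER,		/* placeholder: handled elsewhere in real code */
};

/* Only the two cases visible in the book3s.c hunk are modelled here. */
static enum resume_action resume_for(enum emulation_result r)
{
	switch (r) {
	case EMULATE_DONE:
		return RESUME_GUEST_NV;
	case EMULATE_AGAIN:
		return RESUME_GUEST;
	default:
		return RESUME_OTHER;
	}
}

/* AGAIN and FAIL both leave the guest PC on the current instruction. */
static bool should_advance_pc(enum emulation_result r)
{
	return r != EMULATE_AGAIN && r != EMULATE_FAIL;
}
```

The point is that EMULATE_AGAIN is quiet: no error is printed and the PC stays put, so the guest simply runs into the same instruction again.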



[PATCH 18/18] KVM: PPC: Implement Paired Single emulation

2010-02-04 Thread Alexander Graf
The one big thing about the Gekko is paired singles.

Paired singles are an extension to the instruction set. They add 32 single
precision floating point registers (QPRs), some SPRs to modify the behavior
of paired single operations, and instructions to deal with QPRs.

Unfortunately, the extension also changes the semantics of existing operations
that affect single values in FPRs. In most cases they get mirrored to the
corresponding QPR.

Because of that, we need to emulate all FPU operations and all the new paired
single operations too.

In order to achieve that, we take the guest's instruction, rip out the
parameters, put in our own and execute the very same instruction, but also
fix up the QPR values along the way.

That way we can execute paired single FPU operations without implementing a
soft fpu.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h|1 +
 arch/powerpc/kvm/Makefile|1 +
 arch/powerpc/kvm/book3s_64_emulate.c |3 +
 arch/powerpc/kvm/book3s_paired_singles.c | 1356 ++
 4 files changed, 1361 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_paired_singles.c

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index f74d1db..e32a749 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -121,6 +121,7 @@ extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu 
*vcpu, unsigned int vec)
 extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat,
   bool upper, u32 val);
 extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
+extern int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu 
*vcpu);
 
 extern u32 kvmppc_trampoline_lowmem;
 extern u32 kvmppc_trampoline_enter;
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index e575cfd..eba721e 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -41,6 +41,7 @@ kvm-objs-$(CONFIG_KVM_E500) := $(kvm-e500-objs)
 kvm-book3s_64-objs := \
$(common-objs-y) \
fpu.o \
+   book3s_paired_singles.o \
book3s.o \
book3s_64_emulate.o \
book3s_64_interrupts.o \
diff --git a/arch/powerpc/kvm/book3s_64_emulate.c 
b/arch/powerpc/kvm/book3s_64_emulate.c
index 1d1b952..c989214 100644
--- a/arch/powerpc/kvm/book3s_64_emulate.c
+++ b/arch/powerpc/kvm/book3s_64_emulate.c
@@ -200,6 +200,9 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
emulated = EMULATE_FAIL;
}
 
+   if (emulated == EMULATE_FAIL)
+   emulated = kvmppc_emulate_paired_single(run, vcpu);
+
return emulated;
 }
 
diff --git a/arch/powerpc/kvm/book3s_paired_singles.c 
b/arch/powerpc/kvm/book3s_paired_singles.c
new file mode 100644
index 0000000..cb258a3
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_paired_singles.c
@@ -0,0 +1,1356 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright Novell Inc 2010
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include <asm/kvm.h>
+#include <asm/kvm_ppc.h>
+#include <asm/disassemble.h>
+#include <asm/kvm_book3s.h>
+#include <asm/kvm_fpu.h>
+#include <asm/reg.h>
+#include <asm/cacheflush.h>
+#include <linux/vmalloc.h>
+
+/* #define DEBUG */
+
+#ifdef DEBUG
+#define dprintk printk
+#else
+#define dprintk(...) do { } while(0);
+#endif
+
+#define OP_LFS         48
+#define OP_LFSU        49
+#define OP_LFD         50
+#define OP_LFDU        51
+#define OP_STFS        52
+#define OP_STFSU       53
+#define OP_STFD        54
+#define OP_STFDU       55
+#define OP_PSQ_L       56
+#define OP_PSQ_LU      57
+#define OP_PSQ_ST      60
+#define OP_PSQ_STU     61
+
+#define OP_31_LFSX     535
+#define OP_31_LFSUX    567
+#define OP_31_LFDX     599
+#define OP_31_LFDUX    631
+#define OP_31_STFSX    663
+#define OP_31_STFSUX   695
+#define OP_31_STFX     727
+#define OP_31_STFUX    759
+#define OP_31_LWIZX    887
+#define OP_31_STFIWX   983
+
+#define OP_59_FADDS    21
+#define OP_59_FSUBS    20
+#define OP_59_FSQRTS   22

[PATCH 13/18] KVM: PPC: Add helpers to call FPU instructions

2010-02-04 Thread Alexander Graf
To emulate paired single instructions, we need to be able to call FPU
operations from within the kernel. Since we don't want gcc to spill
arbitrary FPU code everywhere, we tell it to use a soft fpu.

Since we know we can really call the FPU in safe areas, let's also add
some calls that we can later use to actually execute real world FPU
operations on the host's FPU.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_fpu.h |   45 +
 arch/powerpc/kernel/ppc_ksyms.c|2 +
 arch/powerpc/kvm/Makefile  |1 +
 arch/powerpc/kvm/fpu.S |   77 
 4 files changed, 125 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_fpu.h
 create mode 100644 arch/powerpc/kvm/fpu.S

diff --git a/arch/powerpc/include/asm/kvm_fpu.h 
b/arch/powerpc/include/asm/kvm_fpu.h
new file mode 100644
index 0000000..2e42eb7
--- /dev/null
+++ b/arch/powerpc/include/asm/kvm_fpu.h
@@ -0,0 +1,45 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright Novell Inc. 2010
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#ifndef __ASM_KVM_FPU_H__
+#define __ASM_KVM_FPU_H__
+
+#include <linux/types.h>
+
+extern void fp_fres(struct thread_struct *t, u32 *dst, u32 *src1);
+extern void fp_frsqrte(struct thread_struct *t, u32 *dst, u32 *src1);
+extern void fp_fsqrts(struct thread_struct *t, u32 *dst, u32 *src1);
+
+extern void fp_fadds(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2);
+extern void fp_fdivs(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2);
+extern void fp_fmuls(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2);
+extern void fp_fsubs(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2);
+
+extern void fp_fmadds(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2,
+ u32 *src3);
+extern void fp_fmsubs(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2,
+ u32 *src3);
+extern void fp_fnmadds(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2,
+  u32 *src3);
+extern void fp_fnmsubs(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2,
+  u32 *src3);
+extern void fp_fsel(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2,
+   u32 *src3);
+
+#endif
diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
index ab3e392..58fdb3a 100644
--- a/arch/powerpc/kernel/ppc_ksyms.c
+++ b/arch/powerpc/kernel/ppc_ksyms.c
@@ -101,6 +101,8 @@ EXPORT_SYMBOL(pci_dram_offset);
 EXPORT_SYMBOL(start_thread);
 EXPORT_SYMBOL(kernel_thread);
 
+EXPORT_SYMBOL_GPL(cvt_df);
+EXPORT_SYMBOL_GPL(cvt_fd);
 EXPORT_SYMBOL(giveup_fpu);
 #ifdef CONFIG_ALTIVEC
 EXPORT_SYMBOL(giveup_altivec);
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 56484d6..e575cfd 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -40,6 +40,7 @@ kvm-objs-$(CONFIG_KVM_E500) := $(kvm-e500-objs)
 
 kvm-book3s_64-objs := \
$(common-objs-y) \
+   fpu.o \
book3s.o \
book3s_64_emulate.o \
book3s_64_interrupts.o \
diff --git a/arch/powerpc/kvm/fpu.S b/arch/powerpc/kvm/fpu.S
new file mode 100644
index 0000000..50575ac
--- /dev/null
+++ b/arch/powerpc/kvm/fpu.S
@@ -0,0 +1,77 @@
+/*
+ *  FPU helper code to use FPU operations from inside the kernel
+ *
+ *Copyright (C) 2010 Alexander Graf (ag...@suse.de)
+ *
+ *  This program is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU General Public License
+ *  as published by the Free Software Foundation; either version
+ *  2 of the License, or (at your option) any later version.
+ *
+ */
+
+#include <asm/reg.h>
+#include <asm/page.h>
+#include <asm/mmu.h>
+#include <asm/pgtable.h>
+#include <asm/cputable.h>
+#include <asm/cache.h>
+#include <asm/thread_info.h>
+#include <asm/ppc_asm.h>
+#include <asm/asm-offsets.h>
+
+#define FPS_ONE_IN(name)   \
+_GLOBAL(fp_ ## name);  \
+   lfd 0,THREAD_FPSCR(r3); /* load up fpscr value */   \
+   MTFSF_L(0); \
+   lfs 0,0(r5);\
+   \
+

[PATCH 16/18] KVM: PPC: Enable program interrupt to do MMIO

2010-02-04 Thread Alexander Graf
When we get a program interrupt we usually don't expect it to perform an
MMIO operation. But why not? When we emulate paired singles, we can end
up loading or storing to an MMIO address - and the handling of those
happens in the program interrupt handler.

So let's teach the program interrupt handler how to deal with EMULATE_DO_MMIO.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 99e9e07..f842d1d 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -840,6 +840,10 @@ program_interrupt:
kvmppc_core_queue_program(vcpu, flags);
r = RESUME_GUEST;
break;
+   case EMULATE_DO_MMIO:
+   run->exit_reason = KVM_EXIT_MMIO;
+   r = RESUME_HOST_NV;
+   break;
default:
BUG();
}
-- 
1.6.0.2



[PATCH 10/18] KVM: PPC: Implement mtsr instruction emulation

2010-02-04 Thread Alexander Graf
The Book3S_32 specification allows two instructions to modify segment
registers: mtsrin and mtsr.

Most normal operating systems use mtsrin, because it lets them select the
segment to change through a register. But since I was trying to run
an embedded guest, it turned out to be using mtsr with hardcoded values.

So let's also emulate mtsr. It's a valid instruction after all.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_64_emulate.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_emulate.c 
b/arch/powerpc/kvm/book3s_64_emulate.c
index bb4a7c1..e4e7ec3 100644
--- a/arch/powerpc/kvm/book3s_64_emulate.c
+++ b/arch/powerpc/kvm/book3s_64_emulate.c
@@ -28,6 +28,7 @@
 #define OP_31_XOP_MFMSR    83
 #define OP_31_XOP_MTMSR    146
 #define OP_31_XOP_MTMSRD   178
+#define OP_31_XOP_MTSR     210
 #define OP_31_XOP_MTSRIN   242
 #define OP_31_XOP_TLBIEL   274
 #define OP_31_XOP_TLBIE    306
@@ -101,6 +102,11 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
}
break;
}
+   case OP_31_XOP_MTSR:
+   vcpu->arch.mmu.mtsrin(vcpu,
+   (inst >> 16) & 0xf,
+   kvmppc_get_gpr(vcpu, get_rs(inst)));
+   break;
case OP_31_XOP_MTSRIN:
vcpu->arch.mmu.mtsrin(vcpu,
(kvmppc_get_gpr(vcpu, get_rb(inst)) >> 28) & 0xf,
-- 
1.6.0.2
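
For readers who want to see the difference between the two instructions spelled out, here is a minimal userspace sketch of the decoding done in the patch above. The shift/mask values for the segment register number come from the patch; the local get_op()/get_xop()/get_rs() helpers are written out here only so the example stands alone and are assumed to match what the kernel's disassemble helpers do.

```c
#include <stdint.h>

#define OP_31_XOP_MTSR		210
#define OP_31_XOP_MTSRIN	242

/* Primary opcode: the top 6 bits of the 32-bit instruction word. */
static unsigned int get_op(uint32_t inst)
{
	return inst >> 26;
}

/* Extended opcode: the 10 bits just above the lowest (Rc) bit. */
static unsigned int get_xop(uint32_t inst)
{
	return (inst >> 1) & 0x3ff;
}

/* RS field: number of the GPR holding the value to write. */
static unsigned int get_rs(uint32_t inst)
{
	return (inst >> 21) & 0x1f;
}

/* mtsr: the segment register number is an immediate in the opcode. */
static unsigned int mtsr_srnum(uint32_t inst)
{
	return (inst >> 16) & 0xf;
}

/* mtsrin: the segment register number is the top nibble of rB's value. */
static unsigned int mtsrin_srnum(uint32_t rb_val)
{
	return (rb_val >> 28) & 0xf;
}
```

So both instructions end up in the same mtsrin() callback; only the place where the segment number comes from differs.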



[PATCH 14/18] KVM: PPC: Fix error in BAT assignment

2010-02-04 Thread Alexander Graf
BATs didn't work. Well, they did, but only up to BAT3. As soon as we
came to BAT4 the offset calculation was screwed up and we ended up
overwriting BAT0-3.

Fortunately, Linux hasn't been using BAT4+. It's still a good
idea to write correct code though.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_64_emulate.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_emulate.c 
b/arch/powerpc/kvm/book3s_64_emulate.c
index a93aa47..1d1b952 100644
--- a/arch/powerpc/kvm/book3s_64_emulate.c
+++ b/arch/powerpc/kvm/book3s_64_emulate.c
@@ -233,13 +233,13 @@ static void kvmppc_write_bat(struct kvm_vcpu *vcpu, int 
sprn, u32 val)
bat = &vcpu_book3s->ibat[(sprn - SPRN_IBAT0U) / 2];
break;
case SPRN_IBAT4U ... SPRN_IBAT7L:
-   bat = &vcpu_book3s->ibat[(sprn - SPRN_IBAT4U) / 2];
+   bat = &vcpu_book3s->ibat[4 + ((sprn - SPRN_IBAT4U) / 2)];
break;
case SPRN_DBAT0U ... SPRN_DBAT3L:
bat = &vcpu_book3s->dbat[(sprn - SPRN_DBAT0U) / 2];
break;
case SPRN_DBAT4U ... SPRN_DBAT7L:
-   bat = &vcpu_book3s->dbat[(sprn - SPRN_DBAT4U) / 2];
+   bat = &vcpu_book3s->dbat[4 + ((sprn - SPRN_DBAT4U) / 2)];
break;
default:
BUG();
-- 
1.6.0.2
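
The off-by-four in the patch above is easy to check in isolation. The sketch below redoes the index calculation for the instruction BATs in plain C. The SPR numbers are how I remember them from asm/reg.h and should be treated as assumptions; ibat_index() and ibat_index_old() are illustrative names, not kernel functions.

```c
/* SPR numbers, assumed to match asm/reg.h. */
#define SPRN_IBAT0U	0x210
#define SPRN_IBAT3L	0x217
#define SPRN_IBAT4U	0x230
#define SPRN_IBAT7L	0x237

/*
 * Every BAT has an upper and a lower SPR, so two consecutive SPR numbers
 * share one array slot. Returns -1 for SPRs outside both IBAT ranges.
 */
static int ibat_index(int sprn)
{
	if (sprn >= SPRN_IBAT0U && sprn <= SPRN_IBAT3L)
		return (sprn - SPRN_IBAT0U) / 2;
	if (sprn >= SPRN_IBAT4U && sprn <= SPRN_IBAT7L)
		return 4 + (sprn - SPRN_IBAT4U) / 2;	/* second bank starts at 4 */
	return -1;
}

/* The calculation before the fix, for the second bank only. */
static int ibat_index_old(int sprn)
{
	return (sprn - SPRN_IBAT4U) / 2;
}
```

With the old formula a write to IBAT4U lands in slot 0, which is exactly the "overwriting BAT0-3" behavior the commit message describes.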



[PATCH 17/18] KVM: PPC: Reserve a chunk of memory for opcodes

2010-02-04 Thread Alexander Graf
With paired singles we have a nifty instruction execution engine. That
engine takes safe and properly cleared FPU opcodes and executes them
directly on the hardware.

Since we can't run off the stack and modifying .bss isn't future-proof
either, the best method seemed to be to vmalloc an executable chunk
of memory.

This chunk will be used by the following patch.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |1 +
 arch/powerpc/include/asm/kvm_ppc.h|4 
 arch/powerpc/kvm/book3s.c |   14 +-
 3 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index fd43210..f74d1db 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -144,5 +144,6 @@ static inline ulong dsisr(void)
 extern void kvm_return_point(void);
 
 #define INS_DCBZ   0x7c0007ec
+#define INS_BLR    0x4e800020
 
 #endif /* __ASM_KVM_BOOK3S_H__ */
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index c7fcdd7..5c85504 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -103,6 +103,10 @@ extern void kvmppc_booke_exit(void);
 
 extern void kvmppc_core_destroy_mmu(struct kvm_vcpu *vcpu);
 
+/* 16*NR_CPUS bytes filled with blr instructions. We use this to enable
+   code to execute arbitrary (checked!) opcodes. */
+extern u32 *kvmppc_call_stack;
+
 /*
  * Cuts out inst bits with ordering according to spec.
  * That means the leftmost bit is zero. All given bits are included.
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index f842d1d..272cb37 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -35,6 +35,8 @@
 /* #define EXIT_DEBUG_SIMPLE */
 /* #define DEBUG_EXT */
 
+u32 *kvmppc_call_stack;
+
 static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
 ulong msr);
 
@@ -1249,7 +1251,17 @@ int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
 
 static int kvmppc_book3s_init(void)
 {
-   return kvm_init(NULL, sizeof(struct kvmppc_vcpu_book3s), THIS_MODULE);
+   int r, i;
+
+   r = kvm_init(NULL, sizeof(struct kvmppc_vcpu_book3s), THIS_MODULE);
+
+   /* Prepare call blob we can use to execute single instructions */
+   kvmppc_call_stack = __vmalloc(NR_CPUS * 2 * sizeof(u32),
+   GFP_KERNEL | __GFP_HIGHMEM, PAGE_KERNEL_EXEC);
+   for (i = 0; i < (NR_CPUS * 2); i++)
+   kvmppc_call_stack[i] = INS_BLR;
+
+   return r;
 }
 
 static void kvmppc_book3s_exit(void)
-- 
1.6.0.2
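
A userspace analogue of the allocation in the patch above might look like the sketch below. It is only meant to show the layout: malloc() stands in for __vmalloc() with PAGE_KERNEL_EXEC, so the buffer here is not executable, and alloc_call_slots()/slot_for_cpu() are hypothetical names. That the first word of each pair later receives the opcode while the second stays blr is my reading of the commit message, not something this patch shows.

```c
#include <stdint.h>
#include <stdlib.h>

#define INS_BLR 0x4e800020u	/* "blr" opcode, value taken from the patch */

/*
 * Allocate two instruction words per CPU and fill them all with blr.
 * Returns NULL if the allocation fails.
 */
static uint32_t *alloc_call_slots(size_t nr_cpus)
{
	size_t n = nr_cpus * 2;
	uint32_t *buf = malloc(n * sizeof(*buf));
	size_t i;

	if (!buf)
		return NULL;

	for (i = 0; i < n; i++)
		buf[i] = INS_BLR;

	return buf;
}

/* Start of the two-word slot belonging to one CPU. */
static uint32_t *slot_for_cpu(uint32_t *buf, size_t cpu)
{
	return &buf[cpu * 2];
}
```

Two things differ from the patch as posted: the sketch checks the allocation result, and it follows the allocation size in the code (two u32 words per CPU) rather than the "16*NR_CPUS bytes" mentioned in the header comment.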



[PATCH 15/18] KVM: PPC: Add helpers to modify ppc fields

2010-02-04 Thread Alexander Graf
The PowerPC specification always numbers bits starting from the MSB, so
bit 0 is the most significant one. That is really confusing when you're
trying to write C code, because it fits in pretty badly with the normal
(1 << xx) schemes.

So I came up with some nice wrappers that get and set fields in a u64
with bit numbers exactly as given in the spec. That makes the code in
KVM easier to compare against the spec.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_ppc.h |   33 +
 1 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 0761218..c7fcdd7 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -103,6 +103,39 @@ extern void kvmppc_booke_exit(void);
 
 extern void kvmppc_core_destroy_mmu(struct kvm_vcpu *vcpu);
 
+/*
+ * Cuts out inst bits with ordering according to spec.
+ * That means the leftmost bit is zero. All given bits are included.
+ */
+static inline u32 kvmppc_get_field(u64 inst, int msb, int lsb)
+{
+   u32 r;
+   u32 mask;
+
+   BUG_ON(msb > lsb);
+
+   mask = (1 << (lsb - msb + 1)) - 1;
+   r = (inst >> (63 - lsb)) & mask;
+
+   return r;
+}
+
+/*
+ * Replaces inst bits with ordering according to spec.
+ */
+static inline u32 kvmppc_set_field(u64 inst, int msb, int lsb, int value)
+{
+   u32 r;
+   u32 mask;
+
+   BUG_ON(msb > lsb);
+
+   mask = ((1 << (lsb - msb + 1)) - 1) << (63 - lsb);
+   r = (inst & ~mask) | ((value << (63 - lsb)) & mask);
+
+   return r;
+}
+
 #ifdef CONFIG_PPC_BOOK3S
 
 /* We assume we're always acting on the current vcpu */
-- 
1.6.0.2
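
To try the bit numbering from the patch above outside the kernel, a standalone variant of the two helpers could look like this. It is a sketch, not the patch's code: get_field() and set_field() are local names, and unlike the patch they keep the mask and the result in 64 bits throughout, which is a deliberate deviation so that fields in the upper word survive.

```c
#include <stdint.h>

/* Mask with the low `width` bits set; width must be 1..64. */
static uint64_t low_mask(int width)
{
	return (width == 64) ? ~0ULL : ((1ULL << width) - 1);
}

/*
 * Extract bits msb..lsb (inclusive) of a 64-bit value, where bit 0 is the
 * most significant bit as in the PowerPC documents. Requires msb <= lsb.
 */
static uint64_t get_field(uint64_t inst, int msb, int lsb)
{
	return (inst >> (63 - lsb)) & low_mask(lsb - msb + 1);
}

/* Replace bits msb..lsb (inclusive) with `value`, same numbering. */
static uint64_t set_field(uint64_t inst, int msb, int lsb, uint64_t value)
{
	uint64_t mask = low_mask(lsb - msb + 1) << (63 - lsb);

	return (inst & ~mask) | ((value << (63 - lsb)) & mask);
}
```

A 32-bit instruction word sits in the low half of the u64, so spec bit n of the instruction is bit 32 + n here. For INS_DCBZ (0x7c0007ec), get_field(inst, 32, 37) gives the primary opcode and get_field(inst, 53, 62) the extended opcode.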



[PATCH 03/18] KVM: PPC: Teach MMIO Signedness

2010-02-04 Thread Alexander Graf
The guest I was trying to get to run uses the LHA and LHAU instructions.
Those instructions basically do a load, but also sign extend the result.

Since we need to fill our registers by hand when doing MMIO, we also need
to sign extend manually.

This patch implements sign extended MMIO and the LHA(U) instructions.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |1 +
 arch/powerpc/include/asm/kvm_ppc.h  |3 +++
 arch/powerpc/kvm/emulate.c  |   14 ++
 arch/powerpc/kvm/powerpc.c  |   32 
 4 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 2ed954e..4dd98fa 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -268,6 +268,7 @@ struct kvm_vcpu_arch {
 
u8 io_gpr; /* GPR used as IO source/target */
u8 mmio_is_bigendian;
+   u8 mmio_sign_extend;
u8 dcr_needed;
u8 dcr_is_write;
 
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index c011170..a288dd2 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -48,6 +48,9 @@ extern void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu);
 extern int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
   unsigned int rt, unsigned int bytes,
   int is_bigendian);
+extern int kvmppc_handle_loads(struct kvm_run *run, struct kvm_vcpu *vcpu,
+   unsigned int rt, unsigned int bytes,
+   int is_bigendian);
 extern int kvmppc_handle_store(struct kvm_run *run, struct kvm_vcpu *vcpu,
u64 val, unsigned int bytes, int is_bigendian);
 
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index b905623..ef2ff59 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -62,6 +62,8 @@
 #define OP_STBU 39
 #define OP_LHZ  40
 #define OP_LHZU 41
+#define OP_LHA  42
+#define OP_LHAU 43
 #define OP_STH  44
 #define OP_STHU 45
 
@@ -450,6 +452,18 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct 
kvm_vcpu *vcpu)
kvmppc_set_gpr(vcpu, ra, vcpu->arch.paddr_accessed);
break;
 
+   case OP_LHA:
+   rt = get_rt(inst);
+   emulated = kvmppc_handle_loads(run, vcpu, rt, 2, 1);
+   break;
+
+   case OP_LHAU:
+   ra = get_ra(inst);
+   rt = get_rt(inst);
+   emulated = kvmppc_handle_loads(run, vcpu, rt, 2, 1);
+   kvmppc_set_gpr(vcpu, ra, vcpu->arch.paddr_accessed);
+   break;
+
case OP_STH:
rs = get_rs(inst);
emulated = kvmppc_handle_store(run, vcpu,
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 98d5e6d..a235369 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -300,6 +300,25 @@ static void kvmppc_complete_mmio_load(struct kvm_vcpu 
*vcpu,
}
}
 
+   if (vcpu->arch.mmio_sign_extend) {
+   switch (run->mmio.len) {
+#ifdef CONFIG_PPC64
+   case 4:
+   if (gpr & 0x80000000)
+   gpr |= 0xffffffff00000000ULL;
+   break;
+#endif
+   case 2:
+   if (gpr & 0x8000)
+   gpr |= 0xffffffffffff0000ULL;
+   break;
+   case 1:
+   if (gpr & 0x80)
+   gpr |= 0xffffffffffffff00ULL;
+   break;
+   }
+   }
+
kvmppc_set_gpr(vcpu, vcpu->arch.io_gpr, gpr);
 
switch (vcpu->arch.io_gpr & REG_EXT_MASK) {
@@ -337,10 +356,23 @@ int kvmppc_handle_load(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
vcpu->arch.mmio_is_bigendian = is_bigendian;
vcpu->mmio_needed = 1;
vcpu->mmio_is_write = 0;
+   vcpu->arch.mmio_sign_extend = 0;
 
return EMULATE_DO_MMIO;
 }
 
+/* Same as above, but sign extends */
+int kvmppc_handle_loads(struct kvm_run *run, struct kvm_vcpu *vcpu,
+unsigned int rt, unsigned int bytes, int is_bigendian)
+{
+   int r;
+
+   r = kvmppc_handle_load(run, vcpu, rt, bytes, is_bigendian);
+   vcpu->arch.mmio_sign_extend = 1;
+
+   return r;
+}
+
 int kvmppc_handle_store(struct kvm_run *run, struct kvm_vcpu *vcpu,
 u64 val, unsigned int bytes, int is_bigendian)
 {
-- 
1.6.0.2
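
The manual sign extension that the patch above adds to the MMIO completion path can be tried on its own with a sketch like the following. mmio_sign_extend() is a made-up standalone function; it always works on a 64-bit value and so ignores the CONFIG_PPC64 guard the patch puts around the 4-byte case.

```c
#include <stdint.h>

/*
 * Sign-extend the low `len` bytes of `val` to 64 bits.
 * len is the MMIO access size in bytes; anything other than 1, 2 or 4
 * is returned unchanged.
 */
static uint64_t mmio_sign_extend(uint64_t val, unsigned int len)
{
	switch (len) {
	case 4:
		if (val & 0x80000000ULL)
			val |= 0xffffffff00000000ULL;
		break;
	case 2:
		if (val & 0x8000ULL)
			val |= 0xffffffffffff0000ULL;
		break;
	case 1:
		if (val & 0x80ULL)
			val |= 0xffffffffffffff00ULL;
		break;
	}

	return val;
}
```

For LHA and LHAU the length is 2, so a halfword of 0x8000 read from the device ends up as 0xffffffffffff8000 in the target register, while 0x7fff stays as it is.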



[PATCH 02/18] KVM: PPC: Enable MMIO to do 64 bits, fprs and qprs

2010-02-04 Thread Alexander Graf
Right now MMIO access can only happen for GPRs and is at most 32 bit wide.
That's actually enough for almost all types of hardware out there.

Unfortunately, the guest I was using used FPU writes to MMIO regions, so
it ended up writing 64 bit MMIOs using FPRs and QPRs.

So let's add code to handle those odd cases too.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm.h |7 +++
 arch/powerpc/include/asm/kvm_ppc.h |2 +-
 arch/powerpc/kvm/powerpc.c |   24 ++--
 3 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h
index 81f3b0b..548376c 100644
--- a/arch/powerpc/include/asm/kvm.h
+++ b/arch/powerpc/include/asm/kvm.h
@@ -77,4 +77,11 @@ struct kvm_debug_exit_arch {
 struct kvm_guest_debug_arch {
 };
 
+#define REG_MASK       0x001f
+#define REG_EXT_MASK   0xffe0
+#define REG_GPR        0x0000
+#define REG_FPR        0x0020
+#define REG_QPR        0x0040
+#define REG_FQPR       0x0060
+
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index e264282..c011170 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -49,7 +49,7 @@ extern int kvmppc_handle_load(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
   unsigned int rt, unsigned int bytes,
   int is_bigendian);
 extern int kvmppc_handle_store(struct kvm_run *run, struct kvm_vcpu *vcpu,
-   u32 val, unsigned int bytes, int is_bigendian);
+   u64 val, unsigned int bytes, int is_bigendian);
 
 extern int kvmppc_emulate_instruction(struct kvm_run *run,
   struct kvm_vcpu *vcpu);
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 51aedd7..98d5e6d 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -277,7 +277,7 @@ static void kvmppc_complete_dcr_load(struct kvm_vcpu *vcpu,
 static void kvmppc_complete_mmio_load(struct kvm_vcpu *vcpu,
   struct kvm_run *run)
 {
-   ulong gpr;
+   u64 gpr;
 
if (run->mmio.len > sizeof(gpr)) {
printk(KERN_ERR "bad MMIO length: %d\n", run->mmio.len);
@@ -286,6 +286,7 @@ static void kvmppc_complete_mmio_load(struct kvm_vcpu *vcpu,
 
if (vcpu->arch.mmio_is_bigendian) {
switch (run->mmio.len) {
+   case 8: gpr = *(u64 *)run->mmio.data; break;
case 4: gpr = *(u32 *)run->mmio.data; break;
case 2: gpr = *(u16 *)run->mmio.data; break;
case 1: gpr = *(u8 *)run->mmio.data; break;
@@ -300,6 +301,24 @@ static void kvmppc_complete_mmio_load(struct kvm_vcpu 
*vcpu,
}
 
kvmppc_set_gpr(vcpu, vcpu->arch.io_gpr, gpr);
+
+   switch (vcpu->arch.io_gpr & REG_EXT_MASK) {
+   case REG_GPR:
+   kvmppc_set_gpr(vcpu, vcpu->arch.io_gpr, gpr);
+   break;
+   case REG_FPR:
+   vcpu->arch.fpr[vcpu->arch.io_gpr & REG_MASK] = gpr;
+   break;
+   case REG_QPR:
+   vcpu->arch.qpr[vcpu->arch.io_gpr & REG_MASK] = gpr;
+   break;
+   case REG_FQPR:
+   vcpu->arch.fpr[vcpu->arch.io_gpr & REG_MASK] = gpr;
+   vcpu->arch.qpr[vcpu->arch.io_gpr & REG_MASK] = gpr;
+   default:
+   BUG();
+   }
 }
 
 int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
@@ -323,7 +342,7 @@ int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu 
*vcpu,
 }
 
 int kvmppc_handle_store(struct kvm_run *run, struct kvm_vcpu *vcpu,
-u32 val, unsigned int bytes, int is_bigendian)
+u64 val, unsigned int bytes, int is_bigendian)
 {
void *data = run->mmio.data;
 
@@ -341,6 +360,7 @@ int kvmppc_handle_store(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
/* Store the value at the lowest bytes in 'data'. */
if (is_bigendian) {
switch (bytes) {
+   case 8: *(u64 *)data = val; break;
case 4: *(u32 *)data = val; break;
case 2: *(u16 *)data = val; break;
case 1: *(u8  *)data = val; break;
-- 
1.6.0.2
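
The register encoding the patch above introduces packs a register class and a register number into the single io_gpr value. A minimal sketch of that packing is shown below. The mask and class values follow the patch, except that REG_GPR is taken to be 0x0000 here, which is an assumption since the value is cut off in this archive; reg_class(), reg_index() and make_io_reg() are illustrative helper names.

```c
#define REG_MASK	0x001f	/* low 5 bits: register number 0..31 */
#define REG_EXT_MASK	0xffe0	/* remaining bits: register class */
#define REG_GPR		0x0000	/* assumed value, see lead-in */
#define REG_FPR		0x0020
#define REG_QPR		0x0040
#define REG_FQPR	0x0060	/* write both the FPR and the QPR */

/* Which register file does this io_gpr value refer to? */
static unsigned int reg_class(unsigned int io_gpr)
{
	return io_gpr & REG_EXT_MASK;
}

/* Which register within that file? */
static unsigned int reg_index(unsigned int io_gpr)
{
	return io_gpr & REG_MASK;
}

/* Combine a class and a register number into one io_gpr value. */
static unsigned int make_io_reg(unsigned int cls, unsigned int idx)
{
	return cls | (idx & REG_MASK);
}
```

A plain register number below 32 decodes as REG_GPR, so existing callers that pass a bare GPR number keep working, and the largest encoded value (REG_FQPR with register 31, 0x7f) still fits the u8 io_gpr field.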



[PATCH 11/18] KVM: PPC: Make software load/store return eaddr

2010-02-04 Thread Alexander Graf
The Book3S KVM implementation contains some helper functions to load and store
data from and to virtual addresses.

Unfortunately, these helpers used to keep the physical address they so nicely
found out for us to themselves. So let's change that and make them return the
physical address they resolved.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |4 +-
 arch/powerpc/kvm/book3s.c |   41 -
 arch/powerpc/kvm/book3s_64_emulate.c  |   11 +
 3 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index d28ee83..8463976 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -115,8 +115,8 @@ extern int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, 
struct kvmppc_pte *pte);
 extern int kvmppc_mmu_map_segment(struct kvm_vcpu *vcpu, ulong eaddr);
 extern void kvmppc_mmu_flush_segments(struct kvm_vcpu *vcpu);
 extern struct kvmppc_pte *kvmppc_mmu_find_pte(struct kvm_vcpu *vcpu, u64 ea, 
bool data);
-extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr, 
bool data);
-extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr);
+extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, 
bool data);
+extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, 
bool data);
 extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int 
vec);
 extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat,
   bool upper, u32 val);
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 07f8b42..e8dccc6 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -439,55 +439,64 @@ err:
return kvmppc_bad_hva();
 }
 
-int kvmppc_st(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr)
+int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr,
+ bool data)
 {
struct kvmppc_pte pte;
-   hva_t hva = eaddr;
+   hva_t hva = *eaddr;
 
vcpu->stat.st++;
 
-   if (kvmppc_xlate(vcpu, eaddr, false, &pte))
-   goto err;
+   if (kvmppc_xlate(vcpu, *eaddr, data, &pte))
+   goto nopte;
+
+   *eaddr = pte.raddr;
 
hva = kvmppc_pte_to_hva(vcpu, &pte, false);
if (kvm_is_error_hva(hva))
-   goto err;
+   goto mmio;
 
if (copy_to_user((void __user *)hva, ptr, size)) {
printk(KERN_INFO "kvmppc_st at 0x%lx failed\n", hva);
-   goto err;
+   goto mmio;
}
 
-   return 0;
+   return EMULATE_DONE;
 
-err:
+nopte:
return -ENOENT;
+mmio:
+   return EMULATE_DO_MMIO;
 }
 
-int kvmppc_ld(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr,
+int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr,
  bool data)
 {
struct kvmppc_pte pte;
-   hva_t hva = eaddr;
+   hva_t hva = *eaddr;
 
vcpu->stat.ld++;
 
-   if (kvmppc_xlate(vcpu, eaddr, data, &pte))
-   goto err;
+   if (kvmppc_xlate(vcpu, *eaddr, data, &pte))
+   goto nopte;
+
+   *eaddr = pte.raddr;
 
hva = kvmppc_pte_to_hva(vcpu, &pte, true);
if (kvm_is_error_hva(hva))
-   goto err;
+   goto mmio;
 
if (copy_from_user(ptr, (void __user *)hva, size)) {
printk(KERN_INFO "kvmppc_ld at 0x%lx failed\n", hva);
-   goto err;
+   goto mmio;
}
 
-   return 0;
+   return EMULATE_DONE;
 
-err:
+nopte:
return -ENOENT;
+mmio:
+   return EMULATE_DO_MMIO;
 }
 
 static int kvmppc_visible_gfn(struct kvm_vcpu *vcpu, gfn_t gfn)
diff --git a/arch/powerpc/kvm/book3s_64_emulate.c b/arch/powerpc/kvm/book3s_64_emulate.c
index e4e7ec3..a93aa47 100644
--- a/arch/powerpc/kvm/book3s_64_emulate.c
+++ b/arch/powerpc/kvm/book3s_64_emulate.c
@@ -169,7 +169,7 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		{
 			ulong rb = kvmppc_get_gpr(vcpu, get_rb(inst));
 			ulong ra = 0;
-			ulong addr;
+			ulong addr, vaddr;
 			u32 zeros[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
 
 			if (get_ra(inst))
@@ -178,15 +178,16 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
 			addr = (ra + rb) & ~31ULL;
 			if (!(vcpu->arch.msr & MSR_SF))
 				addr &= 0xffffffff;
+			vaddr = addr;
 
-			if (kvmppc_st(vcpu, addr, 32, zeros)) {
-				vcpu->arch.dear = addr;
-				vcpu->arch.fault_dear = addr;
+   if (kvmppc_st(vcpu, addr, 32, 

[PATCH 06/18] KVM: PPC: Add Gekko SPRs

2010-02-04 Thread Alexander Graf
The Gekko has some SPR values that differ from other PPC core values and
also some additional ones.

Let's add support for them in our mfspr/mtspr emulator.
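
For readers skimming the diff, the HID2 handling can be pictured with a small standalone sketch. This is only a toy model under stated assumptions, not the kernel code: toy_vcpu, toy_mtspr_hid2 and TOY_HFLAG_PAIRED_SINGLE are hypothetical names, and only the bit position (1 << 29) for HID2.PSE is taken from the patch itself. The PVR check and the FPU giveup done by the real code are left out.

```c
#include <stdint.h>

/* HID2.PSE bit as tested in the patch: spr_val & (1 << 29) */
#define TOY_HID2_PSE            (1u << 29)
/* Hypothetical stand-in for BOOK3S_HFLAG_PAIRED_SINGLE */
#define TOY_HFLAG_PAIRED_SINGLE (1u << 0)

/* Hypothetical, heavily reduced vcpu state */
struct toy_vcpu {
	uint64_t hid2;    /* shadow copy of the guest's HID2 */
	uint32_t hflags;  /* hidden flags, not visible to the guest */
};

/* Model of an mtspr to the Gekko HID2: store the value and mirror
 * the PSE bit into the hidden "paired single" flag. */
static void toy_mtspr_hid2(struct toy_vcpu *vcpu, uint64_t spr_val)
{
	vcpu->hid2 = spr_val;

	if (spr_val & TOY_HID2_PSE)
		vcpu->hflags |= TOY_HFLAG_PAIRED_SINGLE;
	else
		vcpu->hflags &= ~TOY_HFLAG_PAIRED_SINGLE;
}
```

The point of keeping the flag outside the guest-visible SPR copy is that later exit handlers can test one bit instead of re-decoding HID2 each time.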

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |1 +
 arch/powerpc/include/asm/reg.h|   10 +
 arch/powerpc/kvm/book3s_64_emulate.c  |   70 +
 3 files changed, 81 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index db7db0a..d28ee83 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -82,6 +82,7 @@ struct kvmppc_vcpu_book3s {
struct kvmppc_bat ibat[8];
struct kvmppc_bat dbat[8];
u64 hid[6];
+   u64 gqr[8];
int slb_nr;
u64 sdr1;
u64 dsisr;
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 5572e86..8a69a39 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -293,10 +293,12 @@
 #define HID1_ABE	(1<<10)		/* 7450 Address Broadcast Enable */
 #define HID1_PS		(1<<16)		/* 750FX PLL selection */
 #define SPRN_HID2	0x3F8		/* Hardware Implementation Register 2 */
+#define SPRN_HID2_GEKKO	0x398		/* Gekko HID2 Register */
 #define SPRN_IABR	0x3F2	/* Instruction Address Breakpoint Register */
 #define SPRN_IABR2	0x3FA		/* 83xx */
 #define SPRN_IBCR	0x135		/* 83xx Insn Breakpoint Control Reg */
 #define SPRN_HID4	0x3F4		/* 970 HID4 */
+#define SPRN_HID4_GEKKO	0x3F3		/* Gekko HID4 */
 #define SPRN_HID5	0x3F6		/* 970 HID5 */
 #define SPRN_HID6	0x3F9	/* BE HID 6 */
 #define   HID6_LB	(0x0F<<12) /* Concurrent Large Page Modes */
@@ -465,6 +467,14 @@
 #define SPRN_VRSAVE	0x100	/* Vector Register Save Register */
 #define SPRN_XER   0x001   /* Fixed Point Exception Register */
 
+#define SPRN_MMCR0_GEKKO 0x3B8 /* Gekko Monitor Mode Control Register 0 */
+#define SPRN_MMCR1_GEKKO 0x3BC /* Gekko Monitor Mode Control Register 1 */
+#define SPRN_PMC1_GEKKO  0x3B9 /* Gekko Performance Monitor Control 1 */
+#define SPRN_PMC2_GEKKO  0x3BA /* Gekko Performance Monitor Control 2 */
+#define SPRN_PMC3_GEKKO  0x3BD /* Gekko Performance Monitor Control 3 */
+#define SPRN_PMC4_GEKKO  0x3BE /* Gekko Performance Monitor Control 4 */
+#define SPRN_WPAR_GEKKO  0x399 /* Gekko Write Pipe Address Register */
+
 #define SPRN_SCOMC 0x114   /* SCOM Access Control */
 #define SPRN_SCOMD 0x115   /* SCOM Access DATA */
 
diff --git a/arch/powerpc/kvm/book3s_64_emulate.c b/arch/powerpc/kvm/book3s_64_emulate.c
index 2b0ee7e..bb4a7c1 100644
--- a/arch/powerpc/kvm/book3s_64_emulate.c
+++ b/arch/powerpc/kvm/book3s_64_emulate.c
@@ -42,6 +42,15 @@
 /* DCBZ is actually 1014, but we patch it to 1010 so we get a trap */
 #define OP_31_XOP_DCBZ 1010
 
+#define SPRN_GQR0  912
+#define SPRN_GQR1  913
+#define SPRN_GQR2  914
+#define SPRN_GQR3  915
+#define SPRN_GQR4  916
+#define SPRN_GQR5  917
+#define SPRN_GQR6  918
+#define SPRN_GQR7  919
+
 int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
unsigned int inst, int *advance)
 {
@@ -268,7 +277,29 @@ int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, int rs)
 	case SPRN_HID2:
 		to_book3s(vcpu)->hid[2] = spr_val;
 		break;
+	case SPRN_HID2_GEKKO:
+		to_book3s(vcpu)->hid[2] = spr_val;
+		/* HID2.PSE controls paired single on gekko */
+		switch (vcpu->arch.pvr) {
+		case 0x00080200:	/* lonestar 2.0 */
+		case 0x00088202:	/* lonestar 2.2 */
+		case 0x70000100:	/* gekko 1.0 */
+		case 0x00080100:	/* gekko 2.0 */
+		case 0x00083203:	/* gekko 2.3a */
+		case 0x00083213:	/* gekko 2.3b */
+		case 0x00083204:	/* gekko 2.4 */
+		case 0x00083214:	/* gekko 2.4e (8SE) - retail HW2 */
+			if (spr_val & (1 << 29)) { /* HID2.PSE */
+				vcpu->arch.hflags |= BOOK3S_HFLAG_PAIRED_SINGLE;
+				kvmppc_giveup_ext(vcpu, MSR_FP);
+			} else {
+				vcpu->arch.hflags &= ~BOOK3S_HFLAG_PAIRED_SINGLE;
+			}
+			break;
+		}
+		break;
 	case SPRN_HID4:
+	case SPRN_HID4_GEKKO:
 		to_book3s(vcpu)->hid[4] = spr_val;
 		break;
 	case SPRN_HID5:
@@ -278,12 +309,30 @@ int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, int rs)
 		    (mfmsr() & MSR_HV))

[PATCH 00/18] KVM: PPC: Virtualize Gekko guests

2010-02-04 Thread Alexander Graf
In an effort to get KVM on PPC more useful for other userspace users than
Qemu, I figured it'd be a nice idea to implement virtualization of the
Gekko CPU.

The Gekko is the CPU used in the GameCube. In a slightly more modern
fashion it lives on in the Wii today.

Using this patch set and a modified version of Dolphin, I was able to
virtualize simple GameCube demos on a 970MP system.

As always, while getting this to run I stumbled across several broken
parts and fixed them as they came up. So expect some bug fixes in this
patch set too.

Alexander Graf (18):
  KVM: PPC: Add QPR registers
  KVM: PPC: Enable MMIO to do 64 bits, fprs and qprs
  KVM: PPC: Teach MMIO Signedness
  KVM: PPC: Add AGAIN type for emulation return
  KVM: PPC: Add hidden flag for paired singles
  KVM: PPC: Add Gekko SPRs
  KVM: PPC: Combine extension interrupt handlers
  KVM: PPC: Preload FPU when possible
  KVM: PPC: Fix typo in book3s_32 debug code
  KVM: PPC: Implement mtsr instruction emulation
  KVM: PPC: Make software load/store return eaddr
  KVM: PPC: Make ext giveup non-static
  KVM: PPC: Add helpers to call FPU instructions
  KVM: PPC: Fix error in BAT assignment
  KVM: PPC: Add helpers to modify ppc fields
  KVM: PPC: Enable program interrupt to do MMIO
  KVM: PPC: Reserve a chunk of memory for opcodes
  KVM: PPC: Implement Paired Single emulation

 arch/powerpc/include/asm/kvm.h   |7 +
 arch/powerpc/include/asm/kvm_asm.h   |1 +
 arch/powerpc/include/asm/kvm_book3s.h|8 +-
 arch/powerpc/include/asm/kvm_fpu.h   |   45 +
 arch/powerpc/include/asm/kvm_host.h  |6 +
 arch/powerpc/include/asm/kvm_ppc.h   |   43 +-
 arch/powerpc/include/asm/reg.h   |   10 +
 arch/powerpc/kernel/ppc_ksyms.c  |2 +
 arch/powerpc/kvm/Makefile|2 +
 arch/powerpc/kvm/book3s.c|  132 +++-
 arch/powerpc/kvm/book3s_32_mmu.c |2 +-
 arch/powerpc/kvm/book3s_64_emulate.c |   94 ++-
 arch/powerpc/kvm/book3s_paired_singles.c | 1356 ++
 arch/powerpc/kvm/emulate.c   |   18 +-
 arch/powerpc/kvm/fpu.S   |   77 ++
 arch/powerpc/kvm/powerpc.c   |   56 ++-
 16 files changed, 1821 insertions(+), 38 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_fpu.h
 create mode 100644 arch/powerpc/kvm/book3s_paired_singles.c
 create mode 100644 arch/powerpc/kvm/fpu.S



[PATCH 07/18] KVM: PPC: Combine extension interrupt handlers

2010-02-04 Thread Alexander Graf
When we for example get an Altivec interrupt, but our guest doesn't support
altivec, we need to inject a program interrupt, not an altivec interrupt.

The same goes for paired singles. When an altivec interrupt arrives, we're
pretty sure we need to emulate the instruction because it's a paired single
operation.

So let's make all the ext handlers aware that they need to jump to the
program interrupt handler when an extension interrupt arrives that
was not supposed to arrive for the guest CPU.
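
As a rough illustration of the decision described above, here is a standalone sketch of how the combined handler picks an action. It is a toy model, not the kernel code: all toy_* names are hypothetical, and the instruction fetch is reduced to a three-valued result that stands in for the return value of kvmppc_ld().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the EMULATE_* results used by the patch */
enum toy_emu { TOY_EMULATE_DONE, TOY_EMULATE_FAIL, TOY_EMULATE_AGAIN };

/* Outcome of fetching the faulting instruction from guest memory */
enum toy_load { TOY_LOAD_OK, TOY_LOAD_NO_PTE, TOY_LOAD_MMIO };

/* What the exit handler ends up doing */
enum toy_action { TOY_ENABLE_EXT, TOY_PROGRAM_INTERRUPT, TOY_RESUME_GUEST };

/* Mirrors the shape of kvmppc_check_ext(): without paired singles the
 * extension can simply be enabled; with them, a successfully fetched
 * instruction has to be emulated, and a failed fetch means "go again"
 * (the real code also queues an instruction storage interrupt when no
 * PTE was found). */
static enum toy_emu toy_check_ext(bool paired_single, enum toy_load load)
{
	if (!paired_single)
		return TOY_EMULATE_DONE;
	if (load == TOY_LOAD_OK)
		return TOY_EMULATE_FAIL;
	return TOY_EMULATE_AGAIN;
}

/* Mirrors the switch added to the exit handler */
static enum toy_action toy_handle_ext_exit(bool paired_single,
					   enum toy_load load)
{
	switch (toy_check_ext(paired_single, load)) {
	case TOY_EMULATE_DONE:
		return TOY_ENABLE_EXT;        /* everything ok, enable the ext */
	case TOY_EMULATE_FAIL:
		return TOY_PROGRAM_INTERRUPT; /* emulate via program interrupt path */
	default:
		return TOY_RESUME_GUEST;      /* nothing to do, retry */
	}
}
```

Note that "FAIL" here does not signal an error; in the patch it is the value that routes the exit to the program interrupt handler so the instruction gets emulated.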

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s.c |   55 
 1 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 96f7be4..6bdf7f2 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -36,6 +36,8 @@
 /* #define DEBUG_EXT */
 
 static void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
+static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
+			     ulong msr);
 
 struct kvm_stats_debugfs_item debugfs_entries[] = {
 	{ "exits",       VCPU_STAT(sum_exits) },
@@ -628,6 +630,30 @@ static void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr)
 	kvmppc_recalc_shadow_msr(vcpu);
 }
 
+static int kvmppc_check_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr)
+{
+	ulong srr0 = vcpu->arch.pc;
+	int ret;
+
+	/* Need to do paired single emulation? */
+	if (!(vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE))
+		return EMULATE_DONE;
+
+	/* Read out the instruction */
+	ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &vcpu->arch.last_inst, false);
+	if (ret == -ENOENT) {
+		vcpu->arch.msr = kvmppc_set_field(vcpu->arch.msr, 33, 33, 1);
+		vcpu->arch.msr = kvmppc_set_field(vcpu->arch.msr, 34, 36, 0);
+		vcpu->arch.msr = kvmppc_set_field(vcpu->arch.msr, 42, 47, 0);
+		kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_INST_STORAGE);
+	} else if(ret == EMULATE_DONE) {
+		/* Need to emulate */
+		return EMULATE_FAIL;
+	}
+
+	return EMULATE_AGAIN;
+}
+
 /* Handle external providers (FPU, Altivec, VSX) */
 static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
 			     ulong msr)
@@ -772,6 +798,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		enum emulation_result er;
 		ulong flags;
 
+program_interrupt:
 		flags = vcpu->arch.shadow_srr1 & 0x1f0000ull;
 
 		if (vcpu->arch.msr & MSR_PR) {
@@ -815,14 +842,32 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
r = RESUME_GUEST;
break;
case BOOK3S_INTERRUPT_FP_UNAVAIL:
-   r = kvmppc_handle_ext(vcpu, exit_nr, MSR_FP);
-   break;
case BOOK3S_INTERRUPT_ALTIVEC:
-   r = kvmppc_handle_ext(vcpu, exit_nr, MSR_VEC);
-   break;
case BOOK3S_INTERRUPT_VSX:
-   r = kvmppc_handle_ext(vcpu, exit_nr, MSR_VSX);
+   {
+   int ext_msr = 0;
+
+   switch (exit_nr) {
+   case BOOK3S_INTERRUPT_FP_UNAVAIL: ext_msr = MSR_FP;  break;
+   case BOOK3S_INTERRUPT_ALTIVEC:ext_msr = MSR_VEC; break;
+   case BOOK3S_INTERRUPT_VSX:ext_msr = MSR_VSX; break;
+   }
+
+   switch (kvmppc_check_ext(vcpu, exit_nr)) {
+   case EMULATE_DONE:
+   /* everything ok - let's enable the ext */
+   r = kvmppc_handle_ext(vcpu, exit_nr, ext_msr);
+   break;
+   case EMULATE_FAIL:
+   /* we need to emulate this instruction */
+   goto program_interrupt;
+   break;
+   default:
+   /* nothing to worry about - go again */
+   break;
+   }
break;
+   }
case BOOK3S_INTERRUPT_MACHINE_CHECK:
case BOOK3S_INTERRUPT_TRACE:
kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
-- 
1.6.0.2



guest - guest communication problem

2010-02-04 Thread Dean Rantala

Very sorry for bugging you all, but I have an issue someone may be able
to fix.

I have 3 guests on a single host. This host is using bridged
networking. The guest VMs can talk to the host and any other computer on
the host's network. They can even get outside to the internet.

Likewise, the host and any other computer on the host's network can
communicate with the guests without problems.

However, none of the guests can talk to another guest. I have checked and
they are all on the same bridge (each machine using its own tap interface)
and all use the same vnet.

Here's an example command I am using:

kvm -drive file=disk.qcow2,index=0,media=disk,if=virtio,boot=on -m 512
-net tap,vlan=0,script=no,ifname=tap0,downscript=no
-net nic,vlan=0,model=virtio
-net tap,vlan=1,script=no,ifname=tap1,downscript=no
-net nic,vlan=1,model=virtio -smp 1 -monitor unix:/monitor,server,nowait
-vnc 0.0.0.0:2 -name ns1 -pidfile pid -no-shutdown -daemonize

I have ensured that iptables is correct:

iptables -A INPUT -i tap0 -j ACCEPT

iptables -A INPUT -i tap0 -j ACCEPT

Note: this is Debian Lenny. Strangely enough, this worked fine in my
desktop tests (Ubuntu 9.10) using the exact same config.

-- 
Dean M. Rantala
Programmer / Server Administrator
Cell: (931) 284-7384
Home: (931) 268-4763


Re: [PATCH] emulate accessed bit for EPT

2010-02-04 Thread Jeff Dike
On Wed, Feb 03, 2010 at 04:11:03PM -0500, Rik van Riel wrote:
> Jeff, does this patch fix the issue you saw a few months ago, with
> a 256MB KVM guest in a cgroup limited to 128GB memory?

Hum, let me dust off that workload and give it a shot...

Jeff


[49/74] KVM: allow userspace to adjust kvmclock offset

2010-02-04 Thread Greg KH
2.6.32-stable review patch.  If anyone has any objections, please let us know.

--

From: Glauber Costa glom...@redhat.com

(cherry picked from afbcf7ab8d1bc8c2d04792f6d9e786e0adeb328d)

When we migrate a kvm guest that uses pvclock between two hosts, we may
suffer a large skew. This is because there can be significant differences
between the monotonic clock of the hosts involved. When a new host with
a much larger monotonic time starts running the guest, the view of time
will be significantly impacted.

Situation is much worse when we do the opposite, and migrate to a host with
a smaller monotonic clock.

This proposed ioctl will allow userspace to inform us what the monotonic
clock value on the source host is, so we can keep the time skew small and,
more importantly, make sure time never goes backwards. Userspace may also
need to trigger the current data, since from the first migration onwards,
it won't be reflected by a simple call to clock_gettime() anymore.
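
The arithmetic behind this can be shown with a small standalone sketch. It is a toy model under stated assumptions, not the kernel code: toy_vm, toy_set_clock and toy_get_clock are hypothetical names, and the host monotonic time is passed in as a plain number instead of being read with ktime_get_ts(). The only thing taken from the patch is the idea that SET stores "requested clock minus host time" as an offset and GET returns "offset plus host time".

```c
#include <stdint.h>

/* Hypothetical, reduced per-VM state: only the offset the patch adds */
struct toy_vm {
	int64_t kvmclock_offset;
};

/* Model of KVM_SET_CLOCK: remember how far the guest clock is from
 * this host's monotonic clock. The difference is negative when the
 * destination host has been up longer than the guest clock value. */
static void toy_set_clock(struct toy_vm *vm, uint64_t user_clock_ns,
			  uint64_t host_now_ns)
{
	vm->kvmclock_offset = (int64_t)(user_clock_ns - host_now_ns);
}

/* Model of KVM_GET_CLOCK: the guest-visible clock is host time
 * plus the stored offset. */
static uint64_t toy_get_clock(const struct toy_vm *vm, uint64_t host_now_ns)
{
	return (uint64_t)vm->kvmclock_offset + host_now_ns;
}
```

In a migration, userspace would read the clock on the source, transfer the value, and set it on the destination; from then on the guest clock advances with the destination host's monotonic clock but continues from the transferred value, regardless of whether that host's own clock is larger or smaller.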

[marcelo: future-proof abi with a flags field]
[jan: fix KVM_GET_CLOCK by clearing flags field instead of checking it]

Signed-off-by: Glauber Costa glom...@redhat.com
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com
Signed-off-by: Greg Kroah-Hartman gre...@suse.de

---
 Documentation/kvm/api.txt   |   36 ++
 arch/x86/include/asm/kvm_host.h |1 
 arch/x86/kvm/x86.c  |   42 +++-
 include/linux/kvm.h |9 
 4 files changed, 87 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -412,6 +412,7 @@ struct kvm_arch{
unsigned long irq_sources_bitmap;
unsigned long irq_states[KVM_IOAPIC_NUM_PINS];
u64 vm_init_tsc;
+   s64 kvmclock_offset;
 };
 
 struct kvm_vm_stat {
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -680,7 +680,8 @@ static void kvm_write_guest_time(struct 
/* With all the info we got, fill in the values */
 
 	vcpu->hv_clock.system_time = ts.tv_nsec +
-				     (NSEC_PER_SEC * (u64)ts.tv_sec);
+				     (NSEC_PER_SEC * (u64)ts.tv_sec) + v->kvm->arch.kvmclock_offset;
+
/*
 * The interface expects us to write an even number signaling that the
 * update is finished. Since the guest won't see the intermediate
@@ -1227,6 +1228,7 @@ int kvm_dev_ioctl_check_extension(long e
case KVM_CAP_PIT2:
case KVM_CAP_PIT_STATE2:
case KVM_CAP_SET_IDENTITY_MAP_ADDR:
+   case KVM_CAP_ADJUST_CLOCK:
r = 1;
break;
case KVM_CAP_COALESCED_MMIO:
@@ -2424,6 +2426,44 @@ long kvm_arch_vm_ioctl(struct file *filp
r = 0;
break;
}
+	case KVM_SET_CLOCK: {
+		struct timespec now;
+		struct kvm_clock_data user_ns;
+		u64 now_ns;
+		s64 delta;
+
+		r = -EFAULT;
+		if (copy_from_user(&user_ns, argp, sizeof(user_ns)))
+			goto out;
+
+		r = -EINVAL;
+		if (user_ns.flags)
+			goto out;
+
+		r = 0;
+		ktime_get_ts(&now);
+		now_ns = timespec_to_ns(&now);
+		delta = user_ns.clock - now_ns;
+		kvm->arch.kvmclock_offset = delta;
+		break;
+	}
+	case KVM_GET_CLOCK: {
+		struct timespec now;
+		struct kvm_clock_data user_ns;
+		u64 now_ns;
+
+		ktime_get_ts(&now);
+		now_ns = timespec_to_ns(&now);
+		user_ns.clock = kvm->arch.kvmclock_offset + now_ns;
+		user_ns.flags = 0;
+
+		r = -EFAULT;
+		if (copy_to_user(argp, &user_ns, sizeof(user_ns)))
+			goto out;
+		r = 0;
+		break;
+	}
+
default:
;
}
--- a/Documentation/kvm/api.txt
+++ b/Documentation/kvm/api.txt
@@ -593,6 +593,42 @@ struct kvm_irqchip {
} chip;
 };
 
+4.27 KVM_GET_CLOCK
+
+Capability: KVM_CAP_ADJUST_CLOCK
+Architectures: x86
+Type: vm ioctl
+Parameters: struct kvm_clock_data (out)
+Returns: 0 on success, -1 on error
+
+Gets the current timestamp of kvmclock as seen by the current guest. In
+conjunction with KVM_SET_CLOCK, it is used to ensure monotonicity on scenarios
+such as migration.
+
+struct kvm_clock_data {
+   __u64 clock;  /* kvmclock current value */
+   __u32 flags;
+   __u32 pad[9];
+};
+
+4.28 KVM_SET_CLOCK
+
+Capability: KVM_CAP_ADJUST_CLOCK
+Architectures: x86
+Type: vm ioctl
+Parameters: struct kvm_clock_data (in)
+Returns: 0 on success, -1 on error
+
+Sets the current timestamp of kvmclock to the value specified in its parameter.
+In conjunction with KVM_GET_CLOCK, it is used to ensure monotonicity on scenarios
+such as migration.

Re: KVM RAM limitation

2010-02-04 Thread Daniel Bareiro
Hi, Brian.

On Wednesday, 03 February 2010 16:44:28 -0600,
Brian Jackson wrote:

  Anthony Liguori wrote:
   Are you sure you enabled KVM? Are you sure you are using the KVM
   binary and not some QEMU binary that's sitting around. This is one
   of those situations where the KVM command you are running might
   help.  Also the same binary you are running's version ($QEMU_BIN -h
   
   | head -n1)

   wilson:/usr/local/qemu-kvm/bin# ./qemu-system-x86_64 -h | head -n1
   QEMU PC emulator version 0.12.2 (qemu-kvm-0.12.2), Copyright (c)
   2003-2008 Fabrice Bellard
   
   
   The procedure that I used to compile qemu-kvm is the same of always:
   to download qemu-kvm-0.12.2, to install the packages (Debian)
   zlib1g-dev and libpci-dev, and to compile of the following way:
   
   # cd qemu-kvm-0.12.2
   # ./configure --prefix=/usr/local/qemu-kvm
   # make
   # make install
   
   Until the moment I never got to use qemu-kvm with VMs of more than
   2048 MB. In an installation that I have with KVM-88 and kernel x86_64
   I don't have this problem.

   QEMU and KVM only support 2GB of memory on a 32-bit host.
   
   Both need to create a userspace mapping of the guests memory.  In a
   32-bit environment, you only have enough usable address space in a
   process to create a 2GB region.
 
  But, according to what I read in the link [1] that commented, just by to
  have a x86_64 kernel would have to be sufficient to serve more than 2047
  MB of RAM.

 The kvm userspace would also have to be compiled as a 64bit binary.
 Possibly statically compiled somewhere else (if that's even possible)
 or with a 64bit chroot.

Hmmm... is there some way to compile qemu-kvm as a 64-bit binary on a
32-bit operating system userspace?

I tried passing ARCH=x86_64 to make, but with that I get several
warnings of the type "cast to/from pointer from/to integer of different
size".
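
For context, warnings of that kind typically appear when code converts between a pointer and an integer type whose width differs from the pointer width of the compiler's actual target. The standalone sketch below shows the usual way to do such conversions without assuming a width. It is only an illustration, not qemu-kvm code: the helper names are hypothetical, and the guess that the toolchain was still producing 32-bit code despite ARCH=x86_64 is an assumption, not something confirmed in this thread.

```c
#include <stdint.h>

/* uintptr_t is defined to be wide enough to hold a pointer on the
 * target the compiler is really building for, so these conversions
 * do not trigger "integer of different size" warnings. */
static uintptr_t toy_ptr_to_int(const void *p)
{
	return (uintptr_t)p;
}

static void *toy_int_to_ptr(uintptr_t v)
{
	return (void *)v;
}

/* Pointer width of the actual compilation target, in bits. A cast
 * such as (void *)some_uint64_value warns when this returns 32. */
static int toy_pointer_bits(void)
{
	return (int)(sizeof(void *) * 8);
}
```

Checking the result of toy_pointer_bits() in a test program built with the same compiler and flags is a quick way to see which pointer width the toolchain really targets.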

Thanks for your reply.

Regards,
Daniel
-- 
My quote of the day:
An algorithm must be seen to be believed.
-- D. E. Knuth

Daniel Bareiro - GNU/Linux registered user #188.598
Proudly running Debian GNU/Linux with uptime:
14:02:40 up 31 days, 22:47, 11 users,  load average: 0.00, 0.02, 0.00




Re: [PATCH] emulate accessed bit for EPT

2010-02-04 Thread Andrea Arcangeli
On Thu, Feb 04, 2010 at 08:40:43AM -0500, Rik van Riel wrote:
> I suspect it won't be very many. I have been monitoring
> /proc/meminfo on my system while testing this patch, and
> it is quite typical that the size of the inactive anon
> list does not change for minutes at a time.
>
> In other words, no pages are moved onto or off of the
> inactive anon list for several minutes. That corresponds
> to a very small number of minor faults introduced by my
> patch.

When there's light VM pressure, ideally there should be zero overhead
caused by the patch. When there is VM pressure this will avoid some
unnecessary I/O which should outweigh the minor faults. It should be
a good default behavior.


Re: [PATCH 3/8] KVM: Activate fpu on clts

2010-02-04 Thread Avi Kivity

On 02/04/2010 03:11 PM, Gleb Natapov wrote:
> On Thu, Feb 04, 2010 at 03:05:17PM +0200, Avi Kivity wrote:
>> On 02/02/2010 10:16 AM, Paolo Bonzini wrote:
>>> On 01/21/2010 02:31 PM, Avi Kivity wrote:
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index feca59f..09207ba 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -3266,6 +3266,7 @@ int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address)
>>>>   int emulate_clts(struct kvm_vcpu *vcpu)
>>>>   {
>>>>   kvm_x86_ops->set_cr0(vcpu, kvm_read_cr0_bits(vcpu, ~X86_CR0_TS));
>>>> +kvm_x86_ops->fpu_activate(vcpu);
>>>>   return X86EMUL_CONTINUE;
>>>>   }
>>>
>>> Can this code be reached if CLTS is executed in real mode?  That
>>> would cause a NULL-pointer access on VMX.
>>
>> How would this cause a null pointer access?
>
> vmx.c doesn't initialize kvm_x86_ops->fpu_activate as far as I see.

Gaak.  Well, that's obviously unintended.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



ESX on KVM

2010-02-04 Thread Brian Kelly
Hello -
This is a topic which has been covered in the past that I'd like to
bump for my own sanity check...
Currently when I boot ESX it hangs on loading install.tgz
booting:mbi=0x00010090, entry=0x00100212
Followed by a purple screen of death - #GP Exception(13) in world 0

My config looks something like this:
@ubuntuserv:/usr/local/kvm/bin$ uname -a
Linux ubuntuserv 2.6.31-17-server #54-Ubuntu SMP Thu Dec 10 18:06:56
UTC 2009 x86_64 GNU/Linux
@ubuntuserv:/usr/local/kvm/bin$ qemu-system-x86_64 --version
QEMU PC emulator version 0.12.2 (qemu-kvm-0.12.2), Copyright (c) 2003-2008
Fabrice Bellard
@ubuntuserv:/usr/local/kvm/bin$ sudo qemu-system-x86_64 -m 2048 -hda
disk.img -cdrom /dev/cdrom -net nic -net user -enable-kvm -boot d -cpu
phenom

Any insights as to where I've gone wrong?

Thanks!


Re: [PATCH 4/4] KVM: Rework of guest debug state writing

2010-02-04 Thread Jan Kiszka
Marcelo Tosatti wrote:
> On Thu, Feb 04, 2010 at 04:41:44PM +0100, Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Marcelo Tosatti wrote:
>>>> On Thu, Feb 04, 2010 at 01:33:50AM +0100, Jan Kiszka wrote:
>>>>> Marcelo Tosatti wrote:
>>>>>> On Wed, Feb 03, 2010 at 10:29:45PM +0100, Jan Kiszka wrote:
>>>>>>> So far we synchronized any dirty VCPU state back into the kernel before
>>>>>>> updating the guest debug state. This was a tribute to a deficit in x86
>>>>>>> kernels before 2.6.33. But as this is an arch-dependent issue, it is
>>>>>>> better handle in the x86 part of KVM and remove the writeback point for
>>>>>>> generic code.
>>>>>> Jan,
>>>>>>
>>>>>> This patch breaks migration.
>>>>> Can you elaborate what you did? I can't reproduce, and I do not see any
>>>>> conceptual issue (given that guest debugging conflicts with migration
>>>>> anyway).
>>>> kvm-autotest fails (migration only, install is ok, both Linux and Win
>>>> guests). Not sure why, perhaps the unconditional KVM_SET_GUEST_DEBUG
>>>> corrupts state somehow?
>>>>
>>>> Tested with io thread enabled.
>>> That's this default-off thing, so... OK, confirmed, investigating.
>>
>> Heisenbug: It first also popped up (in form of a frozen migration
>> target) after removing this patch, but now it's totally unreproducible,
>> whatever patch I apply or revert from my series. Base is current master.
>>
>> I tend to think there is a hidden issue of iothread vs. migration,
>> unrelated to this patch.
>
> Probably many :)
>
> Do you have c5f32c99c6855d466737daf1cd262e7e92062f87 (from qemu-kvm.git
> uq/master) in?

Yes. And that might have been the reason why some early tests failed
when it was not yet applied here.

> With kvm-autotest the failure is not sporadic (and the above commit
> applied): with KVM_SET_GUEST_DEBUG in arch_put_regs all migration
> tests fail, without, all of them succeed.
>
> So env->kvm_guest_debug has been zeroed by cpu_x86_init, which means
> the writeback via KVM_SET_GUEST_DEBUG does almost nothing. It does
> get_rflags and set_rflags in the kernel.

Hmm, it also copies debug regs around... BTW, where do we save/restore
dr0..7 between kernel and user space?

But that should not be a problem, both shadow as well as effective regs
should be properly initialized, specifically for a newly created VCPU.

> Test box is off, but the synchronous writeback via qemu_system_reset
> in main, after machine and vcpu thread initialization, might be
> problematic. But it would be nice to understand this.
>
> Unrelated to this problem, won't put_vcpu_events, which is executed
> after KVM_SET_GUEST_DEBUG, overwrite any queued debug exceptions?

Good point, SET_GUEST_DEBUG should be last in the writeback for that reason.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


Re: [PATCH 4/4] KVM: Rework of guest debug state writing

2010-02-04 Thread Jan Kiszka
Jan Kiszka wrote:
> Marcelo Tosatti wrote:
>> Unrelated to this problem, won't put_vcpu_events, which is executed
>> after KVM_SET_GUEST_DEBUG, overwrite any queued debug exceptions?
>
> Good point, SET_GUEST_DEBUG should be last in the writeback for that reason.

Actually, we no longer need the exception injection via SET_GUEST_DEBUG
now that we have full access via vcpu_events. So this needs a cleanup,
and I'm afraid quite a few cases are broken ATM with vcpu_events
writeback overwriting the reinjected exceptions.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


Re: Question about guest MSR loading/saving (Intel)

2010-02-04 Thread Avi Kivity

On 02/02/2010 03:57 AM, Kurt Kiefer wrote:
> Hi all,
>
> This is a vague/general question. For some background: I have a reason
> (control of IA32_PERF_GLOBAL_CTRL) for loading/saving MSRs on
> VM-entry/exit. To get this to work correctly, I made changes to use
> the conventional VMX MSR load areas of the VMCS for this particular
> MSR. Works great.
>
> Is there a particular reason why MSRs are currently loaded/saved
> through KVM's unconventional facilities (vmx.c:save_msrs(),
> vmx.c:load_msrs()), rather than through VM entry/exit MSR load regions
> in the VMCS? I see that only long mode guests on x86_64 are affected
> by this.
>
> Any insight could be useful. Do you think MSR loading via VMCS would
> be faster? Are there downsides to doing it one way or the other?

kvm doesn't switch msrs on every entry/exit.  For example, the syscall
msrs are only used by the syscall/sysret instructions, so we only switch
them before returning to userspace, which happens much less frequently
than vmexits.

The PMU is used by the processor at all times, so it makes perfect sense
to use the vmx autoload/autosave regions for that.
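
To put a rough number on that trade-off, here is a standalone toy model that counts MSR switches under both policies. It is a sketch under stated assumptions, not KVM code: the trace format and all toy_* names are hypothetical, and each qualifying event is counted as one "switch" without modelling the actual save/load cost.

```c
#include <stddef.h>

/* Number of MSR switches under each policy */
struct toy_switch_counts {
	unsigned eager; /* switch on every exit (autoload/autosave style) */
	unsigned lazy;  /* switch only when returning to userspace */
};

/* Walk an exit trace where 'x' is a vmexit handled entirely in the
 * kernel and 'u' is an exit that ends up returning to userspace.
 * Any other character is ignored. */
static struct toy_switch_counts toy_count_msr_switches(const char *trace)
{
	struct toy_switch_counts c = { 0, 0 };

	for (size_t i = 0; trace[i] != '\0'; i++) {
		if (trace[i] != 'x' && trace[i] != 'u')
			continue;
		c.eager++;
		if (trace[i] == 'u')
			c.lazy++;
	}
	return c;
}
```

For a trace such as "xxxuxxu" the eager policy switches on all seven exits while the lazy one switches twice, which is the saving described above for MSRs that only matter to userspace. An MSR that affects the processor at all times, like the PMU control register in the question, cannot be deferred this way.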


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [Autotest] [RFC] KVM test: Ship rss.exe and finish.exe binaries with KVM test

2010-02-04 Thread Uri Lublin

On 02/03/2010 04:25 PM, Michael Goldish wrote:
> - Uri Lublin u...@redhat.com wrote:
>> On 02/02/2010 01:48 PM, Lucas Meneghel Rodrigues wrote:
>>> Hi folks:
>>>
>>> We're on an effort of streamlining the KVM test experience, by choosing
>>> sane defaults and helper scripts that can overcome the initial barrier
>>> with getting the KVM test running. On one of the conversations I've had
>>> today, we came up with the idea of shipping the compiled windows
>>> programs rss.exe and finish.exe, needed for windows hosts testing.
>>>
>>> Even though rss.exe and finish.exe can be compiled in a fairly
>>> straightforward way using the awesome cross compiling environment with
>>> mingw, there are some obvious limitations to it:
>>>
>>> 1) The cross compiling environment is only available for fedora >= 11.
>>> No other distros I know have it.
>>>
>>> 2) Sometimes it might take time for the user to realize he/she has to
>>> compile the source code under unattended/ folder, and how to do it.
>>> That person would take a couple of failed attempts scratching his/her
>>> head thinking "what the heck is this deps/finish.exe they're talking
>>> about?". Surely documentation can help, and I am looking at making the
>>> documentation on how to do it more easily discoverable.
>>>
>>> That said, shipping the binaries would make the life of those people
>>> easier, and anyway the binaries work pretty well across all versions of
>>> windows from winxp to win7, they are self contained, with no external
>>> dependencies (they all use the standard win32 API).
>>>
>>> 3) That said we also need a script that can build the entire
>>> winutils.iso without making the user to spend way too much time figuring
>>> out how to do it. I want to work on such a script on the next days.
>>>
>>> So, what are your opinions? Should we ship the binaries or pursue a
>>> script that can build those for the user as soon as the (yet to be
>>> integrated) get_started.py script runs? Remember that the later might
>>> mean users of RHEL <= 5.X and debian like will be left out in the cold.
>>
>> 4) Another option is to make winutils.iso available (somewhere on the
>> web), and
>> download it in get_started.py (similar to other iso images used by kvm
>> test).
>
> But isn't there a legal problem with that?
> winutils.iso contains VLC media player (for the timedrift test).
> If there's no legal problem, this sounds like the best option to me.

You may be right (although I think VLC is GPL).
I meant only for rss.exe and finish.exe.
Other components such as VLC media player can be downloaded separately.



Re: [patch 1/3] KVM: x86: add ioctls to get/set PIO state

2010-02-04 Thread Avi Kivity

On 01/28/2010 09:03 PM, Marcelo Tosatti wrote:
> A vcpu can be stopped after handling IO in userspace,
> but before returning to kernel to finish processing.

Is this strictly needed?  If we teach qemu to migrate before executing
the pio request, I think we'll be all right?  Should work at least for
IN/INS, not sure about OUT/OUTS.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH 4/4] KVM: Rework of guest debug state writing

2010-02-04 Thread Jan Kiszka
Jan Kiszka wrote:
 Marcelo Tosatti wrote:
 With kvm-autotest the failure is not sporadic (and the above commit
 applied): with KVM_SET_GUEST_DEBUG in arch_put_regs all migration 
 tests fail, without, all of them succeed. 

 So env->kvm_guest_debug has been zeroed by cpu_x86_init, which means
 the writeback via KVM_SET_GUEST_DEBUG does almost nothing. It does
 get_rflags and set_rflags in the kernel.
 
 Hmm, it also copies debug regs around... BTW, where do we save/restore
 dr0..7 between kernel and user space?
 
 But that should not be a problem, both shadow as well as effective regs
 should be properly initialized, specifically for a newly created VCPU.

Could you retry after pushing SET_GUEST_DEBUG to the end of
kvm_arch_put_registers? Maybe it is not a good idea to run get/set_rflags
without having the sregs properly initialized.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


Re: [Autotest] [RFC] KVM test: Ship rss.exe and finish.exe binaries with KVM test

2010-02-04 Thread Lucas Meneghel Rodrigues
On Thu, Feb 4, 2010 at 5:13 PM, Uri Lublin u...@redhat.com wrote:
 On 02/03/2010 04:25 PM, Michael Goldish wrote:
 4) Another option is to make winutils.iso available (somewhere on the
 web), and
 download it in get_started.py (similar to other iso images used by kvm
 test).

 But isn't there a legal problem with that?
 winutils.iso contains VLC media player (for the timedrift test).
 If there's no legal problem, this sounds like the best option to me.

 You may be right (although I think VLC is GPL).
 I meant only for rss.exe and finish.exe.
 Other components such as VLC media player can be downloaded separately.

Well, today I re-vamped the documentation for the kvm test, and there
I explained how to get all the components of the iso

http://www.linux-kvm.org/page/KVM-Autotest/Client_Install

And I really really want to ship the full winutils.iso. If for some
reason we can't ship VLC, we'll find another Windows video player capable
of playing Theora, which is totally patent unencumbered, so there will be
joy and happiness for everyone :)

-- 
Lucas


Re: [Autotest] [RFC] KVM test: Ship rss.exe and finish.exe binaries with KVM test

2010-02-04 Thread Lucas Meneghel Rodrigues
On Thu, Feb 4, 2010 at 5:26 PM, Lucas Meneghel Rodrigues l...@redhat.com 
wrote:
 And I really really want to ship the full winutils.iso. If for some
 reason we can't ship VLC, we'll find another windows video capable of
 doing theora, which is totally patent unencumbered, so there will be
 joy and happiness for everyone :)

Just in case someone is wondering, by shipping it I mean having it in
some place handy for download, so get_started.py can easily download
it, *not* putting it under version control.

-- 
Lucas


Re: [patch 1/3] KVM: x86: add ioctls to get/set PIO state

2010-02-04 Thread Marcelo Tosatti
On Thu, Feb 04, 2010 at 09:16:47PM +0200, Avi Kivity wrote:
 On 01/28/2010 09:03 PM, Marcelo Tosatti wrote:
 A vcpu can be stopped after handling IO in userspace,
 but before returning to kernel to finish processing.
 
 
 Is this strictly needed?  If we teach qemu to migrate before
 executing the pio request, I think we'll be all right?  should work
 at least for IN/INS, not sure about OUT/OUTS.

It would be nice (instead of more state to keep track of between
kernel/user) but the drawbacks I see are:

You'd have to add a limitation so that any IN which was processed
by device emulation has to re-enter the kernel to complete it (so it
complicates vcpu stop in userspace).

And for OUTS larger than page size (== arch->pio_data size) you need to
know the current position to continue it on the destination (or roll
back the entire effect of the instruction in device emulation, and RIP).




Re: [PATCH 4/4] KVM: Rework of guest debug state writing

2010-02-04 Thread Marcelo Tosatti
On Thu, Feb 04, 2010 at 08:21:08PM +0100, Jan Kiszka wrote:
 Jan Kiszka wrote:
  Marcelo Tosatti wrote:
  With kvm-autotest the failure is not sporadic (and the above commit
  applied): with KVM_SET_GUEST_DEBUG in arch_put_regs all migration 
  tests fail, without, all of them succeed. 
 
  So env->kvm_guest_debug has been zeroed by cpu_x86_init, which means
  the writeback via KVM_SET_GUEST_DEBUG does almost nothing. It does
  get_rflags and set_rflags in the kernel.
  
  Hmm, it also copies debug regs around... BTW, where do we save/restore
  dr0..7 between kernel and user space?

They're not.

  But that should not be a problem, both shadow as well as effective regs
  should be properly initialized, specifically for a newly created VCPU.

Yep.

 Could you retry after pushing SET_GUEST_DEBUG at the end of
 kvm_arch_put_registers? Maybe it is no good idea to run get/set_rflags
 without having the sregs properly initialized.

Will do next week.

Another tricky thing with this is that the definition of what's the
kernel's job and what's userspace's job is somewhat blurry in places. For
example, set_regs clears pending exceptions, which made sense in the
past, but breaks now if userspace does put_vcpu_events before set_regs
(which is not the case with current userspace, but is just an example).

It makes sense to heavily document things as suggested.



Re: [PATCH AUTOTEST] kvm: timedrift test: fix typo (host_delta_t)

2010-02-04 Thread Uri Lublin

On 02/03/2010 04:47 PM, Michael Goldish wrote:


- Uri Lublin u...@redhat.com wrote:


Signed-off-by: Uri Lublin u...@redhat.com
---
  client/tests/kvm/tests/timedrift.py |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/client/tests/kvm/tests/timedrift.py
b/client/tests/kvm/tests/timedrift.py
index b3e8770..06f6a70 100644
--- a/client/tests/kvm/tests/timedrift.py
+++ b/client/tests/kvm/tests/timedrift.py
@@ -160,7 +160,7 @@ def run_timedrift(test, params, env):
  # Report results
  host_delta_total = ht2 - ht0
  guest_delta_total = gt2 - gt0
-drift_total = 100.0 * (host_delta_total - guest_delta_total) / host_delta
+drift_total = 100.0 * (host_delta_total - guest_delta_total) / host_delta_total


This isn't a typo.
delta_total is the load duration (e.g. 1 min of video decoding) +
rest duration (e.g. 20 secs of idleness).
I think the load drift and the total drift should be divided by the
same delta, in order to determine the amount of drift corrected during
idleness.  If you divide the total drift by delta_total (instead of
delta) you give a false impression that more drift was corrected than
really was.

I'm not sure I'm making my point clearly so here's an example:

Let's assume:
- The load duration is 30s;
- the idle duration is 30s;
- the drift was 10s;
- the drift was not corrected at all during idleness -- so after the
idle period the drift remained 10s.

Then:
- The load drift is 33.3% (10/30);
- if you divide by delta, the total drift is still 33.3% (10/30);
- if you divide by delta_total, the total drift is 16.6% (10/60).

So dividing by delta_total gives the impression that some drift was
corrected, when in fact none was.
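
The arithmetic in the example above can be checked with a short sketch. This is only an illustration, not code from timedrift.py: the helper name drift_percent and the variable names are made up here, and only the numbers come from the example.

```python
def drift_percent(drift_seconds, reference_seconds):
    """Return the drift as a percentage of a reference duration."""
    return 100.0 * drift_seconds / reference_seconds


load_duration = 30.0                             # "delta" in the example
idle_duration = 30.0
total_duration = load_duration + idle_duration   # "delta_total"
drift_after_load = 10.0
drift_after_idle = 10.0                          # nothing corrected while idle

load_drift = drift_percent(drift_after_load, load_duration)
total_drift_by_delta = drift_percent(drift_after_idle, load_duration)
total_drift_by_delta_total = drift_percent(drift_after_idle, total_duration)

print("load drift:                     %.1f%%" % load_drift)
print("total drift / delta:            %.1f%%" % total_drift_by_delta)
print("total drift / delta_total:      %.1f%%" % total_drift_by_delta_total)
```

Dividing by the same delta keeps the two percentages equal when nothing was corrected; dividing by delta_total halves the second figure even though the drift in seconds is unchanged.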



O.K. That makes sense. Thanks for the explanation.

In your example, the total drift is 33% of the load duration, which is a bit 
confusing.



Re: [patch 1/3] KVM: x86: add ioctls to get/set PIO state

2010-02-04 Thread Avi Kivity

On 02/04/2010 11:36 PM, Marcelo Tosatti wrote:

On Thu, Feb 04, 2010 at 09:16:47PM +0200, Avi Kivity wrote:
   

On 01/28/2010 09:03 PM, Marcelo Tosatti wrote:
 

A vcpu can be stopped after handling IO in userspace,
but before returning to kernel to finish processing.

   

Is this strictly needed?  If we teach qemu to migrate before
executing the pio request, I think we'll be all right?  should work
at least for IN/INS, not sure about OUT/OUTS.
 

It would be nice (instead of more state to keep track of between
kernel/user) but the drawbacks i see are:

You'd have to add a limitation so that any IN which was processed
by device emulation has to re-entry kernel to complete it (so it
complicates vcpu stop in userspace).

   


You could fix that by moving the IN emulation to before guest entry.  It 
complicates the vcpu loop a bit, but is backwards compatible and all that.



And for OUTS larger than page size (== arch-pio_data size) you need to
know the current position to continue it on the destination (or roll
back the entire effect of the instruction in device emulation, and RIP).
   


What do you mean by current position?

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [patch 1/3] KVM: x86: add ioctls to get/set PIO state

2010-02-04 Thread Marcelo Tosatti
On Thu, Feb 04, 2010 at 11:46:25PM +0200, Avi Kivity wrote:
 On 02/04/2010 11:36 PM, Marcelo Tosatti wrote:
 On Thu, Feb 04, 2010 at 09:16:47PM +0200, Avi Kivity wrote:
 On 01/28/2010 09:03 PM, Marcelo Tosatti wrote:
 A vcpu can be stopped after handling IO in userspace,
 but before returning to kernel to finish processing.
 
 Is this strictly needed?  If we teach qemu to migrate before
 executing the pio request, I think we'll be all right?  should work
 at least for IN/INS, not sure about OUT/OUTS.
 It would be nice (instead of more state to keep track of between
 kernel/user) but the drawbacks i see are:
 
 You'd have to add a limitation so that any IN which was processed
 by device emulation has to re-entry kernel to complete it (so it
 complicates vcpu stop in userspace).
 
 
 You could fix that by moving the IN emulation to before guest entry.
 It complicates the vcpu loop a bit, but is backwards compatible and
 all that.
 
 And for OUTS larger than page size (== arch-pio_data size) you need to
 know the current position to continue it on the destination (or roll
 back the entire effect of the instruction in device emulation, and RIP).
 
 What do you mean by current position?

An OUTS larger than PAGE_SIZE is processed in (size / PAGE_SIZE) exits to
userspace, because that's the size of the pio_data buffer, right?
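
A rough sketch of the arithmetic being asked about, assuming a 4096-byte page and that each exit to userspace carries at most one buffer's worth of data. The names pio_chunks and pio_exits are hypothetical and this is not KVM code; it rounds up so that a trailing partial buffer still counts as an exit.

```python
PAGE_SIZE = 4096  # assumed page size, also assumed to be the pio_data buffer size


def pio_chunks(total_bytes, buffer_size=PAGE_SIZE):
    """Yield (offset, length) pairs, one per exit to userspace."""
    offset = 0
    while offset < total_bytes:
        length = min(buffer_size, total_bytes - offset)
        yield offset, length
        offset += length


def pio_exits(total_bytes, buffer_size=PAGE_SIZE):
    """Number of exits needed to transfer total_bytes (rounded up)."""
    return len(list(pio_chunks(total_bytes, buffer_size)))


for offset, length in pio_chunks(10000):
    print("exit at offset %d, %d bytes" % (offset, length))
```

The offset in each pair is the "current position" in the sense used above: the point from which the transfer would have to continue.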


Re: ESX on KVM

2010-02-04 Thread Alexander Graf


On 04.02.2010 at 18:56, Brian Kelly kelly.bri...@gmail.com wrote:


Hello -
This is a topic which has been covered in the past that I'd like to
bump for my own sanity check...
Currently when I boot ESX it hangs on loading install.tgz
booting:mbi=0x00010090, entry=0x00100212
Followed by a purple screen of death - #GP Exception(13) in world 0


The last version I tried had a grub menu entry for running a debug  
version of esx. Using -serial stdio I could then see more verbose  
output (IIRC).


But there really was a thread with debugging help and pointers to  
patches about esx on kvm. Please try to find that one.


Alex


Re: [patch 1/3] KVM: x86: add ioctls to get/set PIO state

2010-02-04 Thread Marcelo Tosatti
On Thu, Feb 04, 2010 at 08:12:07PM -0200, Marcelo Tosatti wrote:
  On 01/28/2010 09:03 PM, Marcelo Tosatti wrote:
  A vcpu can be stopped after handling IO in userspace,
  but before returning to kernel to finish processing.
  
  Is this strictly needed?  If we teach qemu to migrate before
  executing the pio request, I think we'll be all right?  should work
  at least for IN/INS, not sure about OUT/OUTS.
  It would be nice (instead of more state to keep track of between
  kernel/user) but the drawbacks i see are:
  
  You'd have to add a limitation so that any IN which was processed
  by device emulation has to re-entry kernel to complete it (so it
  complicates vcpu stop in userspace).
  
  
  You could fix that by moving the IN emulation to before guest entry.
  It complicates the vcpu loop a bit, but is backwards compatible and
  all that.
  
  And for OUTS larger than page size (== arch-pio_data size) you need to
  know the current position to continue it on the destination (or roll
  back the entire effect of the instruction in device emulation, and RIP).
  
  What do you mean by current position?
 
 outs larger than PAGE_SIZE is processed in (size / PAGE_SIZE) exits to
 userspace, because thats the size of the pio_data buffer, right?

Never mind, the count is in ecx, which is migrated. OK, I'll look into
your suggestion.



[PATCH] KVM test: Make sure VM has an IP at post install stage

2010-02-04 Thread Lucas Meneghel Rodrigues
As we already do on Windows, make sure the VM
acquires an IP address before proceeding with the
rest of post install.

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/kvm/unattended/Fedora-11.ks |1 +
 client/tests/kvm/unattended/Fedora-12.ks |1 +
 client/tests/kvm/unattended/RHEL-3-series.ks |1 +
 client/tests/kvm/unattended/RHEL-4-series.ks |1 +
 client/tests/kvm/unattended/RHEL-5-series.ks |1 +
 5 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/unattended/Fedora-11.ks 
b/client/tests/kvm/unattended/Fedora-11.ks
index ff09b22..65e42c3 100644
--- a/client/tests/kvm/unattended/Fedora-11.ks
+++ b/client/tests/kvm/unattended/Fedora-11.ks
@@ -24,6 +24,7 @@ autopart
 
 %post --interpreter /usr/bin/python
 import socket, os
+os.system('dhclient')
 os.system('chkconfig sshd on')
 os.system('iptables -F')
 os.system('echo 0 > /selinux/enforce')
diff --git a/client/tests/kvm/unattended/Fedora-12.ks 
b/client/tests/kvm/unattended/Fedora-12.ks
index ff09b22..65e42c3 100644
--- a/client/tests/kvm/unattended/Fedora-12.ks
+++ b/client/tests/kvm/unattended/Fedora-12.ks
@@ -24,6 +24,7 @@ autopart
 
 %post --interpreter /usr/bin/python
 import socket, os
+os.system('dhclient')
 os.system('chkconfig sshd on')
 os.system('iptables -F')
 os.system('echo 0 > /selinux/enforce')
diff --git a/client/tests/kvm/unattended/RHEL-3-series.ks 
b/client/tests/kvm/unattended/RHEL-3-series.ks
index 0adbd6f..2f2f252 100644
--- a/client/tests/kvm/unattended/RHEL-3-series.ks
+++ b/client/tests/kvm/unattended/RHEL-3-series.ks
@@ -28,6 +28,7 @@ skipx
 
 %post --interpreter /usr/bin/python
 import socket, os
+os.system('dhclient')
 os.system('chkconfig sshd on')
 os.system('iptables -F')
 port = 12323
diff --git a/client/tests/kvm/unattended/RHEL-4-series.ks 
b/client/tests/kvm/unattended/RHEL-4-series.ks
index 88dd2fd..9169b69 100644
--- a/client/tests/kvm/unattended/RHEL-4-series.ks
+++ b/client/tests/kvm/unattended/RHEL-4-series.ks
@@ -28,6 +28,7 @@ reboot
 
 %post --interpreter /usr/bin/python
 import socket, os
+os.system('dhclient')
 os.system('chkconfig sshd on')
 os.system('iptables -F')
 os.system('echo 0 > /selinux/enforce')
diff --git a/client/tests/kvm/unattended/RHEL-5-series.ks 
b/client/tests/kvm/unattended/RHEL-5-series.ks
index 3c9371f..7409259 100644
--- a/client/tests/kvm/unattended/RHEL-5-series.ks
+++ b/client/tests/kvm/unattended/RHEL-5-series.ks
@@ -27,6 +27,7 @@ reboot
 
 %post --interpreter /usr/bin/python
 import socket, os
+os.system('dhclient')
 os.system('chkconfig sshd on')
 os.system('iptables -F')
 os.system('echo 0 > /selinux/enforce')
-- 
1.6.6



Passthrough PCI video capture card

2010-02-04 Thread Ameya Pandit
Hi,

On my CentOS x86_64 machine I have KVM installed.
kvm-83-105.el5_4.13
kmod-kvm-83-105.el5_4.13

I have an Ubuntu VM and a Fedora 10 VM running on it.  On the host
OS I have added the video capture card shown below.

osprey eeprom: card=89 name=Osprey 210/220/230 serial=9201206

Please find the logs below for more information. I want to pass
this PCI card through to one of the VMs.

Any suggestions on how to do this?

Linux video capture interface: v2.00
kernel: bttv: driver version 0.9.16 loaded
kernel: bttv: using 8 buffers with 2080k (520 pages) each for capture
kernel: bttv: Bt8xx card found (0).
kernel: GSI 23 sharing vector 0x6A and IRQ 23
kernel: ACPI: PCI Interrupt :07:04.0[A] -> GSI 27 (level, low) -> IRQ 106
kernel: bttv0: Bt878 (rev 17) at :07:04.0, irq: 106, latency: 32, mmio: 0xb8a01000
kernel: bttv0: detected: Osprey-200 [card=88], PCI subsystem ID is 0070:ff01
kernel: bttv0: using: Osprey 200/250 [card=88,autodetected]
kernel: bttv0: osprey eeprom: card=89 name=Osprey 210/220/230 serial=9201206
kernel: bttv0: using tuner=-1
kernel: bttv0: i2c: checking for TDA9887 @ 0x86... not found
kernel: bttv0: registered device video0
kernel: bttv0: registered device vbi0
kernel: bttv0: PLL: 28636363 => 35468950 .. ok


-- 
Regards,

Ameya Pandit


Re: How to edit *.qcow2

2010-02-04 Thread satimis

Hi Liang,


Thanks for your advice and link.

Host - Debian 5.0
KVM
libvirt



run  kvm-nbd  VM/vm30.qcow2, modprobe nbd
nbd-client localhost 1024 /dev/nbd0



then you can use /dev/nbd0 as a block device like /dev/sda



I've written a short essay on how to install Debian on a kvm image, FYI:



http://blog.chinaunix.net/u/7667/showart_2112267.html



I can't find kvm-nbd and nbd-client.  I have nbd installed already.


$ yum list nbd
Loaded plugins: presto, refresh-packagekit
Available Packages
nbd.x86_64    2.9.13-1.fc12    fedora


$ which kvm-nbd
/usr/bin/which: no kvm-nbd in  
(/usr/lib64/qt-3.3/bin:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/satimis/bin)



$ which qemu-nbd
/usr/bin/qemu-nbd


$ which nbd-client
/usr/bin/which: no nbd-client in  
(/usr/lib64/qt-3.3/bin:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/satimis/bin)



I suppose qemu-nbd = kvm-nbd ?  Where is nbd-client ?

Any advice?  TIA


B.R.
Stephen




Re: Network shutdown under load

2010-02-04 Thread RW
Hi,

thanks for that! I have a lot of hosts still running
kernel 2.6.30 and KVM 88 without problems. It seems that
all qemu-kvm versions >= 0.11.0 have this problem, including
the latest 0.12.2. So if one of the enterprise distributions
chooses one of these versions for inclusion in their
enterprise products, customers will definitely get
problems. This is definitely a showstopper if you can't
do more than 30-50 MBit/s over some period of time.
I think kernel 2.6.32 will be chosen by a lot of distributions,
but the problem still exists there as far as I've read in the
mails here.

Regards,
Robert


Cedric Peltier wrote:
 Hi,

 We encountered a similar problem yesterday when upgrading a development
 server from Ubuntu 9.04 (kernel 2.6.28) to Ubuntu 9.10 (kernel 2.6.31).
 Going back to kernel 2.6.28 has been the solution for us so far.

 Regards,


 Cédric PELTIER, société Indigo





[PATCH 12/18] KVM: PPC: Make ext giveup non-static

2010-02-04 Thread Alexander Graf
We need to call the ext giveup handlers from code outside of book3s.c.
So let's make it non-static.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |1 +
 arch/powerpc/kvm/book3s.c |3 +--
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 8463976..fd43210 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -120,6 +120,7 @@ extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, 
int size, void *ptr, b
 extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int 
vec);
 extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat,
   bool upper, u32 val);
+extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
 
 extern u32 kvmppc_trampoline_lowmem;
 extern u32 kvmppc_trampoline_enter;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index e8dccc6..99e9e07 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -35,7 +35,6 @@
 /* #define EXIT_DEBUG_SIMPLE */
 /* #define DEBUG_EXT */
 
-static void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
 static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
 ulong msr);
 
@@ -597,7 +596,7 @@ static inline int get_fpr_index(int i)
 }
 
 /* Give up external provider (FPU, Altivec, VSX) */
-static void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr)
+void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr)
 {
struct thread_struct *t = &current->thread;
u64 *vcpu_fpr = vcpu->arch.fpr;
-- 
1.6.0.2

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 18/18] KVM: PPC: Implement Paired Single emulation

2010-02-04 Thread Alexander Graf
The one big thing about the Gekko is paired singles.

Paired singles are an extension to the instruction set that adds 32 single
precision floating point registers (qprs), some SPRs to modify the behavior
of paired single operations, and instructions to deal with qprs.

Unfortunately, it also changes semantics of existing operations that affect
single values in FPRs. In most cases they get mirrored to the corresponding
QPR.

Thanks to that we need to emulate all FPU operations and all the new paired
single operations too.

In order to achieve that, we take the guest's instruction, rip out the
parameters, put in our own and execute the very same instruction, but also
fix up the QPR values along the way.

That way we can execute paired single FPU operations without implementing a
soft fpu.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h|1 +
 arch/powerpc/kvm/Makefile|1 +
 arch/powerpc/kvm/book3s_64_emulate.c |3 +
 arch/powerpc/kvm/book3s_paired_singles.c | 1356 ++
 4 files changed, 1361 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_paired_singles.c

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index f74d1db..e32a749 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -121,6 +121,7 @@ extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu 
*vcpu, unsigned int vec)
 extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat,
   bool upper, u32 val);
 extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
+extern int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu 
*vcpu);
 
 extern u32 kvmppc_trampoline_lowmem;
 extern u32 kvmppc_trampoline_enter;
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index e575cfd..eba721e 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -41,6 +41,7 @@ kvm-objs-$(CONFIG_KVM_E500) := $(kvm-e500-objs)
 kvm-book3s_64-objs := \
$(common-objs-y) \
fpu.o \
+   book3s_paired_singles.o \
book3s.o \
book3s_64_emulate.o \
book3s_64_interrupts.o \
diff --git a/arch/powerpc/kvm/book3s_64_emulate.c 
b/arch/powerpc/kvm/book3s_64_emulate.c
index 1d1b952..c989214 100644
--- a/arch/powerpc/kvm/book3s_64_emulate.c
+++ b/arch/powerpc/kvm/book3s_64_emulate.c
@@ -200,6 +200,9 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
emulated = EMULATE_FAIL;
}
 
+   if (emulated == EMULATE_FAIL)
+   emulated = kvmppc_emulate_paired_single(run, vcpu);
+
return emulated;
 }
 
diff --git a/arch/powerpc/kvm/book3s_paired_singles.c 
b/arch/powerpc/kvm/book3s_paired_singles.c
new file mode 100644
index 000..cb258a3
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_paired_singles.c
@@ -0,0 +1,1356 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright Novell Inc 2010
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include <asm/kvm.h>
+#include <asm/kvm_ppc.h>
+#include <asm/disassemble.h>
+#include <asm/kvm_book3s.h>
+#include <asm/kvm_fpu.h>
+#include <asm/reg.h>
+#include <asm/cacheflush.h>
+#include <linux/vmalloc.h>
+
+/* #define DEBUG */
+
+#ifdef DEBUG
+#define dprintk printk
+#else
+#define dprintk(...) do { } while(0);
+#endif
+
+#define OP_LFS             48
+#define OP_LFSU            49
+#define OP_LFD             50
+#define OP_LFDU            51
+#define OP_STFS            52
+#define OP_STFSU           53
+#define OP_STFD            54
+#define OP_STFDU           55
+#define OP_PSQ_L           56
+#define OP_PSQ_LU          57
+#define OP_PSQ_ST          60
+#define OP_PSQ_STU         61
+
+#define OP_31_LFSX         535
+#define OP_31_LFSUX        567
+#define OP_31_LFDX         599
+#define OP_31_LFDUX        631
+#define OP_31_STFSX        663
+#define OP_31_STFSUX       695
+#define OP_31_STFX         727
+#define OP_31_STFUX        759
+#define OP_31_LWIZX        887
+#define OP_31_STFIWX       983
+
+#define OP_59_FADDS        21
+#define OP_59_FSUBS        20
+#define OP_59_FSQRTS       22

[PATCH 16/18] KVM: PPC: Enable program interrupt to do MMIO

2010-02-04 Thread Alexander Graf
When we get a program interrupt we usually don't expect it to perform an
MMIO operation. But why not? When we emulate paired singles, we can end
up loading or storing to an MMIO address - and the handling of those
happens in the program interrupt handler.

So let's teach the program interrupt handler how to deal with EMULATE_MMIO.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 99e9e07..f842d1d 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -840,6 +840,10 @@ program_interrupt:
kvmppc_core_queue_program(vcpu, flags);
r = RESUME_GUEST;
break;
+   case EMULATE_DO_MMIO:
+   run->exit_reason = KVM_EXIT_MMIO;
+   r = RESUME_HOST_NV;
+   break;
default:
BUG();
}
-- 
1.6.0.2



[PATCH 14/18] KVM: PPC: Fix error in BAT assignment

2010-02-04 Thread Alexander Graf
BATs didn't work. Well, they did, but only up to BAT3. As soon as we
came to BAT4 the offset calculation was screwed up and we ended up
overwriting BAT0-3.

Fortunately, Linux hasn't been using BAT4+. It's still a good
idea to write correct code though.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_64_emulate.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_emulate.c 
b/arch/powerpc/kvm/book3s_64_emulate.c
index a93aa47..1d1b952 100644
--- a/arch/powerpc/kvm/book3s_64_emulate.c
+++ b/arch/powerpc/kvm/book3s_64_emulate.c
@@ -233,13 +233,13 @@ static void kvmppc_write_bat(struct kvm_vcpu *vcpu, int 
sprn, u32 val)
bat = &vcpu_book3s->ibat[(sprn - SPRN_IBAT0U) / 2];
break;
case SPRN_IBAT4U ... SPRN_IBAT7L:
-   bat = &vcpu_book3s->ibat[(sprn - SPRN_IBAT4U) / 2];
+   bat = &vcpu_book3s->ibat[4 + ((sprn - SPRN_IBAT4U) / 2)];
break;
case SPRN_DBAT0U ... SPRN_DBAT3L:
bat = &vcpu_book3s->dbat[(sprn - SPRN_DBAT0U) / 2];
break;
case SPRN_DBAT4U ... SPRN_DBAT7L:
-   bat = &vcpu_book3s->dbat[(sprn - SPRN_DBAT4U) / 2];
+   bat = &vcpu_book3s->dbat[4 + ((sprn - SPRN_DBAT4U) / 2)];
break;
break;
default:
BUG();
-- 
1.6.0.2
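
As a rough illustration of the offset bug this patch describes, here is a small sketch of the index calculation before and after the fix. It is not the kernel code: the function names are made up, and the two SPR numbers are assumed values used only to make the example runnable. Each BAT has an upper and a lower SPR, which is why the offset is divided by two.

```python
# Assumed SPR numbers, for illustration only.
SPRN_IBAT0U = 0x210   # first of 8 consecutive SPRs: IBAT0U, IBAT0L, ... IBAT3L
SPRN_IBAT4U = 0x230   # first of 8 consecutive SPRs: IBAT4U, IBAT4L, ... IBAT7L


def ibat_index_buggy(sprn):
    """Index calculation before the fix: BAT4-7 alias BAT0-3."""
    if SPRN_IBAT0U <= sprn <= SPRN_IBAT0U + 7:
        return (sprn - SPRN_IBAT0U) // 2
    if SPRN_IBAT4U <= sprn <= SPRN_IBAT4U + 7:
        return (sprn - SPRN_IBAT4U) // 2
    raise ValueError("not an IBAT SPR: %#x" % sprn)


def ibat_index_fixed(sprn):
    """Index calculation after the fix: BAT4-7 land in slots 4-7."""
    if SPRN_IBAT0U <= sprn <= SPRN_IBAT0U + 7:
        return (sprn - SPRN_IBAT0U) // 2
    if SPRN_IBAT4U <= sprn <= SPRN_IBAT4U + 7:
        return 4 + (sprn - SPRN_IBAT4U) // 2
    raise ValueError("not an IBAT SPR: %#x" % sprn)


for sprn in range(SPRN_IBAT4U, SPRN_IBAT4U + 8):
    print("%#x: buggy=%d fixed=%d"
          % (sprn, ibat_index_buggy(sprn), ibat_index_fixed(sprn)))
```

With the old calculation, a write to the first SPR of BAT4 resolves to slot 0, which is the overwrite of BAT0-3 mentioned in the description.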



[PATCH 08/18] KVM: PPC: Preload FPU when possible

2010-02-04 Thread Alexander Graf
There are some situations when we're pretty sure the guest will use the
FPU soon. So we can save the churn of going into the guest, finding out
it does want to use the FPU and going out again.

This patch adds preloading of the FPU when it's reasonable.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s.c |8 
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 6bdf7f2..07f8b42 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -137,6 +137,10 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u64 msr)
kvmppc_mmu_flush_segments(vcpu);
kvmppc_mmu_map_segment(vcpu, vcpu->arch.pc);
}
+
+   /* Preload FPU if it's enabled */
+   if (vcpu->arch.msr & MSR_FP)
+   kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
 }
 
 void kvmppc_inject_interrupt(struct kvm_vcpu *vcpu, int vec, u64 flags)
@@ -1194,6 +1198,10 @@ int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
/* XXX we get called with irq disabled - change that! */
local_irq_enable();
 
+   /* Preload FPU if it's enabled */
+   if (vcpu->arch.msr & MSR_FP)
+   kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
+
ret = __kvmppc_vcpu_entry(kvm_run, vcpu);
 
local_irq_disable();
-- 
1.6.0.2



[PATCH 15/18] KVM: PPC: Add helpers to modify ppc fields

2010-02-04 Thread Alexander Graf
The PowerPC specification always lists bits from MSB to LSB. That is
really confusing when you're trying to write C code, because it fits
in pretty badly with the normal (1 << xx) schemes.

So I came up with some nice wrappers that allow getting and setting fields
in a u64 with bit numbers exactly as given in the spec. That makes the
code in KVM easier to compare with the spec.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_ppc.h |   33 +
 1 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 0761218..c7fcdd7 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -103,6 +103,39 @@ extern void kvmppc_booke_exit(void);
 
 extern void kvmppc_core_destroy_mmu(struct kvm_vcpu *vcpu);
 
+/*
+ * Cuts out inst bits with ordering according to spec.
+ * That means the leftmost bit is zero. All given bits are included.
+ */
+static inline u32 kvmppc_get_field(u64 inst, int msb, int lsb)
+{
+   u32 r;
+   u32 mask;
+
+   BUG_ON(msb > lsb);
+
+   mask = (1 << (lsb - msb + 1)) - 1;
+   r = (inst >> (63 - lsb)) & mask;
+
+   return r;
+}
+
+/*
+ * Replaces inst bits with ordering according to spec.
+ */
+static inline u32 kvmppc_set_field(u64 inst, int msb, int lsb, int value)
+{
+   u32 r;
+   u32 mask;
+
+   BUG_ON(msb > lsb);
+
+   mask = ((1 << (lsb - msb + 1)) - 1) << (63 - lsb);
+   r = (inst & ~mask) | ((value << (63 - lsb)) & mask);
+
+   return r;
+}
+
 #ifdef CONFIG_PPC_BOOK3S
 
 /* We assume we're always acting on the current vcpu */
-- 
1.6.0.2
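
For readers following along, here is a minimal standalone sketch of the
MSB-0 field numbering described in the patch above. The names
ppc_get_field, ppc_set_field and low_mask are hypothetical and not part
of the patch. The sketch also keeps every value in 64 bits, which is an
assumption made so that fields wider than 31 bits work; the patch itself
returns u32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: right-aligned mask of `width` bits (1..64).
 * Handles width == 64 separately to avoid an undefined 64-bit shift. */
static uint64_t low_mask(int width)
{
    return width >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << width) - 1;
}

/* Extract bits msb..lsb (inclusive) of a 64-bit value, where bit 0 is
 * the leftmost (most significant) bit, as in the PowerPC spec. */
static uint64_t ppc_get_field(uint64_t inst, int msb, int lsb)
{
    assert(0 <= msb && msb <= lsb && lsb <= 63);
    return (inst >> (63 - lsb)) & low_mask(lsb - msb + 1);
}

/* Return `inst` with bits msb..lsb replaced by the low bits of `value`. */
static uint64_t ppc_set_field(uint64_t inst, int msb, int lsb, uint64_t value)
{
    uint64_t mask;

    assert(0 <= msb && msb <= lsb && lsb <= 63);
    mask = low_mask(lsb - msb + 1) << (63 - lsb);
    return (inst & ~mask) | ((value << (63 - lsb)) & mask);
}
```

With a 32-bit instruction held in the low half of the u64, spec bit n of
the instruction is bit 32 + n here. For the instruction word 0xA8640000
(lha r3,0(r4)), ppc_get_field(0xA8640000, 32, 37) yields the primary
opcode 42 and ppc_get_field(0xA8640000, 38, 42) yields the target
register 3.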



[PATCH 03/18] KVM: PPC: Teach MMIO Signedness

2010-02-04 Thread Alexander Graf
The guest I was trying to get to run uses the LHA and LHAU instructions.
Those instructions basically do a load, but also sign extend the result.

Since we need to fill our registers by hand when doing MMIO, we also need
to sign extend manually.

This patch implements sign extended MMIO and the LHA(U) instructions.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |    1 +
 arch/powerpc/include/asm/kvm_ppc.h  |    3 +++
 arch/powerpc/kvm/emulate.c          |   14 ++++++++++++++
 arch/powerpc/kvm/powerpc.c          |   32 ++++++++++++++++++++++++++++++++
 4 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 2ed954e..4dd98fa 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -268,6 +268,7 @@ struct kvm_vcpu_arch {
 
     u8 io_gpr; /* GPR used as IO source/target */
     u8 mmio_is_bigendian;
+   u8 mmio_sign_extend;
     u8 dcr_needed;
     u8 dcr_is_write;
 
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index c011170..a288dd2 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -48,6 +48,9 @@ extern void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu);
 extern int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
   unsigned int rt, unsigned int bytes,
   int is_bigendian);
+extern int kvmppc_handle_loads(struct kvm_run *run, struct kvm_vcpu *vcpu,
+   unsigned int rt, unsigned int bytes,
+   int is_bigendian);
 extern int kvmppc_handle_store(struct kvm_run *run, struct kvm_vcpu *vcpu,
u64 val, unsigned int bytes, int is_bigendian);
 
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index b905623..ef2ff59 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -62,6 +62,8 @@
 #define OP_STBU 39
 #define OP_LHZ  40
 #define OP_LHZU 41
+#define OP_LHA  42
+#define OP_LHAU 43
 #define OP_STH  44
 #define OP_STHU 45
 
@@ -450,6 +452,18 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
         kvmppc_set_gpr(vcpu, ra, vcpu->arch.paddr_accessed);
         break;
 
+   case OP_LHA:
+       rt = get_rt(inst);
+       emulated = kvmppc_handle_loads(run, vcpu, rt, 2, 1);
+       break;
+
+   case OP_LHAU:
+       ra = get_ra(inst);
+       rt = get_rt(inst);
+       emulated = kvmppc_handle_loads(run, vcpu, rt, 2, 1);
+       kvmppc_set_gpr(vcpu, ra, vcpu->arch.paddr_accessed);
+       break;
+
     case OP_STH:
         rs = get_rs(inst);
         emulated = kvmppc_handle_store(run, vcpu,
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 98d5e6d..a235369 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -300,6 +300,25 @@ static void kvmppc_complete_mmio_load(struct kvm_vcpu *vcpu,
         }
     }
 
+   if (vcpu->arch.mmio_sign_extend) {
+       switch (run->mmio.len) {
+#ifdef CONFIG_PPC64
+       case 4:
+           if (gpr & 0x80000000)
+               gpr |= 0xffffffff00000000ULL;
+           break;
+#endif
+       case 2:
+           if (gpr & 0x8000)
+               gpr |= 0xffffffffffff0000ULL;
+           break;
+       case 1:
+           if (gpr & 0x80)
+               gpr |= 0xffffffffffffff00ULL;
+           break;
+       }
+   }
+
     kvmppc_set_gpr(vcpu, vcpu->arch.io_gpr, gpr);
 
     switch (vcpu->arch.io_gpr & REG_EXT_MASK) {
@@ -337,10 +356,23 @@ int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
     vcpu->arch.mmio_is_bigendian = is_bigendian;
     vcpu->mmio_needed = 1;
     vcpu->mmio_is_write = 0;
+   vcpu->arch.mmio_sign_extend = 0;
 
     return EMULATE_DO_MMIO;
 }
 
+/* Same as above, but sign extends */
+int kvmppc_handle_loads(struct kvm_run *run, struct kvm_vcpu *vcpu,
+                        unsigned int rt, unsigned int bytes, int is_bigendian)
+{
+   int r;
+
+   r = kvmppc_handle_load(run, vcpu, rt, bytes, is_bigendian);
+   vcpu->arch.mmio_sign_extend = 1;
+
+   return r;
+}
+
 int kvmppc_handle_store(struct kvm_run *run, struct kvm_vcpu *vcpu,
 u64 val, unsigned int bytes, int is_bigendian)
 {
-- 
1.6.0.2
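
The sign extension step that this patch adds to
kvmppc_complete_mmio_load() can be tried outside the kernel with a small
sketch. The function name mmio_sign_extend is hypothetical, and the
sketch assumes a 64-bit register value. It derives the sign bit from the
access length instead of spelling out one mask per case, so it is an
illustration of the idea and not the code from the patch.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend an MMIO load result of `len` bytes
 * (1, 2 or 4) to 64 bits. Other lengths are returned unchanged. */
static uint64_t mmio_sign_extend(uint64_t gpr, unsigned int len)
{
    uint64_t sign_bit;

    if (len != 1 && len != 2 && len != 4)
        return gpr;

    /* Highest bit of the loaded value: 0x80, 0x8000 or 0x80000000. */
    sign_bit = UINT64_C(1) << (len * 8 - 1);

    /* If the loaded value is negative, set every bit above its width. */
    if (gpr & sign_bit)
        gpr |= ~((sign_bit << 1) - 1);

    return gpr;
}
```

A 2-byte load that returns 0xFFFE therefore ends up in the register as
0xFFFFFFFFFFFFFFFE, i.e. -2, which is what LHA is expected to produce,
while 0x7FFF is left untouched.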



[PATCH 06/18] KVM: PPC: Add Gekko SPRs

2010-02-04 Thread Alexander Graf
The Gekko has some SPR values that differ from other PPC core values and
also some additional ones.

Let's add support for them in our mfspr/mtspr emulator.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |1 +
 arch/powerpc/include/asm/reg.h|   10 +
 arch/powerpc/kvm/book3s_64_emulate.c  |   70 +
 3 files changed, 81 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index db7db0a..d28ee83 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -82,6 +82,7 @@ struct kvmppc_vcpu_book3s {
     struct kvmppc_bat ibat[8];
     struct kvmppc_bat dbat[8];
     u64 hid[6];
+   u64 gqr[8];
     int slb_nr;
     u64 sdr1;
     u64 dsisr;
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 5572e86..8a69a39 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -293,10 +293,12 @@
 #define HID1_ABE   (1<<10) /* 7450 Address Broadcast Enable */
 #define HID1_PS    (1<<16) /* 750FX PLL selection */
 #define SPRN_HID2  0x3F8   /* Hardware Implementation Register 2 */
+#define SPRN_HID2_GEKKO 0x398   /* Gekko HID2 Register */
 #define SPRN_IABR  0x3F2   /* Instruction Address Breakpoint Register */
 #define SPRN_IABR2 0x3FA   /* 83xx */
 #define SPRN_IBCR  0x135   /* 83xx Insn Breakpoint Control Reg */
 #define SPRN_HID4  0x3F4   /* 970 HID4 */
+#define SPRN_HID4_GEKKO 0x3F3   /* Gekko HID4 */
 #define SPRN_HID5  0x3F6   /* 970 HID5 */
 #define SPRN_HID6  0x3F9   /* BE HID 6 */
 #define   HID6_LB  (0x0F<<12) /* Concurrent Large Page Modes */
@@ -465,6 +467,14 @@
 #define SPRN_VRSAVE 0x100   /* Vector Register Save Register */
 #define SPRN_XER   0x001   /* Fixed Point Exception Register */
 
+#define SPRN_MMCR0_GEKKO 0x3B8 /* Gekko Monitor Mode Control Register 0 */
+#define SPRN_MMCR1_GEKKO 0x3BC /* Gekko Monitor Mode Control Register 1 */
+#define SPRN_PMC1_GEKKO  0x3B9 /* Gekko Performance Monitor Control 1 */
+#define SPRN_PMC2_GEKKO  0x3BA /* Gekko Performance Monitor Control 2 */
+#define SPRN_PMC3_GEKKO  0x3BD /* Gekko Performance Monitor Control 3 */
+#define SPRN_PMC4_GEKKO  0x3BE /* Gekko Performance Monitor Control 4 */
+#define SPRN_WPAR_GEKKO  0x399 /* Gekko Write Pipe Address Register */
+
 #define SPRN_SCOMC 0x114   /* SCOM Access Control */
 #define SPRN_SCOMD 0x115   /* SCOM Access DATA */
 
diff --git a/arch/powerpc/kvm/book3s_64_emulate.c b/arch/powerpc/kvm/book3s_64_emulate.c
index 2b0ee7e..bb4a7c1 100644
--- a/arch/powerpc/kvm/book3s_64_emulate.c
+++ b/arch/powerpc/kvm/book3s_64_emulate.c
@@ -42,6 +42,15 @@
 /* DCBZ is actually 1014, but we patch it to 1010 so we get a trap */
 #define OP_31_XOP_DCBZ 1010
 
+#define SPRN_GQR0  912
+#define SPRN_GQR1  913
+#define SPRN_GQR2  914
+#define SPRN_GQR3  915
+#define SPRN_GQR4  916
+#define SPRN_GQR5  917
+#define SPRN_GQR6  918
+#define SPRN_GQR7  919
+
 int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
unsigned int inst, int *advance)
 {
@@ -268,7 +277,29 @@ int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, int rs)
     case SPRN_HID2:
         to_book3s(vcpu)->hid[2] = spr_val;
         break;
+   case SPRN_HID2_GEKKO:
+       to_book3s(vcpu)->hid[2] = spr_val;
+       /* HID2.PSE controls paired single on gekko */
+       switch (vcpu->arch.pvr) {
+       case 0x00080200:    /* lonestar 2.0 */
+       case 0x00088202:    /* lonestar 2.2 */
+       case 0x70000100:    /* gekko 1.0 */
+       case 0x00080100:    /* gekko 2.0 */
+       case 0x00083203:    /* gekko 2.3a */
+       case 0x00083213:    /* gekko 2.3b */
+       case 0x00083204:    /* gekko 2.4 */
+       case 0x00083214:    /* gekko 2.4e (8SE) - retail HW2 */
+           if (spr_val & (1 << 29)) { /* HID2.PSE */
+               vcpu->arch.hflags |= BOOK3S_HFLAG_PAIRED_SINGLE;
+               kvmppc_giveup_ext(vcpu, MSR_FP);
+           } else {
+               vcpu->arch.hflags &= ~BOOK3S_HFLAG_PAIRED_SINGLE;
+           }
+           break;
+       }
+       break;
     case SPRN_HID4:
+   case SPRN_HID4_GEKKO:
         to_book3s(vcpu)->hid[4] = spr_val;
         break;
     case SPRN_HID5:
@@ -278,12 +309,30 @@ int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, int rs)
             (mfmsr() & MSR_HV))