RE: Do I set up separate bridges for each guest?

2009-10-21 Thread Thomas Besser
Neil Aggarwal wrote:
> Dor:
> The simplest thing is to use a single bridge for all -
> The physical nic should be part of it and supply the outside world
> connection. The physical nic doesn't need an IP and the bridge should
> own it. All vms can use this bridge.
>
> I want to assign a static IP to each of the guests,
> how would I do that with a single bridge?

What's the problem? Define the static IP in your guests and it should work.

Regards
Thomas

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Make sure get_user_desc() doesn't sign extend.

2009-10-21 Thread Chris Lalancette
The current implementation of get_user_desc() sign extends
the return value because of integer promotion rules.  For
the most part, this doesn't matter, because the top bit of
base2 is usually 0.  If, however, that bit is 1, then the
entire value will be 0x... which is probably not what
the caller intended.  This patch casts the entire thing
to unsigned before returning, which generates almost the
same assembly as the current code but replaces the final
cltq (sign extend) with a mov %eax %eax (zero-extend).
This fixes booting certain guests under KVM.

Signed-off-by: Chris Lalancette clala...@redhat.com
---
 arch/x86/include/asm/desc.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index e8de2f6..617bd56 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -288,7 +288,7 @@ static inline void load_LDT(mm_context_t *pc)
 
 static inline unsigned long get_desc_base(const struct desc_struct *desc)
 {
-   return desc->base0 | ((desc->base1) << 16) | ((desc->base2) << 24);
+   return (unsigned)(desc->base0 | ((desc->base1) << 16) | ((desc->base2) << 24));
 }
 
 static inline void set_desc_base(struct desc_struct *desc, unsigned long base)
-- 
1.6.0.6



Re: [ANNOUNCE] Sheepdog: Distributed Storage System for KVM

2009-10-21 Thread Nikolai K. Bochev
Hello,

I am getting the following error trying to compile sheepdog on Ubuntu 9.10
(2.6.31-14 x64):

cd shepherd; make
make[1]: Entering directory `/home/shiny/Packages/sheepdog-2009102101/shepherd'
cc -c -g -O2 -Wall -Wstrict-prototypes -I../include -D_GNU_SOURCE shepherd.c -o shepherd.o
shepherd.c: In function ‘main’:
shepherd.c:300: warning: dereferencing pointer ‘hdr.55’ does break strict-aliasing rules
shepherd.c:300: note: initialized from here
cc -c -g -O2 -Wall -Wstrict-prototypes -I../include -D_GNU_SOURCE treeview.c -o treeview.o
cc -c -g -O2 -Wall -Wstrict-prototypes -I../include -D_GNU_SOURCE ../lib/event.c -o ../lib/event.o
cc -c -g -O2 -Wall -Wstrict-prototypes -I../include -D_GNU_SOURCE ../lib/net.c -o ../lib/net.o
../lib/net.c: In function ‘write_object’:
../lib/net.c:358: warning: ‘vosts’ may be used uninitialized in this function
cc -c -g -O2 -Wall -Wstrict-prototypes -I../include -D_GNU_SOURCE ../lib/logger.c -o ../lib/logger.o
cc shepherd.o treeview.o ../lib/event.o ../lib/net.o ../lib/logger.o -o shepherd -lncurses -lcrypto
make[1]: Leaving directory `/home/shiny/Packages/sheepdog-2009102101/shepherd'
cd sheep; make
make[1]: Entering directory `/home/shiny/Packages/sheepdog-2009102101/sheep'
cc -c -g -O2 -Wall -Wstrict-prototypes -I../include -D_GNU_SOURCE sheep.c -o sheep.o
cc -c -g -O2 -Wall -Wstrict-prototypes -I../include -D_GNU_SOURCE store.c -o store.o
cc -c -g -O2 -Wall -Wstrict-prototypes -I../include -D_GNU_SOURCE net.c -o net.o
cc -c -g -O2 -Wall -Wstrict-prototypes -I../include -D_GNU_SOURCE work.c -o work.o
In file included from /usr/include/asm/fcntl.h:1,
                 from /usr/include/linux/fcntl.h:4,
                 from /usr/include/linux/signalfd.h:13,
                 from work.c:31:
/usr/include/asm-generic/fcntl.h:117: error: redefinition of ‘struct flock’
/usr/include/asm-generic/fcntl.h:140: error: redefinition of ‘struct flock64’
make[1]: *** [work.o] Error 1
make[1]: Leaving directory `/home/shiny/Packages/sheepdog-2009102101/sheep'
make: *** [all] Error 2

I have all the required libs installed. Patching and compiling qemu-kvm went
flawlessly.

- Original Message -
From: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp
To: kvm@vger.kernel.org, qemu-de...@nongnu.org, linux-fsde...@vger.kernel.org
Sent: Wednesday, October 21, 2009 8:13:47 AM
Subject: [ANNOUNCE] Sheepdog: Distributed Storage System for KVM

Hi everyone,

Sheepdog is a distributed storage system for KVM/QEMU. It provides
highly available block level storage volumes to VMs like Amazon EBS.
Sheepdog supports advanced volume management features such as snapshot,
cloning, and thin provisioning. Sheepdog runs on several tens or hundreds
of nodes, and the architecture is fully symmetric; there is no central
node such as a meta-data server.

The following list describes the features of Sheepdog.

 * Linear scalability in performance and capacity
 * No single point of failure
 * Redundant architecture (data is written to multiple nodes)
   - Tolerance against network failure
 * Zero configuration (newly added machines will join the cluster automatically)
   - Autonomous load balancing
 * Snapshot
   - Online snapshot from qemu-monitor
 * Clone from a snapshot volume
 * Thin provisioning
   - Amazon EBS API support (to use from a Eucalyptus instance)

(* = current features, - = on our todo list)

More details and download links are here:

http://www.osrg.net/sheepdog/

Note that the code is still in an early stage.
There are some critical TODO items:

 - VM image deletion support
 - Support architectures other than X86_64
 - Data recovery
 - Free space management
 - Guarantee reliability and availability under heavy load
 - Performance improvement
 - Reclaim unused blocks
 - More documentation

We hope to find people interested in working together.
Enjoy!


Here are examples:

- create images

$ kvm-img create -f sheepdog "Alice's Disk" 256G
$ kvm-img create -f sheepdog "Bob's Disk" 256G

- list images

$ shepherd info -t vdi
4 : Alice's Disk  256 GB (allocated: 0 MB, shared: 0 MB), 2009-10-15 16:17:18, tag:0, current
8 : Bob's Disk    256 GB (allocated: 0 MB, shared: 0 MB), 2009-10-15 16:29:20, tag:0, current

- start up a virtual machine

$ kvm --drive format=sheepdog,file="Alice's Disk"

- create a snapshot

$ kvm-img snapshot -c name sheepdog:"Alice's Disk"

- clone from a snapshot

$ kvm-img create -b sheepdog:"Alice's Disk":0 -f sheepdog "Charlie's Disk"


Thanks.

-- 
MORITA, Kazutaka

NTT Cyber Space Labs
OSS Computing Project
Kernel Group
E-mail: morita.kazut...@lab.ntt.co.jp


RE: [ANNOUNCE] Sheepdog: Distributed Storage System for KVM

2009-10-21 Thread Dietmar Maurer
Quite interesting. But would it be possible to use corosync for the cluster
communication? The point is that we need corosync anyway for pacemaker, it is
written in C (high performance), and it seems to implement the feature you need?

> -----Original Message-----
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On
> Behalf Of MORITA Kazutaka
> Sent: Wednesday, 21 October 2009 07:14
> To: kvm@vger.kernel.org; qemu-de...@nongnu.org; linux-fsde...@vger.kernel.org
> Subject: [ANNOUNCE] Sheepdog: Distributed Storage System for KVM
>
> Hi everyone,
>
> Sheepdog is a distributed storage system for KVM/QEMU. It provides
> highly available block level storage volumes to VMs like Amazon EBS.
> Sheepdog supports advanced volume management features such as snapshot,
> cloning, and thin provisioning. Sheepdog runs on several tens or hundreds
> of nodes, and the architecture is fully symmetric; there is no central
> node such as a meta-data server.



Re: [PATCH] Make sure get_user_desc() doesn't sign extend.

2009-10-21 Thread Paolo Bonzini

On 10/21/2009 09:40 AM, Chris Lalancette wrote:

> The current implementation of get_user_desc() sign extends
> the return value because of integer promotion rules.  For
> the most part, this doesn't matter, because the top bit of
> base2 is usually 0.  If, however, that bit is 1, then the
> entire value will be 0x... which is probably not what
> the caller intended.  This patch casts the entire thing
> to unsigned before returning, which generates almost the
> same assembly as the current code but replaces the final
> cltq (sign extend) with a mov %eax %eax (zero-extend).
> This fixes booting certain guests under KVM.


For the record, the reason why this wasn't noticed so far is that 
get_user_desc will be zero outside KVM except if used for FS and GS. 
KVM with the right guest will easily see a 0xC000 segment base, but 
you would need TLS data allocated above 2 GB to see the bug outside KVM. 
 TLS data is in the same mmap-ed memory that hosts the thread stacks, 
so it will typically be below the 2 GB mark and have its most 
significant bit cleared.


I suppose you could see the bug if you used pthread_attr_setstack, plus 
of course all the right circumstances---which are rare because all but 
the most obscure users anyway cast the result to u32.
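
As a rough illustration of the sign extension discussed in this thread, the
following Python sketch models only the arithmetic; it is not kernel code.
The helper names (desc_base_32, sign_extend_32_to_64, zero_extend_32_to_64)
are made up for this example, and the field widths (16-bit base0, 8-bit base1,
8-bit base2) are an assumption read off the shifts in the patched expression.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def desc_base_32(base0, base1, base2):
    """Combine the three base fields into one 32-bit value."""
    return (base0 | (base1 << 16) | (base2 << 24)) & MASK32


def sign_extend_32_to_64(value):
    """Model a cltq-style widening: bit 31 is copied into bits 32..63."""
    value &= MASK32
    if value & 0x80000000:
        return value | (MASK64 ^ MASK32)
    return value


def zero_extend_32_to_64(value):
    """Model a mov %eax,%eax-style widening: bits 32..63 become zero."""
    return value & MASK32


# A base below 2 GB (top bit of base2 clear) and one above it (top bit set).
low_base = desc_base_32(0x0000, 0x00, 0x40)
high_base = desc_base_32(0x0000, 0x00, 0xC0)

for base in (low_base, high_base):
    print("%#010x  sign-extended: %#018x  zero-extended: %#018x" % (
        base, sign_extend_32_to_64(base), zero_extend_32_to_64(base)))
```

In this model the two widenings agree for the base below 2 GB and differ only
when bit 31 is set, which matches the observation that the bug stays hidden
for typical segment bases.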


Paolo


Re: [Autotest] [PATCH] Test 802.1Q vlan of nic

2009-10-21 Thread Amos Kong
On Mon, Oct 19, 2009 at 10:22:21AM +0200, Dor Laor wrote:
> On 10/15/2009 11:48 AM, Amos Kong wrote:
>
> Test 802.1Q vlan of nic, config it by vconfig command.
>   1) Create two VMs
>   2) Setup guests in different vlan by vconfig and test communication by ping
>      using hard-coded ip address
>   3) Setup guests in same vlan and test communication by ping
>   4) Recover the vlan config
>
> Signed-off-by: Amos Kong <ak...@redhat.com>
> ---
>   client/tests/kvm/kvm_tests.cfg.sample |    6 +++
>   client/tests/kvm/tests/vlan_tag.py    |   73 +
>   2 files changed, 79 insertions(+), 0 deletions(-)
>   mode change 100644 => 100755 client/tests/kvm/scripts/qemu-ifup
>
> In general the above should come as an independent patch.

yes.

>   create mode 100644 client/tests/kvm/tests/vlan_tag.py
>
> diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample
> index 9ccc9b5..4e47767 100644
> --- a/client/tests/kvm/kvm_tests.cfg.sample
> +++ b/client/tests/kvm/kvm_tests.cfg.sample
> @@ -166,6 +166,12 @@ variants:
>           used_cpus = 5
>           used_mem = 2560
>
> +    - vlan_tag:  install setup
> +        type = vlan_tag
> +        subnet2 = 192.168.123
> +        vlans = 10 20
>
> If we want to be fanatic and safe we should dynamically choose subnet
> and vlans numbers that are not used on the host instead of hard code it.
>
> +        nic_mode = tap
> +        nic_model = e1000
>
> Why only e1000? Let's test virtio and rtl8139 as well. Can't you inherit
> the nic model from the config?

Yes, we should remove 'nic_model = e1000' so that all kinds of NIC are tested.

There seems to be a KVM bug
(https://bugzilla.redhat.com/show_bug.cgi?id=516587)
that was found by this test case.

>       - autoit:   install setup
>           type = autoit
> diff --git a/client/tests/kvm/scripts/qemu-ifup b/client/tests/kvm/scripts/qemu-ifup
> old mode 100644
> new mode 100755
> diff --git a/client/tests/kvm/tests/vlan_tag.py b/client/tests/kvm/tests/vlan_tag.py
> new file mode 100644
> index 000..15e763f
> --- /dev/null
> +++ b/client/tests/kvm/tests/vlan_tag.py
> @@ -0,0 +1,73 @@
> +import logging, time
> +from autotest_lib.client.common_lib import error
> +import kvm_subprocess, kvm_test_utils, kvm_utils
> +
> +def run_vlan_tag(test, params, env):
> +    """
> +    Test 802.1Q vlan of nic, config it by vconfig command.
> +
> +    1) Create two VMs
> +    2) Setup guests in different vlan by vconfig and test communication by ping
> +       using hard-coded ip address
> +    3) Setup guests in same vlan and test communication by ping
> +    4) Recover the vlan config
> +
> +    @param test: Kvm test object
> +    @param params: Dictionary with the test parameters.
> +    @param env: Dictionary with test environment.
> +    """
> +
> +    vm = []
> +    session = []
> +    subnet2 = params.get("subnet2")
> +    vlans = params.get("vlans").split()
> +
> +    vm.append(kvm_test_utils.get_living_vm(env, "%s" % params.get("main_vm")))
> +
> +    params_vm2 = params.copy()
> +    params_vm2['image_snapshot'] = "yes"
> +    params_vm2['kill_vm_gracefully'] = "no"
> +    params_vm2["address_index"] = int(params.get("address_index", 0))+1
> +    vm.append(vm[0].clone("vm2", params_vm2))
> +    kvm_utils.env_register_vm(env, "vm2", vm[1])
> +    if not vm[1].create():
> +        raise error.TestError("VM 1 create faild")
>
>
> The whole 7-8 lines above should be grouped as a function to clone
> existing VM. It should be part of kvm autotest infrastructure.
>
> Besides that, it looks good.
>
> +
> +    for i in range(2):
> +        session.append(kvm_test_utils.wait_for_login(vm[i]))
> +
> +    try:
> +        vconfig_cmd = "vconfig add eth0 %s;ifconfig eth0.%s %s.%s"
> +        # Attempt to configure IPs for the VMs and record the results in
> +        # boolean variables
> +        # Make vm1 and vm2 in the different vlan
> +
> +        ip_config_vm1_ok = (session[0].get_command_status(vconfig_cmd
> +                            % (vlans[0], vlans[0], subnet2, 11)) == 0)
> +        ip_config_vm2_ok = (session[1].get_command_status(vconfig_cmd
> +                            % (vlans[1], vlans[1], subnet2, 12)) == 0)
> +        if not ip_config_vm1_ok or not ip_config_vm2_ok:
> +            raise error.TestError, "Fail to config VMs ip address"
> +        ping_diff_vlan_ok = (session[0].get_command_status(
> +                             "ping -c 2 %s.12" % subnet2) == 0)
> +
> +        if ping_diff_vlan_ok:
> +            raise error.TestFail("VM 2 is unexpectedly pingable in different "
> +                                 "vlan")
> +        # Make vm2 in the same vlan with vm1
> +        vlan_config_vm2_ok = (session[1].get_command_status(
> +                              "vconfig rem eth0.%s;vconfig add eth0 %s;"
> +                              "ifconfig eth0.%s %s.12" %
> +                              (vlans[1], vlans[0], vlans[0], subnet2)) == 0)
> +        if not vlan_config_vm2_ok:
> +            raise error.TestError, "Fail to config ip address of VM 2"
> +
> +
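
The guest commands built by the quoted test can be looked at in isolation.
The sketch below is only an illustration and not part of the patch:
build_vlan_setup_cmd is a hypothetical helper, the format string is the
vconfig_cmd from the quoted code, and the parameter values are the ones from
the quoted kvm_tests.cfg.sample hunk. It only formats strings; it does not
run vconfig or ifconfig.

```python
# Format string as used by the quoted test: add a VLAN interface on eth0,
# then give that interface an address of the form <subnet>.<host part>.
VCONFIG_CMD = "vconfig add eth0 %s;ifconfig eth0.%s %s.%s"


def build_vlan_setup_cmd(vlan_id, subnet, host_part):
    """Return the shell command that puts a guest into the given VLAN."""
    return VCONFIG_CMD % (vlan_id, vlan_id, subnet, host_part)


# Values taken from the quoted config hunk (all params are plain strings).
params = {"subnet2": "192.168.123", "vlans": "10 20"}
subnet2 = params.get("subnet2")
vlans = params.get("vlans").split()

# First phase of the test: each guest gets its own VLAN.
cmd_vm1 = build_vlan_setup_cmd(vlans[0], subnet2, 11)
cmd_vm2 = build_vlan_setup_cmd(vlans[1], subnet2, 12)

print(cmd_vm1)
print(cmd_vm2)
```

Because both the VLAN ids and the subnet come from params, choosing unused
values dynamically, as suggested in the review, would only change how the
params dictionary is filled.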

Re: [Autotest] [PATCH] Test 802.1Q vlan of nic

2009-10-21 Thread Amos Kong
On Tue, Oct 20, 2009 at 09:19:50AM -0400, Michael Goldish wrote:
 See comments below.

Hi all,
Thanks for your reply.
 
 - Dor Laor dl...@redhat.com wrote:
 
  On 10/15/2009 11:48 AM, Amos Kong wrote:
  
   Test 802.1Q vlan of nic, config it by vconfig command.
  1) Create two VMs
  2) Setup guests in different vlan by vconfig and test
  communication by ping
 using hard-coded ip address
  3) Setup guests in same vlan and test communication by ping
  4) Recover the vlan config
  
   Signed-off-by: Amos Kongak...@redhat.com
   ---
 client/tests/kvm/kvm_tests.cfg.sample |6 +++
 client/tests/kvm/tests/vlan_tag.py|   73
  +
 2 files changed, 79 insertions(+), 0 deletions(-)
 mode change 100644 =  100755 client/tests/kvm/scripts/qemu-ifup
  
  In general the above should come as an independent patch.
  
 create mode 100644 client/tests/kvm/tests/vlan_tag.py
  
   diff --git a/client/tests/kvm/kvm_tests.cfg.sample
  b/client/tests/kvm/kvm_tests.cfg.sample
   index 9ccc9b5..4e47767 100644
   --- a/client/tests/kvm/kvm_tests.cfg.sample
   +++ b/client/tests/kvm/kvm_tests.cfg.sample
   @@ -166,6 +166,12 @@ variants:
 used_cpus = 5
 used_mem = 2560
  
   +- vlan_tag:  install setup
   +type = vlan_tag
   +subnet2 = 192.168.123
   +vlans = 10 20
  
  If we want to be fanatic and safe we should dynamically choose subnet
  and vlans numbers that are not used on the host instead of hard code
  it.
 
 For the sake of safety maybe we should start both VMs with -snapshot.
 Dor, what do you think?  Is it safe to start 2 VMs with the same disk image
 when only one of them uses -snapshot?

Starting only the second VM with -snapshot is enough. The image is then
read/write for the first VM only.
 
   +nic_mode = tap
   +nic_model = e1000
  
  Why only e1000? Let's test virtio and rtl8139 as well. Can't you
  inherit the nic model from the config?
 
 It's not just inherited, it's overwritten, because nic_model is defined
 later in the file in a variants block.  So this nic_model line has no
 effect.

No, this line does take effect. If we keep this line, this case tests only
e1000, not the default 8139.
 
  
 - autoit:   install setup
 type = autoit
   diff --git a/client/tests/kvm/scripts/qemu-ifup
  b/client/tests/kvm/scripts/qemu-ifup
   old mode 100644
   new mode 100755
   diff --git a/client/tests/kvm/tests/vlan_tag.py
  b/client/tests/kvm/tests/vlan_tag.py
   new file mode 100644
   index 000..15e763f
   --- /dev/null
   +++ b/client/tests/kvm/tests/vlan_tag.py
   @@ -0,0 +1,73 @@
   +import logging, time
   +from autotest_lib.client.common_lib import error
   +import kvm_subprocess, kvm_test_utils, kvm_utils
   +
   +def run_vlan_tag(test, params, env):
   +
   +Test 802.1Q vlan of nic, config it by vconfig command.
   +
   +1) Create two VMs
   +2) Setup guests in different vlan by vconfig and test
  communication by ping
   +   using hard-coded ip address
   +3) Setup guests in same vlan and test communication by ping
   +4) Recover the vlan config
   +
   +@param test: Kvm test object
   +@param params: Dictionary with the test parameters.
   +@param env: Dictionary with test environment.
   +
   +
   +vm = []
   +session = []
   +subnet2 = params.get(subnet2)
   +vlans = params.get(vlans).split()
   +
   +vm.append(kvm_test_utils.get_living_vm(env, %s % 
   params.get(main_vm)))
 
> There's no need for the "%s" here.
> ...get_living_vm(env, params.get("main_vm"))) should work.
 
   +params_vm2 = params.copy()
   +params_vm2['image_snapshot'] = yes
   +params_vm2['kill_vm_gracefully'] = no
   +params_vm2[address_index] = int(params.get(address_index, 0))+1
   +vm.append(vm[0].clone(vm2, params_vm2))
   +kvm_utils.env_register_vm(env, vm2, vm[1])
   +if not vm[1].create():
   +raise error.TestError(VM 1 create faild)
  
  
  The whole 7-8 lines above should be grouped as a function to clone 
  existing VM. It should be part of kvm autotest infrastructure.
  Besides that, it looks good.
 
> There's already a clone function and it's being used here.
>
> Instead of those 7-8 lines, why not just define the VM in the config file?
> It looks like you're always using 2 VMs so there's no reason to do this in
> test code.  This should do what you want:
>
> - vlan_tag:  install setup
>     type = vlan_tag
>     subnet2 = 192.168.123
>     vlans = 10 20
>     nic_mode = tap
>     vms += " vm2"
>     extra_params_vm2 += " -snapshot"
>     kill_vm_gracefully_vm2 = no
>     address_index_vm2 = 1
>
> The preprocessor then automatically creates vm2 and registers it in env.
> To use it in the test just do:
>
> vm.append(kvm_test_utils.get_living_vm(env, "vm2"))
>
> You can also use a parameter that tells the test which VM to use if you don't
> want the name vm2 hardcoded into the test.
> Add something like this

RE: [PATCH 2/2] kvm-kmod: Document the build process

2009-10-21 Thread Dietmar Maurer
> +  Before the kvm module can be built, the linux submodule must be initialised
> +  and populated. The required sequence of commands is
> +
> +  git submodule init
> +  git submodule update
> +  ./configure
> +  make sync
> +  make
> +
> +  Notice that you can also specify an existing Linux tree for the
> +  synchronisation stage by using
> +
> +  make sync LINUX=/path/to/tree

I always get errors when I try to sync (with our Linux-2.6.24 tree):

./configure --kerneldir=${TOP}/linux-2.6.24-openvz
make sync LINUX=${TOP}/linux-2.6.24-openvz

make[1]: Entering directory `/a/dir/kvm-kmod-2.6.30.1'
./sync kvm-kmod-2.6.30.1
Traceback (most recent call last):
  File "./sync", line 210, in <module>
    header_sync(arch)
  File "./sync", line 181, in header_sync
    hack(T, 'x86', 'include/linux/kvm.h')
  File "./sync", line 127, in hack
    _hack(T + '/' + file, arch)
  File "./sync", line 118, in _hack
    data = file(fname).read()
IOError: [Errno 2] No such file or directory: 'header/include/linux/kvm.h'
make[1]: *** [sync] Error 1

Any idea what's wrong?
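
One way to narrow this down, sketched under assumptions: the traceback shows
the sync script failing to open 'header/include/linux/kvm.h', so it may help
to check whether the corresponding header exists in the tree passed via
LINUX=. The helper below is hypothetical and not part of kvm-kmod, and the
idea that the header/ copy is derived from the tree's include/linux/kvm.h is
an assumption, not something the traceback confirms.

```python
import os


def missing_headers(tree, relative_paths):
    """Return the entries of relative_paths that are not files under tree."""
    return [path for path in relative_paths
            if not os.path.isfile(os.path.join(tree, path))]


if __name__ == "__main__":
    # Placeholder inputs: take the tree from the LINUX environment variable
    # (defaulting to the current directory) and look for the header named
    # in the traceback.
    tree = os.environ.get("LINUX", ".")
    wanted = ["include/linux/kvm.h"]
    for path in missing_headers(tree, wanted):
        print("missing: %s" % os.path.join(tree, path))
```

If the header is reported missing, the tree itself lacks the file; if it is
present, the problem more likely lies in how the sync step copies headers.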

- Dietmar



[no subject]

2009-10-21 Thread Junaid Arshad
subscribe kvm


Re: kvm88 compile errors with 2.6.31.1

2009-10-21 Thread Jorge Lucángeli Obes
On Wed, Oct 21, 2009 at 9:08 AM, Benjamin Budts mail...@gigaspeeds.be wrote:

> Hi,
>
> a one-off question, when I compile kvm88 with kernel 2.6.31.1 I get the
> following make errors :
>
> In file included from /tmp/tgz/kvm-88/kvm/kernel/x86/mmutrace.h:220,
>                  from /tmp/tgz/kvm-88/kvm/kernel/x86/mmu.c:184:
> include/trace/define_trace.h:53:43: error: ./mmutrace.h: No such file or directory
> make[4]: *** [/tmp/tgz/kvm-88/kvm/kernel/x86/mmu.o] Error 1
> make[4]: *** Waiting for unfinished jobs
> In file included from /tmp/tgz/kvm-88/kvm/kernel/x86/trace.h:355,
>                  from /tmp/tgz/kvm-88/kvm/kernel/x86/x86.c:83:
> include/trace/define_trace.h:53:43: error: arch/x86/kvm/trace.h: No such file or directory
> make[4]: *** [/tmp/tgz/kvm-88/kvm/kernel/x86/x86.o] Error 1
> make[3]: *** [/tmp/tgz/kvm-88/kvm/kernel/x86] Error 2
> make[2]: *** [_module_/tmp/tgz/kvm-88/kvm/kernel] Error 2
> make[1]: *** [all] Error 2
> make: *** [kvm-kmod] Error 2
>
> here's my configure :
>
> ./configure \
>   --prefix=/usr/local/ \
>   --kerneldir=/lib/modules/2.6.31.1/build
>
>
> Help would be appreciated, also tried compiling kvm87, same problems...

Hi Benjamin,

See this thread:

http://www.mail-archive.com/kvm@vger.kernel.org/msg22775.html

I believe the patches have already been applied, but there have not
been any releases since then.

Cheers,
Jorge


Re: [Autotest] [PATCH] Test 802.1Q vlan of nic

2009-10-21 Thread Michael Goldish

- Amos Kong ak...@redhat.com wrote:

 On Tue, Oct 20, 2009 at 09:19:50AM -0400, Michael Goldish wrote:
  See comments below.
 
 Hi all,
 Thanks for your reply.
  
  - Dor Laor dl...@redhat.com wrote:
  
   On 10/15/2009 11:48 AM, Amos Kong wrote:
   
Test 802.1Q vlan of nic, config it by vconfig command.
   1) Create two VMs
   2) Setup guests in different vlan by vconfig and test
   communication by ping
  using hard-coded ip address
   3) Setup guests in same vlan and test communication by ping
   4) Recover the vlan config
   
Signed-off-by: Amos Kongak...@redhat.com
---
  client/tests/kvm/kvm_tests.cfg.sample |6 +++
  client/tests/kvm/tests/vlan_tag.py|   73
   +
  2 files changed, 79 insertions(+), 0 deletions(-)
  mode change 100644 =  100755
 client/tests/kvm/scripts/qemu-ifup
   
   In general the above should come as an independent patch.
   
  create mode 100644 client/tests/kvm/tests/vlan_tag.py
   
diff --git a/client/tests/kvm/kvm_tests.cfg.sample
   b/client/tests/kvm/kvm_tests.cfg.sample
index 9ccc9b5..4e47767 100644
--- a/client/tests/kvm/kvm_tests.cfg.sample
+++ b/client/tests/kvm/kvm_tests.cfg.sample
@@ -166,6 +166,12 @@ variants:
  used_cpus = 5
  used_mem = 2560
   
+- vlan_tag:  install setup
+type = vlan_tag
+subnet2 = 192.168.123
+vlans = 10 20
   
   If we want to be fanatic and safe we should dynamically choose
 subnet
   and vlans numbers that are not used on the host instead of hard
 code
   it.
  
  For the sake of safety maybe we should start both VMs with
 -snapshot.
  Dor, what do you think?  Is it safe to start 2 VMs with the same
 disk image
  when only one of them uses -snapshot?
 
 Setup the second VM with -snapshot is enough. The image can only be
 R/W by 1th VM.
  
+nic_mode = tap
+nic_model = e1000
   
   Why only e1000? Let's test virtio and rtl8139 as well. Can't you
   inherit the nic model from the config?
  
  It's not just inherited, it's overwritten, because nic_model is
 defined
  later in the file in a variants block.  So this nic_model line has
 no
  effect.
 
 No, this line is effective. If reserve this line, this case just test
 e1000, not the default 8139.

OK, you're right in the case of the default rtl8139 variant.
Still, we may want to test it sometimes, and if we want e1000 we can use
only e1000 instead of only rtl8139, so I don't think this line is
necessary.

   
  - autoit:   install setup
  type = autoit
diff --git a/client/tests/kvm/scripts/qemu-ifup
   b/client/tests/kvm/scripts/qemu-ifup
old mode 100644
new mode 100755
diff --git a/client/tests/kvm/tests/vlan_tag.py
   b/client/tests/kvm/tests/vlan_tag.py
new file mode 100644
index 000..15e763f
--- /dev/null
+++ b/client/tests/kvm/tests/vlan_tag.py
@@ -0,0 +1,73 @@
+import logging, time
+from autotest_lib.client.common_lib import error
+import kvm_subprocess, kvm_test_utils, kvm_utils
+
+def run_vlan_tag(test, params, env):
+
+Test 802.1Q vlan of nic, config it by vconfig command.
+
+1) Create two VMs
+2) Setup guests in different vlan by vconfig and test
   communication by ping
+   using hard-coded ip address
+3) Setup guests in same vlan and test communication by
 ping
+4) Recover the vlan config
+
+@param test: Kvm test object
+@param params: Dictionary with the test parameters.
+@param env: Dictionary with test environment.
+
+
+vm = []
+session = []
+subnet2 = params.get(subnet2)
+vlans = params.get(vlans).split()
+
+vm.append(kvm_test_utils.get_living_vm(env, %s %
 params.get(main_vm)))
  
  There's no need for the %s here.
  ...get_living_vm(env, params.get(main_vm))) should work.
  
+params_vm2 = params.copy()
+params_vm2['image_snapshot'] = yes
+params_vm2['kill_vm_gracefully'] = no
+params_vm2[address_index] =
 int(params.get(address_index, 0))+1
+vm.append(vm[0].clone(vm2, params_vm2))
+kvm_utils.env_register_vm(env, vm2, vm[1])
+if not vm[1].create():
+raise error.TestError(VM 1 create faild)
   
   
   The whole 7-8 lines above should be grouped as a function to clone
 
   existing VM. It should be part of kvm autotest infrastructure.
   Besides that, it looks good.
  
  There's already a clone function and it's being used here.
  
  Instead of those 7-8 lines, why not just define the VM in the config
 file?
  It looks like you're always using 2 VMs so there's no reason to do
 this in
  test code.  This should do what you want:
  
  - vlan_tag:  install setup
  type = vlan_tag
  subnet2 = 192.168.123
  vlans = 10 20
  nic_mode = tap
  vms +=  vm2
  

Re: kvm88 compile errors with 2.6.31.1

2009-10-21 Thread Michael Tokarev

Jorge Lucángeli Obes wrote:
[]

> See this thread:
>
> http://www.mail-archive.com/kvm@vger.kernel.org/msg22775.html
>
> I believe the patches have already been applied, but there have not
> been any releases since then.


qemu-kvm-0.11.0 has been out for a long time.

/mjt


Re: [PATCH 3/5] Nested VMX patch 3 implements vmptrld and vmptrst

2009-10-21 Thread Orit Wasserman


Gleb Natapov g...@redhat.com wrote on 19/10/2009 13:17:41:

> From: Gleb Natapov g...@redhat.com
> To: Orit Wasserman/Haifa/i...@ibmil
> Cc: kvm@vger.kernel.org, Ben-Ami Yassour1/Haifa/i...@ibmil,
>     Abel Gordon/Haifa/i...@ibmil, Muli Ben-Yehuda/Haifa/i...@ibmil,
>     aligu...@us.ibm.com, md...@us.ibm.com
> Date: 19/10/2009 13:17
> Subject: Re: [PATCH 3/5] Nested VMX patch 3 implements vmptrld and vmptrst

 On Thu, Oct 15, 2009 at 04:41:44PM +0200, or...@il.ibm.com wrote:
  From: Orit Wasserman or...@il.ibm.com
 
  ---
   arch/x86/kvm/vmx.c |  468 +++
 +++--
   arch/x86/kvm/x86.c |3 +-
   2 files changed, 459 insertions(+), 12 deletions(-)
 
  diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
  index 411cbdb..8c186e0 100644
  --- a/arch/x86/kvm/vmx.c
  +++ b/arch/x86/kvm/vmx.c
  @@ -61,20 +61,168 @@ module_param_named(unrestricted_guest,
   static int __read_mostly emulate_invalid_guest_state = 0;
   module_param(emulate_invalid_guest_state, bool, S_IRUGO);
 
  +
  +struct __attribute__ ((__packed__)) shadow_vmcs {
  +   u32 revision_id;
  +   u32 abort;
  +   u16 virtual_processor_id;
  +   u16 guest_es_selector;
  +   u16 guest_cs_selector;
  +   u16 guest_ss_selector;
  +   u16 guest_ds_selector;
  +   u16 guest_fs_selector;
  +   u16 guest_gs_selector;
  +   u16 guest_ldtr_selector;
  +   u16 guest_tr_selector;
  +   u16 host_es_selector;
  +   u16 host_cs_selector;
  +   u16 host_ss_selector;
  +   u16 host_ds_selector;
  +   u16 host_fs_selector;
  +   u16 host_gs_selector;
  +   u16 host_tr_selector;
  +   u64 io_bitmap_a;
  +   u64 io_bitmap_b;
  +   u64 msr_bitmap;
  +   u64 vm_exit_msr_store_addr;
  +   u64 vm_exit_msr_load_addr;
  +   u64 vm_entry_msr_load_addr;
  +   u64 tsc_offset;
  +   u64 virtual_apic_page_addr;
  +   u64 apic_access_addr;
  +   u64 ept_pointer;
  +   u64 guest_physical_address;
  +   u64 vmcs_link_pointer;
  +   u64 guest_ia32_debugctl;
  +   u64 guest_ia32_pat;
  +   u64 guest_pdptr0;
  +   u64 guest_pdptr1;
  +   u64 guest_pdptr2;
  +   u64 guest_pdptr3;
  +   u64 host_ia32_pat;
  +   u32 pin_based_vm_exec_control;
  +   u32 cpu_based_vm_exec_control;
  +   u32 exception_bitmap;
  +   u32 page_fault_error_code_mask;
  +   u32 page_fault_error_code_match;
  +   u32 cr3_target_count;
  +   u32 vm_exit_controls;
  +   u32 vm_exit_msr_store_count;
  +   u32 vm_exit_msr_load_count;
  +   u32 vm_entry_controls;
  +   u32 vm_entry_msr_load_count;
  +   u32 vm_entry_intr_info_field;
  +   u32 vm_entry_exception_error_code;
  +   u32 vm_entry_instruction_len;
  +   u32 tpr_threshold;
  +   u32 secondary_vm_exec_control;
  +   u32 vm_instruction_error;
  +   u32 vm_exit_reason;
  +   u32 vm_exit_intr_info;
  +   u32 vm_exit_intr_error_code;
  +   u32 idt_vectoring_info_field;
  +   u32 idt_vectoring_error_code;
  +   u32 vm_exit_instruction_len;
  +   u32 vmx_instruction_info;
  +   u32 guest_es_limit;
  +   u32 guest_cs_limit;
  +   u32 guest_ss_limit;
  +   u32 guest_ds_limit;
  +   u32 guest_fs_limit;
  +   u32 guest_gs_limit;
  +   u32 guest_ldtr_limit;
  +   u32 guest_tr_limit;
  +   u32 guest_gdtr_limit;
  +   u32 guest_idtr_limit;
  +   u32 guest_es_ar_bytes;
  +   u32 guest_cs_ar_bytes;
  +   u32 guest_ss_ar_bytes;
  +   u32 guest_ds_ar_bytes;
  +   u32 guest_fs_ar_bytes;
  +   u32 guest_gs_ar_bytes;
  +   u32 guest_ldtr_ar_bytes;
  +   u32 guest_tr_ar_bytes;
  +   u32 guest_interruptibility_info;
  +   u32 guest_activity_state;
  +   u32 guest_sysenter_cs;
  +   u32 host_ia32_sysenter_cs;
  +   unsigned long cr0_guest_host_mask;
  +   unsigned long cr4_guest_host_mask;
  +   unsigned long cr0_read_shadow;
  +   unsigned long cr4_read_shadow;
  +   unsigned long cr3_target_value0;
  +   unsigned long cr3_target_value1;
  +   unsigned long cr3_target_value2;
  +   unsigned long cr3_target_value3;
  +   unsigned long exit_qualification;
  +   unsigned long guest_linear_address;
  +   unsigned long guest_cr0;
  +   unsigned long guest_cr3;
  +   unsigned long guest_cr4;
  +   unsigned long guest_es_base;
  +   unsigned long guest_cs_base;
  +   unsigned long guest_ss_base;
  +   unsigned long guest_ds_base;
  +   unsigned long guest_fs_base;
  +   unsigned long guest_gs_base;
  +   unsigned long guest_ldtr_base;
  +   unsigned long guest_tr_base;
  +   unsigned long guest_gdtr_base;
  +   unsigned long guest_idtr_base;
  +   unsigned long guest_dr7;
  +   unsigned long guest_rsp;
  +   unsigned long guest_rip;
  +   unsigned long guest_rflags;
  +   unsigned long guest_pending_dbg_exceptions;
  +   unsigned long guest_sysenter_esp;
  +   unsigned long guest_sysenter_eip;
  +   unsigned long host_cr0;
  +   unsigned long host_cr3;
  +   unsigned long host_cr4;
  +   unsigned long host_fs_base;
  +   unsigned long host_gs_base;
  +   unsigned long host_tr_base;
  +   unsigned long host_gdtr_base;
  +   unsigned long host_idtr_base;
  +   unsigned long host_ia32_sysenter_esp;
  +   unsigned long host_ia32_sysenter_eip;
  

Re: [PATCH 3/5] Nested VMX patch 3 implements vmptrld and vmptrst

2009-10-21 Thread Orit Wasserman


Gleb Natapov g...@redhat.com wrote on 19/10/2009 14:59:53:

 From: Gleb Natapov g...@redhat.com
 To: Orit Wasserman/Haifa/i...@ibmil
 Cc: kvm@vger.kernel.org, Ben-Ami Yassour1/Haifa/i...@ibmil,
     Abel Gordon/Haifa/i...@ibmil, Muli Ben-Yehuda/Haifa/i...@ibmil,
     aligu...@us.ibm.com, md...@us.ibm.com
 Date: 19/10/2009 15:00
 Subject: Re: [PATCH 3/5] Nested VMX patch 3 implements vmptrld and vmptrst

 On Thu, Oct 15, 2009 at 04:41:44PM +0200, or...@il.ibm.com wrote:
  +static struct page *nested_get_page(struct kvm_vcpu *vcpu,
  +u64 vmcs_addr)
  +{
  +   struct page *vmcs_page = NULL;
  +
  +   down_read(&current->mm->mmap_sem);
  +   vmcs_page = gfn_to_page(vcpu->kvm, vmcs_addr >> PAGE_SHIFT);
  +   up_read(&current->mm->mmap_sem);
 Why are you taking mmap_sem here? gup_fast() takes it if required.
I will remove it.

  +
  +   if (is_error_page(vmcs_page)) {
  +  printk(KERN_ERR "%s error allocating page \n", __func__);
  +  kvm_release_page_clean(vmcs_page);
  +  return NULL;
  +   }
  +
  +   return vmcs_page;
  +
  +}
  +

 --
  Gleb.



Re: [PATCH 4/5] Nested VMX patch 4 implements vmread and vmwrite

2009-10-21 Thread Orit Wasserman


Gleb Natapov g...@redhat.com wrote on 19/10/2009 15:17:20:

 From: Gleb Natapov g...@redhat.com
 To: Orit Wasserman/Haifa/i...@ibmil
 Cc: kvm@vger.kernel.org, Ben-Ami Yassour1/Haifa/i...@ibmil,
     Abel Gordon/Haifa/i...@ibmil, Muli Ben-Yehuda/Haifa/i...@ibmil,
     aligu...@us.ibm.com, md...@us.ibm.com
 Date: 19/10/2009 15:17
 Subject: Re: [PATCH 4/5] Nested VMX patch 4 implements vmread and vmwrite

 On Thu, Oct 15, 2009 at 04:41:45PM +0200, or...@il.ibm.com wrote:
  From: Orit Wasserman or...@il.ibm.com
 
  ---
   arch/x86/kvm/vmx.c |  591 +++
 -
   1 files changed, 589 insertions(+), 2 deletions(-)
 
  diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
  index 8c186e0..6a4c252 100644
  --- a/arch/x86/kvm/vmx.c
  +++ b/arch/x86/kvm/vmx.c
  @@ -225,6 +225,21 @@ struct nested_vmx {
  struct level_state *l1_state;
   };
 
  +enum vmcs_field_type {
  +   VMCS_FIELD_TYPE_U16 = 0,
  +   VMCS_FIELD_TYPE_U64 = 1,
  +   VMCS_FIELD_TYPE_U32 = 2,
  +   VMCS_FIELD_TYPE_ULONG = 3
  +};
  +
  +#define VMCS_FIELD_LENGTH_OFFSET 13
  +#define VMCS_FIELD_LENGTH_MASK 0x6000
  +
  +static inline int vmcs_field_length(unsigned long field)
  +{
  +   return (VMCS_FIELD_LENGTH_MASK & field) >> 13;
  +}
  +
   struct vmcs {
  u32 revision_id;
  u32 abort;
  @@ -288,6 +303,404 @@ static inline struct vcpu_vmx *to_vmx(struct
 kvm_vcpu *vcpu)
  return container_of(vcpu, struct vcpu_vmx, vcpu);
   }
 
  +#define SHADOW_VMCS_OFFSET(x) offsetof(struct shadow_vmcs, x)
  +
  +static unsigned short vmcs_field_to_offset_table[HOST_RIP+1] = {
  +
  +   [VIRTUAL_PROCESSOR_ID] =
  +  SHADOW_VMCS_OFFSET(virtual_processor_id),
  +   [GUEST_ES_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(guest_es_selector),
  +   [GUEST_CS_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(guest_cs_selector),
  +   [GUEST_SS_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(guest_ss_selector),
  +   [GUEST_DS_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(guest_ds_selector),
  +   [GUEST_FS_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(guest_fs_selector),
  +   [GUEST_GS_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(guest_gs_selector),
  +   [GUEST_LDTR_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(guest_ldtr_selector),
  +   [GUEST_TR_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(guest_tr_selector),
  +   [HOST_ES_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(host_es_selector),
  +   [HOST_CS_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(host_cs_selector),
  +   [HOST_SS_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(host_ss_selector),
  +   [HOST_DS_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(host_ds_selector),
  +   [HOST_FS_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(host_fs_selector),
  +   [HOST_GS_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(host_gs_selector),
  +   [HOST_TR_SELECTOR] =
  +  SHADOW_VMCS_OFFSET(host_tr_selector),
  +   [IO_BITMAP_A] =
  +  SHADOW_VMCS_OFFSET(io_bitmap_a),
  +   [IO_BITMAP_A_HIGH] =
  +  SHADOW_VMCS_OFFSET(io_bitmap_a)+4,
  +   [IO_BITMAP_B] =
  +  SHADOW_VMCS_OFFSET(io_bitmap_b),
  +   [IO_BITMAP_B_HIGH] =
  +  SHADOW_VMCS_OFFSET(io_bitmap_b)+4,
  +   [MSR_BITMAP] =
  +  SHADOW_VMCS_OFFSET(msr_bitmap),
  +   [MSR_BITMAP_HIGH] =
  +  SHADOW_VMCS_OFFSET(msr_bitmap)+4,
  +   [VM_EXIT_MSR_STORE_ADDR] =
  +  SHADOW_VMCS_OFFSET(vm_exit_msr_store_addr),
  +   [VM_EXIT_MSR_STORE_ADDR_HIGH] =
  +  SHADOW_VMCS_OFFSET(vm_exit_msr_store_addr)+4,
  +   [VM_EXIT_MSR_LOAD_ADDR] =
  +  SHADOW_VMCS_OFFSET(vm_exit_msr_load_addr),
  +   [VM_EXIT_MSR_LOAD_ADDR_HIGH] =
  +  SHADOW_VMCS_OFFSET(vm_exit_msr_load_addr)+4,
  +   [VM_ENTRY_MSR_LOAD_ADDR] =
  +  SHADOW_VMCS_OFFSET(vm_entry_msr_load_addr),
  +   [VM_ENTRY_MSR_LOAD_ADDR_HIGH] =
  +  SHADOW_VMCS_OFFSET(vm_entry_msr_load_addr)+4,
  +   [TSC_OFFSET] =
  +  SHADOW_VMCS_OFFSET(tsc_offset),
  +   [TSC_OFFSET_HIGH] =
  +  SHADOW_VMCS_OFFSET(tsc_offset)+4,
  +   [VIRTUAL_APIC_PAGE_ADDR] =
  +  SHADOW_VMCS_OFFSET(virtual_apic_page_addr),
  +   [VIRTUAL_APIC_PAGE_ADDR_HIGH] =
  +  SHADOW_VMCS_OFFSET(virtual_apic_page_addr)+4,
  +   [APIC_ACCESS_ADDR] =
  +  SHADOW_VMCS_OFFSET(apic_access_addr),
  +   [APIC_ACCESS_ADDR_HIGH] =
  +  SHADOW_VMCS_OFFSET(apic_access_addr)+4,
  +   [EPT_POINTER] =
  +  SHADOW_VMCS_OFFSET(ept_pointer),
  +   [EPT_POINTER_HIGH] =
  +  SHADOW_VMCS_OFFSET(ept_pointer)+4,
  +   [GUEST_PHYSICAL_ADDRESS] =
  +  SHADOW_VMCS_OFFSET(guest_physical_address),
  +   [GUEST_PHYSICAL_ADDRESS_HIGH] =
  +  SHADOW_VMCS_OFFSET(guest_physical_address)+4,
  +   [VMCS_LINK_POINTER] =
  +  SHADOW_VMCS_OFFSET(vmcs_link_pointer),
  +   [VMCS_LINK_POINTER_HIGH] =
  +  SHADOW_VMCS_OFFSET(vmcs_link_pointer)+4,
  +   [GUEST_IA32_DEBUGCTL] =
  +  SHADOW_VMCS_OFFSET(guest_ia32_debugctl),
  +   [GUEST_IA32_DEBUGCTL_HIGH] =
  +  SHADOW_VMCS_OFFSET(guest_ia32_debugctl)+4,
  +   [GUEST_IA32_PAT] =
  +  SHADOW_VMCS_OFFSET(guest_ia32_pat),
  +   [GUEST_IA32_PAT_HIGH] =
  +  
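
The two mechanisms in the quoted patch (the width bits decoded by vmcs_field_length() and the offsetof()-based lookup table) can be tried out in userspace. What follows is only a sketch under assumptions: struct demo_shadow_vmcs, DEMO_OFFSET, DEMO_MAX_FIELD and demo_field_offset() are made-up names for the example, and the three field encodings are the values I believe the Intel SDM assigns; none of that is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Width classes, numbered as in the quoted patch. */
enum vmcs_field_type {
	VMCS_FIELD_TYPE_U16 = 0,
	VMCS_FIELD_TYPE_U64 = 1,
	VMCS_FIELD_TYPE_U32 = 2,
	VMCS_FIELD_TYPE_ULONG = 3
};

#define VMCS_FIELD_LENGTH_OFFSET 13
#define VMCS_FIELD_LENGTH_MASK   0x6000

/* Bits 14:13 of a VMCS field encoding select the width class. */
static inline int vmcs_field_length(unsigned long field)
{
	return (VMCS_FIELD_LENGTH_MASK & field) >> VMCS_FIELD_LENGTH_OFFSET;
}

/* Hypothetical two-field stand-in for the patch's struct shadow_vmcs. */
struct demo_shadow_vmcs {
	uint16_t guest_es_selector;
	uint64_t io_bitmap_a;
};

/* Field encodings: assumed values, not quoted from the patch. */
#define GUEST_ES_SELECTOR 0x0800
#define IO_BITMAP_A       0x2000
#define IO_BITMAP_A_HIGH  0x2001
#define DEMO_MAX_FIELD    IO_BITMAP_A_HIGH

#define DEMO_OFFSET(x) offsetof(struct demo_shadow_vmcs, x)

/*
 * Sparse table indexed by field encoding. The _HIGH encoding of a 64-bit
 * field points 4 bytes past the start of the same member, as in the patch.
 */
static const unsigned short demo_offset_table[DEMO_MAX_FIELD + 1] = {
	[GUEST_ES_SELECTOR] = DEMO_OFFSET(guest_es_selector),
	[IO_BITMAP_A]       = DEMO_OFFSET(io_bitmap_a),
	[IO_BITMAP_A_HIGH]  = DEMO_OFFSET(io_bitmap_a) + 4,
};

/* Bounds-checked lookup; the quoted patch indexes its table up to HOST_RIP. */
static unsigned short demo_field_offset(unsigned long field)
{
	assert(field <= DEMO_MAX_FIELD);
	return demo_offset_table[field];
}
```

One thing the sketch makes visible: entries that are never initialised read back as offset 0, which is also a valid offset for the first member, so a table like this cannot by itself tell "unknown field" from "first field".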

Re: [Autotest] [PATCH] Test 802.1Q vlan of nic

2009-10-21 Thread Uri Lublin

On 10/21/2009 12:37 PM, Amos Kong wrote:

On Tue, Oct 20, 2009 at 09:19:50AM -0400, Michael Goldish wrote:

- Dor Laor dl...@redhat.com wrote:

On 10/15/2009 11:48 AM, Amos Kong wrote:

For the sake of safety maybe we should start both VMs with -snapshot.
Dor, what do you think?  Is it safe to start 2 VMs with the same disk image
when only one of them uses -snapshot?


Setting up the second VM with -snapshot is enough. The image can then only be
read and written by the first VM.



Actually, I agree with Michael. If both VMs use the same disk image, it is safer 
to set up both VMs with -snapshot. When the first VM writes to the disk image, 
the second VM may be affected.




+nic_mode = tap
+nic_model = e1000


Why only e1000? Let's test virtio and rtl8139 as well. Can't you
inherit the nic model from the config?


It's not just inherited, it's overwritten, because nic_model is defined
later in the file in a variants block.  So this nic_model line has no
effect.


No, this line does take effect. If we keep this line, this case tests only e1000, 
not the default rtl8139.


It is overwritten for virtio and untouched for rtl8139 (BTW, we need to add an
rtl8139 definition instead of leaving it empty and using the qemu-kvm default NIC).


If you really want to only test e1000, a filter is more appropriate.

Regards,
Uri.


[PATCH] Enable 32bit dirty log pointers on 64bit host

2009-10-21 Thread Alexander Graf
From: Arnd Bergmann a...@arndb.de

With big endian userspace, we can't quite figure out if a pointer
is 32 bit (shifted by 32 bits) or 64 bit when we read a 64 bit pointer.

This is what happens with dirty logging. To get the pointer interpreted
correctly, we thus need Arnd's patch to implement a compat layer for
the ioctl:

A better way to do this is to add a separate compat_ioctl() method that
converts this for you.
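
To make the problem concrete, here is a userspace sketch (not kernel code) of why the raw 64-bit read goes wrong. It simulates the big-endian byte layout with a plain byte array, so it behaves the same on any host. All three helper names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Lay out a 32-bit pointer value the way a big-endian 32-bit process would
 * store it into the first four bytes of an 8-byte union slot.
 */
static void store_be32_pointer(uint8_t slot[8], uint32_t ptr)
{
	assert(slot != NULL);
	for (size_t i = 0; i < 8; i++)
		slot[i] = 0;
	slot[0] = (uint8_t)(ptr >> 24);
	slot[1] = (uint8_t)(ptr >> 16);
	slot[2] = (uint8_t)(ptr >> 8);
	slot[3] = (uint8_t)ptr;
}

/* What a big-endian 64-bit reader sees when it loads all eight bytes. */
static uint64_t load_be64(const uint8_t slot[8])
{
	uint64_t v = 0;

	assert(slot != NULL);
	for (size_t i = 0; i < 8; i++)
		v = (v << 8) | slot[i];
	return v;
}

/* What a compat path does instead: read the 32-bit member, zero-extend. */
static uint64_t load_be32_zero_extended(const uint8_t slot[8])
{
	uint32_t v = 0;

	assert(slot != NULL);
	for (size_t i = 0; i < 4; i++)
		v = (v << 8) | slot[i];
	return (uint64_t)v;
}
```

With a pointer value of 0x10008000, load_be64() yields the value moved up by 32 bits, while load_be32_zero_extended() recovers the original. The second behaviour is roughly what the patch gets from declaring the member as compat_uptr_t and converting it with compat_ptr().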

From: Arnd Bergmann a...@arndb.de
Signed-off-by: Arnd Bergmann a...@arndb.de
Acked-by: Alexander Graf ag...@suse.de

---

Changes from Arnd's example version:

  - s/log.log/log/ (Avi)
  - use sizeof(compat_log) (Avi)
  - compile fixes
---
 virt/kvm/kvm_main.c |   49 -
 1 files changed, 48 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index cac69c4..54a272f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -43,6 +43,7 @@
 #include <linux/swap.h>
 #include <linux/bitops.h>
 #include <linux/spinlock.h>
+#include <linux/compat.h>
 
 #include <asm/processor.h>
 #include <asm/io.h>
@@ -1542,6 +1543,52 @@ out:
return r;
 }
 
+#ifdef CONFIG_COMPAT
+struct compat_kvm_dirty_log {
+	__u32 slot;
+	__u32 padding1;
+	union {
+		compat_uptr_t dirty_bitmap; /* one bit per page */
+		__u64 padding2;
+	};
+};
+
+static long kvm_vm_compat_ioctl(struct file *filp,
+				unsigned int ioctl, unsigned long arg)
+{
+	struct kvm *kvm = filp->private_data;
+	int r;
+
+	if (kvm->mm != current->mm)
+		return -EIO;
+	switch (ioctl) {
+	case KVM_GET_DIRTY_LOG: {
+		struct compat_kvm_dirty_log compat_log;
+		struct kvm_dirty_log log;
+
+		r = -EFAULT;
+		if (copy_from_user(&compat_log, (void __user *)arg,
+				   sizeof(compat_log)))
+			goto out;
+		log.slot = compat_log.slot;
+		log.padding1 = compat_log.padding1;
+		log.padding2 = compat_log.padding2;
+		log.dirty_bitmap = compat_ptr(compat_log.dirty_bitmap);
+
+		r = kvm_vm_ioctl_get_dirty_log(kvm, &log);
+		if (r)
+			goto out;
+		break;
+	}
+	default:
+		r = kvm_vm_ioctl(filp, ioctl, arg);
+	}
+
+out:
+	return r;
+}
+#endif
+
 static int kvm_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
struct page *page[1];
@@ -1576,7 +1623,7 @@ static int kvm_vm_mmap(struct file *file, struct 
vm_area_struct *vma)
 static struct file_operations kvm_vm_fops = {
 	.release        = kvm_vm_release,
 	.unlocked_ioctl = kvm_vm_ioctl,
-	.compat_ioctl   = kvm_vm_ioctl,
+	.compat_ioctl   = kvm_vm_compat_ioctl,
 	.mmap           = kvm_vm_mmap,
 };
 
-- 
1.6.0.2



Re: [PATCH 2/2] kvm-kmod: Document the build process

2009-10-21 Thread Wolfgang Mauerer
Hi,

Dietmar Maurer wrote:
 +  Before the kvm module can be built, the linux submodule must be
 initialised
 +  and populated. The required sequence of commands is
 +
 +  git submodule init
 +  git submodule update
 +  ./configure
 +  make sync
 +  make
 +
 +  Notice that you can also specify an existing Linux tree for the
 +  synchronisation stage by using
 +
 +  make sync LINUX=/path/to/tree
 
 I always get errors when I try to sync (with our Linux-2.6.24 tree)
 
 ./configure --kerneldir=${TOP}/linux-2.6.24-openvz
 make sync LINUX=${TOP}/linux-2.6.24-openvz
 
 make[1]: Entering directory `/a/dir/kvm-kmod-2.6.30.1'
 ./sync kvm-kmod-2.6.30.1
 Traceback (most recent call last):
   File "./sync", line 210, in <module>
     header_sync(arch)
   File "./sync", line 181, in header_sync
     hack(T, 'x86', 'include/linux/kvm.h')
   File "./sync", line 127, in hack
     _hack(T + '/' + file, arch)
   File "./sync", line 118, in _hack
     data = file(fname).read()
 IOError: [Errno 2] No such file or directory: 'header/include/linux/kvm.h'
 make[1]: *** [sync] Error 1
 
 Any idea what's wrong?

Do you have a checked out Linux tree in linux-2.6/?
Currently, the linux-2.6 submodule seems to reference
a commit that is not present in the standard kvm
repo, so I assume that git submodule update
has failed for you. For now, you can fix this by
using git checkout in linux-2.6/

Btw.: Wrt. the sync stage, I've also updated the README
a bit to explain the two different kernel trees that come into
play. Comments?

Jan: I think the bogus reference is a leftover
from the tree that you've added as submodule; after
my patch to use Avi's tree, the information might be
out of sync.

Best, Wolfgang
diff --git a/README b/README
index 40a72d3..34cc51a 100644
--- a/README
+++ b/README
@@ -1,16 +1,21 @@
 Building the KVM kernel module is performed differently depending on whether
 you are working from a clone of the git repository or from a source release.
+Notice that two kernels are involved: One from which the KVM sources
+are taken (kernel A), and one for which the module is built (kernel B). 
+For out-of-tree module builds, it is well possible that kernel A is more
+recent than kernel B.
 
 - To build from a release, simply use ./configure (possibly with any
   arguments that are required for your setup, see ./configure --help)
-  and make.
+  and make. The kernel specified with --kerneldir refers to kernel B,
+  that is, the kernel for which the module is built.
 
 - Building from a cloned git repository requires a kernel tree with the main
-  kvm sources that is included as a submodule in the linux-2.6/ directory.  By
-  default, the KVM development tree on git.kernel.org is used, but you can
-  change this setting in .gitmodules
+  kvm sources (kernel A) that is included as a submodule in the linux-2.6/
+  directory.  By default, the KVM development tree on git.kernel.org is used,
+  but you can change this setting in .gitmodules
 
-  Before the kvm module can be built, the linux submodule must be initialised 
+  Before the kvm module can be built, the linux submodule must be initialised
   and populated. The required sequence of commands is
 
   git submodule init
@@ -24,3 +29,5 @@ you are working from a clone of the git repository or from a source release.
 
   make sync LINUX=/path/to/tree
 
+  The synchronisation stage refers to kernel A, that is, the kernel
+  from which the KVM sources are taken.


[KVM PATCH 0/2] irqfd enhancements

2009-10-21 Thread Gregory Haskins
(Applies to kvm.git/master:11b06403)

The following patches are cleanups/enhancements for IRQFD now that
we have lockless interrupt injection.  For more details, please see
the patch headers.

These patches pass checkpatch, and are fully tested.  Please consider
for merging.  They are an enhancement only, so there is no urgency
to push to mainline until a suitable merge window presents itself.

Kind Regards,
-Greg

---

Gregory Haskins (2):
  KVM: Remove unecessary irqfd-cleanup-wq
  KVM: Directly inject interrupts via irqfd


 virt/kvm/eventfd.c |   45 ++---
 1 files changed, 6 insertions(+), 39 deletions(-)

-- 
Signature


[KVM PATCH 1/2] KVM: Directly inject interrupts via irqfd

2009-10-21 Thread Gregory Haskins
IRQFD currently uses a deferred workqueue item to execute the injection
operation.  It was originally designed this way because kvm_set_irq()
required the caller to hold the irq_lock mutex, and the eventfd callback
is invoked from within a non-preemptible critical section.

With the advent of lockless injection support in kvm_set_irq, the deferment
mechanism is no longer technically needed. Since context switching to the
workqueue is a source of interrupt latency, let's switch to a direct
method.
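
For readers less familiar with the eventfd side of this: below is a minimal Linux userspace sketch, assuming the glibc eventfd(2) wrapper, of the signalling that the wakeup callback in this patch reacts to. Writes add to the eventfd counter, a nonzero counter is reported as POLLIN, and a read returns the accumulated count and resets it. signal_and_drain() is a hypothetical helper written for this example; it is not part of KVM and says nothing about how the in-kernel injection itself is done.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Signal an eventfd 'signals' times, check that it polls readable, then
 * drain it. Returns 0 on success and stores the drained counter in *count.
 */
static int signal_and_drain(unsigned int signals, uint64_t *count)
{
	const uint64_t one = 1;
	struct pollfd pfd;
	ssize_t n;
	int fd;

	assert(count != NULL);

	fd = eventfd(0, EFD_NONBLOCK);
	if (fd < 0)
		return -1;

	/* Each write adds its 8-byte value to the kernel-side counter. */
	for (unsigned int i = 0; i < signals; i++) {
		if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
			close(fd);
			return -1;
		}
	}

	/* A nonzero counter shows up as POLLIN, the flag the wakeup path tests. */
	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;
	if (poll(&pfd, 1, 0) != 1 || !(pfd.revents & POLLIN)) {
		close(fd);
		return -1;
	}

	/* Reading returns the accumulated count and resets it to zero. */
	n = read(fd, count, sizeof(*count));
	close(fd);
	return n == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

Note that several writes before a read collapse into one readable event with a summed counter, so the consumer sees "something was signalled", not one notification per write.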

Signed-off-by: Gregory Haskins ghask...@novell.com
---

 virt/kvm/eventfd.c |   15 +++
 1 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 30f70fd..1a529d4 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -49,16 +49,14 @@ struct _irqfd {
 	poll_table                pt;
 	wait_queue_head_t        *wqh;
 	wait_queue_t              wait;
-	struct work_struct        inject;
 	struct work_struct        shutdown;
 };
 
 static struct workqueue_struct *irqfd_cleanup_wq;
 
 static void
-irqfd_inject(struct work_struct *work)
+irqfd_inject(struct _irqfd *irqfd)
 {
-	struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
 	struct kvm *kvm = irqfd->kvm;
 
 	kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1);
@@ -80,12 +78,6 @@ irqfd_shutdown(struct work_struct *work)
 	remove_wait_queue(irqfd->wqh, &irqfd->wait);
 
 	/*
-	 * We know no new events will be scheduled at this point, so block
-	 * until all previously outstanding events have completed
-	 */
-	flush_work(&irqfd->inject);
-
-	/*
 	 * It is now safe to release the object's resources
 	 */
 	eventfd_ctx_put(irqfd->eventfd);
@@ -126,7 +118,7 @@ irqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key)
 
 	if (flags & POLLIN)
 		/* An event has been signaled, inject an interrupt */
-		schedule_work(&irqfd->inject);
+		irqfd_inject(irqfd);
 
 	if (flags & POLLHUP) {
 		/* The eventfd is closing, detach from KVM */
@@ -179,7 +171,6 @@ kvm_irqfd_assign(struct kvm *kvm, int fd, int gsi)
 	irqfd->kvm = kvm;
 	irqfd->gsi = gsi;
 	INIT_LIST_HEAD(&irqfd->list);
-	INIT_WORK(&irqfd->inject, irqfd_inject);
 	INIT_WORK(&irqfd->shutdown, irqfd_shutdown);
 
 	file = eventfd_fget(fd);
@@ -214,7 +205,7 @@ kvm_irqfd_assign(struct kvm *kvm, int fd, int gsi)
 	 * before we registered, and trigger it as if we didn't miss it.
 	 */
 	if (events & POLLIN)
-		schedule_work(&irqfd->inject);
+		irqfd_inject(irqfd);
 
 	/*
 	 * do not drop the file until the irqfd is fully initialized, otherwise



[KVM PATCH 2/2] KVM: Remove unecessary irqfd-cleanup-wq

2009-10-21 Thread Gregory Haskins
We originally created the irqfd-cleanup-wq so that we could safely
implement a shutdown that blocked on outstanding injection-requests
that may have been in flight.  We couldn't reuse something like kevent
to shutdown since the injection path was also using kevent and that may
have caused a deadlock if we tried.

Since the injection path is now no longer utilizing a work-item, it is
no longer necessary to maintain a separate cleanup WQ.  The standard
kevent queues should be sufficient, and thus we can eliminate an extra
kthread from the system.

Signed-off-by: Gregory Haskins ghask...@novell.com
---

 virt/kvm/eventfd.c |   30 +++---
 1 files changed, 3 insertions(+), 27 deletions(-)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 1a529d4..fb698f4 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -52,8 +52,6 @@ struct _irqfd {
 	struct work_struct        shutdown;
 };
 
-static struct workqueue_struct *irqfd_cleanup_wq;
-
 static void
 irqfd_inject(struct _irqfd *irqfd)
 {
@@ -104,7 +102,7 @@ irqfd_deactivate(struct _irqfd *irqfd)
 
 	list_del_init(&irqfd->list);
 
-	queue_work(irqfd_cleanup_wq, &irqfd->shutdown);
+	schedule_work(&irqfd->shutdown);
 }
 
 /*
@@ -262,7 +260,7 @@ kvm_irqfd_deassign(struct kvm *kvm, int fd, int gsi)
 * so that we guarantee there will not be any more interrupts on this
 * gsi once this deassign function returns.
 */
-   flush_workqueue(irqfd_cleanup_wq);
+   flush_scheduled_work();
 
return 0;
 }
@@ -296,33 +294,11 @@ kvm_irqfd_release(struct kvm *kvm)
 * Block until we know all outstanding shutdown jobs have completed
 * since we do not take a kvm* reference.
 */
-   flush_workqueue(irqfd_cleanup_wq);
+   flush_scheduled_work();
 
 }
 
 /*
- * create a host-wide workqueue for issuing deferred shutdown requests
- * aggregated from all vm* instances. We need our own isolated single-thread
- * queue to prevent deadlock against flushing the normal work-queue.
- */
-static int __init irqfd_module_init(void)
-{
-	irqfd_cleanup_wq = create_singlethread_workqueue("kvm-irqfd-cleanup");
-   if (!irqfd_cleanup_wq)
-   return -ENOMEM;
-
-   return 0;
-}
-
-static void __exit irqfd_module_exit(void)
-{
-   destroy_workqueue(irqfd_cleanup_wq);
-}
-
-module_init(irqfd_module_init);
-module_exit(irqfd_module_exit);
-
-/*
  * 
  * ioeventfd: translate a PIO/MMIO memory write to an eventfd signal.
  *



Re: Nested VMX support v3

2009-10-21 Thread Orit Wasserman


Avi Kivity a...@redhat.com wrote on 20/10/2009 05:30:34:

 From: Avi Kivity a...@redhat.com
 To: Orit Wasserman/Haifa/i...@ibmil
 Cc: kvm@vger.kernel.org, Ben-Ami Yassour1/Haifa/i...@ibmil,
     Abel Gordon/Haifa/i...@ibmil, Muli Ben-Yehuda/Haifa/i...@ibmil,
     aligu...@us.ibm.com, md...@us.ibm.com
 Date: 20/10/2009 05:30
 Subject: Re: Nested VMX support v3

 On 10/15/2009 11:41 PM, or...@il.ibm.com wrote:
  Avi,
  We have addressed all of the comments, please apply.
 
  The following patches implement nested VMX support. The patches enable a
  guest to use the VMX APIs in order to run its own nested guest (i.e.,
  enable running other hypervisors which use VMX under KVM). The current
  patches support running Linux under a nested KVM using shadow page table
  (with bypass_guest_pf disabled). SMP support was fixed.  Reworking EPT
  support to mesh cleanly with the current shadow paging design per Avi's
  comments is a work-in-progress.
 

 Why is bypass_guest_pf disabled?
It was not implemented.
We need to modify the walk_addr code to handle the sptes that have invalid
content (used only for the bypass_guest_pf optimization) and identify them
as not present. Maybe remove some other validity checks too.

  The current patches only support a single nested hypervisor, which can
  only run a single guest (multiple guests are work in progress). Only
  64-bit nested hypervisors are supported.
 

 Multiple guests and 32-bit support are merge requirements.  As far as I
 can tell there shouldn't be anything special required to support them?
Ok.


  vpid allocation will be updated with the multiguest support (work in
  progress). We are working on fixing the cr0.TS handling; it works for
  nested kvm but not for vmware server.
 

 Please either drop or fix vpid before merging.  What's wrong with
 cr0.ts?  I'd like to see that fixed as well.
Ok.

 --
 I have a truly marvellous patch that fixes the bug which this
 signature is too narrow to contain.




[PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5

2009-10-21 Thread Alexander Graf
KVM for PowerPC only supports embedded cores at the moment.

While it makes sense to virtualize on small machines, it's even more fun
to do so on big boxes. So I figured we need KVM for PowerPC64 as well.

This patchset implements KVM support for Book3s_64 hosts and guest support
for Book3s_64 and G3/G4.

To really make use of this, you also need a recent version of qemu.


Don't want to apply patches? Get the git tree!

$ git clone git://csgraf.de/kvm
$ git checkout origin/ppc-v4

V1 -> V2:

 - extend sregs with padding
 - new naming scheme (ppc64 -> book3s_64; 74xx -> book3s_32)
 - to_phys -> in-kernel tophys()
 - loadimm -> LOAD_REG_IMMEDIATE
 - call .ko kvm.ko
 - set magic paca bit later
 - run guest code with PACA->soft_enabled=true
 - pt_regs for host state saving (guest too?)
 - only do HV dcbz trick on 970
 - refuse to run on LPAR because of missing SLB pieces

V2 -> V3:

 - fix DAR/DSISR saving
 - allow running on LPAR by modifying the SLB shadow
 - change the SLB implementation to use a mem-backed cache and do
   full world switch on enter/exit. gets rid of context magic
 - be more aggressive about DEC injection
 - remove fast ld/st because we're always in host context
 - don't use SPRGs in real-paged transition
 - implement dirty log
 - remove MMIO speedup code
 - SPRG cleanup
   - rename SPRG3 -> SPRN_SPRG_PACA
   - rename SPRG1 -> SPRN_SPRG_SCRATCH0
   - don't use SPRG2

V3 -> V4:

 - use context_id instead of mm_alloc
 - export less

V4 -> V5:

 - use get_tb instead of mftb
 - make ppc32 and ppc64 emulation share more code
 - make pvr 32 bits
 - add patch to use hrtimer for decrementer

Alexander Graf (27):
  Move dirty logging code to sub-arch
  Pass PVR in sregs
  Add Book3s definitions
  Add Book3s fields to vcpu structs
  Add asm/kvm_book3s.h
  Add Book3s_64 intercept helpers
  Add book3s_64 highmem asm code
  Add SLB switching code for entry/exit
  Add interrupt handling code
  Add book3s.c
  Add book3s_64 Host MMU handling
  Add book3s_64 guest MMU
  Add book3s_32 guest MMU
  Add book3s_64 specific opcode emulation
  Add mfdec emulation
  Add desktop PowerPC specific emulation
  Make head_64.S aware of KVM real mode code
  Add Book3s_64 offsets to asm-offsets.c
  Export symbols for KVM module
  Split init_new_context and destroy_context
  Export KVM symbols for module
  Add fields to PACA
  Export new PACA constants in asm-offsets
  Include Book3s_64 target in buildsystem
  Fix trace.h
  Use Little Endian for Dirty Bitmap
  Use hrtimers for the decrementer

 arch/powerpc/include/asm/exception-64s.h |2 +
 arch/powerpc/include/asm/kvm.h   |2 +
 arch/powerpc/include/asm/kvm_asm.h   |   39 ++
 arch/powerpc/include/asm/kvm_book3s.h|  136 
 arch/powerpc/include/asm/kvm_book3s_64_asm.h |   58 ++
 arch/powerpc/include/asm/kvm_host.h  |   79 +++-
 arch/powerpc/include/asm/kvm_ppc.h   |1 +
 arch/powerpc/include/asm/mmu_context.h   |5 +
 arch/powerpc/include/asm/paca.h  |9 +
 arch/powerpc/kernel/asm-offsets.c|   18 +
 arch/powerpc/kernel/exceptions-64s.S |8 +
 arch/powerpc/kernel/head_64.S|7 +
 arch/powerpc/kernel/ppc_ksyms.c  |3 +-
 arch/powerpc/kernel/time.c   |1 +
 arch/powerpc/kvm/Kconfig |   17 +
 arch/powerpc/kvm/Makefile|   27 +-
 arch/powerpc/kvm/book3s.c|  919 ++
 arch/powerpc/kvm/book3s_32_mmu.c |  354 ++
 arch/powerpc/kvm/book3s_64_emulate.c |  338 ++
 arch/powerpc/kvm/book3s_64_exports.c |   24 +
 arch/powerpc/kvm/book3s_64_interrupts.S  |  392 +++
 arch/powerpc/kvm/book3s_64_mmu.c |  469 +
 arch/powerpc/kvm/book3s_64_mmu_host.c|  412 
 arch/powerpc/kvm/book3s_64_rmhandlers.S  |  131 
 arch/powerpc/kvm/book3s_64_slb.S |  277 
 arch/powerpc/kvm/booke.c |5 +
 arch/powerpc/kvm/emulate.c   |   66 ++-
 arch/powerpc/kvm/powerpc.c   |   25 +-
 arch/powerpc/kvm/trace.h |6 +-
 arch/powerpc/mm/hash_utils_64.c  |2 +
 arch/powerpc/mm/mmu_context_hash64.c |   24 +-
 virt/kvm/kvm_main.c  |5 +-
 32 files changed, 3827 insertions(+), 34 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_book3s.h
 create mode 100644 arch/powerpc/include/asm/kvm_book3s_64_asm.h
 create mode 100644 arch/powerpc/kvm/book3s.c
 create mode 100644 arch/powerpc/kvm/book3s_32_mmu.c
 create mode 100644 arch/powerpc/kvm/book3s_64_emulate.c
 create mode 100644 arch/powerpc/kvm/book3s_64_exports.c
 create mode 100644 arch/powerpc/kvm/book3s_64_interrupts.S
 create mode 100644 arch/powerpc/kvm/book3s_64_mmu.c
 create mode 100644 arch/powerpc/kvm/book3s_64_mmu_host.c
 create mode 100644 arch/powerpc/kvm/book3s_64_rmhandlers.S
 

[PATCH 06/27] Add Book3s_64 intercept helpers

2009-10-21 Thread Alexander Graf
We need to intercept interrupt vectors. To do that, let's add a file
we can always include which only activates the intercepts when we have
them configured.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s_64_asm.h |   58 ++
 1 files changed, 58 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_book3s_64_asm.h

diff --git a/arch/powerpc/include/asm/kvm_book3s_64_asm.h 
b/arch/powerpc/include/asm/kvm_book3s_64_asm.h
new file mode 100644
index 000..2e06ee8
--- /dev/null
+++ b/arch/powerpc/include/asm/kvm_book3s_64_asm.h
@@ -0,0 +1,58 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#ifndef __ASM_KVM_BOOK3S_ASM_H__
+#define __ASM_KVM_BOOK3S_ASM_H__
+
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+
+#include <asm/kvm_asm.h>
+
+.macro DO_KVM intno
+   .if (\intno == BOOK3S_INTERRUPT_SYSTEM_RESET) || \
+   (\intno == BOOK3S_INTERRUPT_MACHINE_CHECK) || \
+   (\intno == BOOK3S_INTERRUPT_DATA_STORAGE) || \
+   (\intno == BOOK3S_INTERRUPT_INST_STORAGE) || \
+   (\intno == BOOK3S_INTERRUPT_DATA_SEGMENT) || \
+   (\intno == BOOK3S_INTERRUPT_INST_SEGMENT) || \
+   (\intno == BOOK3S_INTERRUPT_EXTERNAL) || \
+   (\intno == BOOK3S_INTERRUPT_ALIGNMENT) || \
+   (\intno == BOOK3S_INTERRUPT_PROGRAM) || \
+   (\intno == BOOK3S_INTERRUPT_FP_UNAVAIL) || \
+   (\intno == BOOK3S_INTERRUPT_DECREMENTER) || \
+   (\intno == BOOK3S_INTERRUPT_SYSCALL) || \
+   (\intno == BOOK3S_INTERRUPT_TRACE) || \
+   (\intno == BOOK3S_INTERRUPT_PERFMON) || \
+   (\intno == BOOK3S_INTERRUPT_ALTIVEC) || \
+   (\intno == BOOK3S_INTERRUPT_VSX)
+
+   b   kvmppc_trampoline_\intno
+kvmppc_resume_\intno:
+
+   .endif
+.endm
+
+#else
+
+.macro DO_KVM intno
+.endm
+
+#endif /* CONFIG_KVM_BOOK3S_64_HANDLER */
+
+#endif /* __ASM_KVM_BOOK3S_ASM_H__ */
-- 
1.6.0.2



[PATCH 09/27] Add interrupt handling code

2009-10-21 Thread Alexander Graf
Getting from host state to the guest is only half the story. We also need
to return to our host context and handle whatever happened to get us out of
the guest.

On PowerPC every guest exit is an interrupt. So all we need to do is trap
the host's interrupt handlers and get into our #VMEXIT code to handle it.

PowerPCs also have a register that can add an offset to the interrupt handlers'
addresses, which is what the booke KVM code uses. Unfortunately that is a
hypervisor resource and we also want to be able to run KVM when we're running
in an LPAR. So we have to hook into the Linux interrupt handlers.
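
The decision the hook has to make can be sketched in plain C. This is only an analogue of the control flow in the trampoline macro from this patch (check a flag byte in the PACA, fall back to the Linux handler if it is clear, otherwise clear it and head for the KVM exit path). struct demo_paca, enum demo_target and demo_trampoline() are hypothetical names; the real code is assembly and also has to save and restore registers, which the sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU PACA; only the flag matters here. */
struct demo_paca {
	uint8_t kvm_in_guest;
};

/* Where the trampoline ends up branching to. */
enum demo_target {
	DEMO_LINUX_HANDLER,	/* interrupt came from a Linux process */
	DEMO_KVM_EXIT		/* interrupt came from a KVM guest */
};

/*
 * Mirror of the trampoline's branch: a clear flag means "not in a guest",
 * so control returns to the original Linux handler untouched. A set flag
 * means the interrupt is a guest exit; the flag is cleared ("unset guest
 * state") before jumping to the KVM exit code.
 */
static enum demo_target demo_trampoline(struct demo_paca *paca)
{
	assert(paca != NULL);

	if (paca->kvm_in_guest == 0)
		return DEMO_LINUX_HANDLER;

	paca->kvm_in_guest = 0;
	return DEMO_KVM_EXIT;
}
```

Because the flag is cleared on the first guest exit, a second interrupt taken while KVM is handling that exit is routed to the normal Linux handler.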

Signed-off-by: Alexander Graf ag...@suse.de

---

v3 -> v4:

  - header rename fix
---
 arch/powerpc/kvm/book3s_64_rmhandlers.S |  131 +++
 1 files changed, 131 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_rmhandlers.S

diff --git a/arch/powerpc/kvm/book3s_64_rmhandlers.S 
b/arch/powerpc/kvm/book3s_64_rmhandlers.S
new file mode 100644
index 000..fb7dd2e
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_rmhandlers.S
@@ -0,0 +1,131 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include <asm/ppc_asm.h>
+#include <asm/kvm_asm.h>
+#include <asm/reg.h>
+#include <asm/page.h>
+#include <asm/asm-offsets.h>
+#include <asm/exception-64s.h>
+
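+/*****************************************************************************
+ *                                                                           *
+ *         Real Mode handlers that need to be in low physical memory         *
+ *                                                                           *
+ ****************************************************************************/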
+
+
+.macro INTERRUPT_TRAMPOLINE intno
+
+.global kvmppc_trampoline_\intno
+kvmppc_trampoline_\intno:
+
+   mtspr   SPRN_SPRG_SCRATCH0, r13 /* Save r13 */
+
+   /*
+* First thing to do is to find out if we're coming
+* from a KVM guest or a Linux process.
+*
+* To distinguish, we check a magic byte in the PACA
+*/
+   mfspr   r13, SPRN_SPRG_PACA /* r13 = PACA */
+   std r12, (PACA_EXMC + EX_R12)(r13)
+   mfcr    r12
+   stw r12, (PACA_EXMC + EX_CCR)(r13)
+   lbz r12, PACA_KVM_IN_GUEST(r13)
+   cmpwi   r12, 0
+   bne ..kvmppc_handler_hasmagic_\intno
+   /* No KVM guest? Then jump back to the Linux handler! */
+   lwz r12, (PACA_EXMC + EX_CCR)(r13)
+   mtcr    r12
+   ld  r12, (PACA_EXMC + EX_R12)(r13)
+   mfspr   r13, SPRN_SPRG_SCRATCH0 /* r13 = original r13 */
+   b   kvmppc_resume_\intno    /* Get back original handler */
+
+   /* Now we know we're handling a KVM guest */
+..kvmppc_handler_hasmagic_\intno:
+   /* Unset guest state */
+   li  r12, 0
+   stb r12, PACA_KVM_IN_GUEST(r13)
+
+   std r1, (PACA_EXMC+EX_R9)(r13)
+   std r10, (PACA_EXMC+EX_R10)(r13)
+   std r11, (PACA_EXMC+EX_R11)(r13)
+   std r2, (PACA_EXMC+EX_R13)(r13)
+
+   mfsrr0  r10
+   mfsrr1  r11
+
+   /* Restore R1/R2 so we can handle faults */
+   ld  r1, PACAR1(r13)
+   ld  r2, (PACA_EXMC+EX_SRR0)(r13)
+
+   /* Let's store which interrupt we're handling */
+   li  r12, \intno
+
+   /* Jump into the SLB exit code that goes to the highmem handler */
+   b   kvmppc_handler_trampoline_exit
+
+.endm
+
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_SYSTEM_RESET
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_MACHINE_CHECK
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_DATA_STORAGE
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_DATA_SEGMENT
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_INST_STORAGE
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_INST_SEGMENT
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_EXTERNAL
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_ALIGNMENT
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_PROGRAM
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_FP_UNAVAIL
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_DECREMENTER
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_SYSCALL
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_TRACE
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_PERFMON
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_ALTIVEC
+INTERRUPT_TRAMPOLINE   

[PATCH 01/27] Move dirty logging code to sub-arch

2009-10-21 Thread Alexander Graf
PowerPC code currently handles dirty logging in the generic parts. While this
is fine for returning -ENOTSUPP, we need to be rather target specific
when actually implementing it.

So let's split it to implementation specific code, so we can implement
it for book3s.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c   |5 +
 arch/powerpc/kvm/powerpc.c |5 -
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index e7bf4d0..06f5a9e 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -520,6 +520,11 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
return kvmppc_core_vcpu_translate(vcpu, tr);
 }
 
+int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
+{
+   return -ENOTSUPP;
+}
+
 int __init kvmppc_booke_init(void)
 {
unsigned long ivor[16];
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 5902bbc..4ae3490 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -410,11 +410,6 @@ out:
return r;
 }
 
-int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
-{
-   return -ENOTSUPP;
-}
-
 long kvm_arch_vm_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
 {
-- 
1.6.0.2



[PATCH 10/27] Add book3s.c

2009-10-21 Thread Alexander Graf
This adds the book3s core handling file. Here everything that is generic to
desktop PowerPC cores is handled, including interrupt injections, MSR settings,
etc.

It basically takes over the same role as booke.c for embedded PowerPCs.
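
To give an idea of what interrupt injection amounts to, here is a minimal C sketch. It assumes a cut-down register set (struct vcpu_regs and inject_interrupt() are illustrative names), and it models the MSR reset as a plain store of 0, which is what the 32-bit MMU variant in this series does through kvmppc_set_msr(vcpu, 0).

```c
#include <stdint.h>

/* Hypothetical, reduced guest register state. */
struct vcpu_regs {
    uint64_t pc;    /* guest program counter */
    uint64_t msr;   /* guest machine state register */
    uint64_t srr0;  /* save/restore register 0: interrupted pc */
    uint64_t srr1;  /* save/restore register 1: interrupted msr + flags */
    uint64_t hior;  /* base offset of the guest's interrupt vectors */
};

/*
 * Deliver an interrupt to the guest: save where it was, note why
 * (flags), then continue at the vector with a reset MSR.
 */
static void inject_interrupt(struct vcpu_regs *r, unsigned int vec,
                             uint64_t flags)
{
    r->srr0 = r->pc;
    r->srr1 = r->msr | flags;
    r->pc   = r->hior + vec;
    r->msr  = 0;    /* assumption: stands in for mmu.reset_msr() */
}
```

The guest's own handler later returns with rfi/rfid, which reloads pc and msr from srr0 and srr1.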

Signed-off-by: Alexander Graf ag...@suse.de

---

v3 -> v4:

  - use context_id instead of mm_alloc

v4 -> v5:

  - make pvr 32 bits
---
 arch/powerpc/kvm/book3s.c |  919 +
 1 files changed, 919 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s.c

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
new file mode 100644
index 000..0f4305b
--- /dev/null
+++ b/arch/powerpc/kvm/book3s.c
@@ -0,0 +1,919 @@
+/*
+ * Copyright (C) 2009. SUSE Linux Products GmbH. All rights reserved.
+ *
+ * Authors:
+ *Alexander Graf ag...@suse.de
+ *Kevin Wolf m...@kevin-wolf.de
+ *
+ * Description:
+ * This file is derived from arch/powerpc/kvm/44x.c,
+ * by Hollis Blanchard holl...@us.ibm.com.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kvm_host.h>
+#include <linux/err.h>
+
+#include <asm/reg.h>
+#include <asm/cputable.h>
+#include <asm/cacheflush.h>
+#include <asm/tlbflush.h>
+#include <asm/uaccess.h>
+#include <asm/io.h>
+#include <asm/kvm_ppc.h>
+#include <asm/kvm_book3s.h>
+#include <asm/mmu_context.h>
+#include <linux/sched.h>
+#include <linux/vmalloc.h>
+
+#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
+
+// #define EXIT_DEBUG
+// #define EXIT_DEBUG_SIMPLE
+
+// #define AGGRESSIVE_DEC
+
+struct kvm_stats_debugfs_item debugfs_entries[] = {
+   { "exits",       VCPU_STAT(sum_exits) },
+   { "mmio",        VCPU_STAT(mmio_exits) },
+   { "sig",         VCPU_STAT(signal_exits) },
+   { "sysc",        VCPU_STAT(syscall_exits) },
+   { "inst_emu",    VCPU_STAT(emulated_inst_exits) },
+   { "dec",         VCPU_STAT(dec_exits) },
+   { "ext_intr",    VCPU_STAT(ext_intr_exits) },
+   { "queue_intr",  VCPU_STAT(queue_intr) },
+   { "halt_wakeup", VCPU_STAT(halt_wakeup) },
+   { "pf_storage",  VCPU_STAT(pf_storage) },
+   { "sp_storage",  VCPU_STAT(sp_storage) },
+   { "pf_instruc",  VCPU_STAT(pf_instruc) },
+   { "sp_instruc",  VCPU_STAT(sp_instruc) },
+   { "ld",          VCPU_STAT(ld) },
+   { "ld_slow",     VCPU_STAT(ld_slow) },
+   { "st",          VCPU_STAT(st) },
+   { "st_slow",     VCPU_STAT(st_slow) },
+   { NULL }
+};
+
+void kvmppc_core_load_host_debugstate(struct kvm_vcpu *vcpu)
+{
+}
+
+void kvmppc_core_load_guest_debugstate(struct kvm_vcpu *vcpu)
+{
+}
+
+void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
+{
+   memcpy(get_paca()->kvm_slb, to_book3s(vcpu)->slb_shadow, sizeof(get_paca()->kvm_slb));
+   get_paca()->kvm_slb_max = to_book3s(vcpu)->slb_shadow_max;
+}
+
+void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
+{
+   memcpy(to_book3s(vcpu)->slb_shadow, get_paca()->kvm_slb, sizeof(get_paca()->kvm_slb));
+   to_book3s(vcpu)->slb_shadow_max = get_paca()->kvm_slb_max;
+}
+
+#if defined(AGGRESSIVE_DEC) || defined(EXIT_DEBUG)
+static u32 kvmppc_get_dec(struct kvm_vcpu *vcpu)
+{
+   u64 jd = mftb() - vcpu->arch.dec_jiffies;
+   return vcpu->arch.dec - jd;
+}
+#endif
+
+void kvmppc_set_msr(struct kvm_vcpu *vcpu, u64 msr)
+{
+   ulong old_msr = vcpu->arch.msr;
+
+#ifdef EXIT_DEBUG
+   printk(KERN_INFO "KVM: Set MSR to 0x%llx\n", msr);
+#endif
+   msr &= to_book3s(vcpu)->msr_mask;
+   vcpu->arch.msr = msr;
+   vcpu->arch.shadow_msr = msr | MSR_USER32;
+   vcpu->arch.shadow_msr &= ( MSR_VEC | MSR_VSX | MSR_FP | MSR_FE0 |
+                              MSR_USER64 | MSR_SE | MSR_BE | MSR_DE |
+                              MSR_FE1);
+
+   if (msr & (MSR_WE|MSR_POW)) {
+       if (!vcpu->arch.pending_exceptions) {
+           kvm_vcpu_block(vcpu);
+           vcpu->stat.halt_wakeup++;
+       }
+   }
+
+   if (((vcpu->arch.msr & (MSR_IR|MSR_DR)) != (old_msr & (MSR_IR|MSR_DR))) ||
+       (vcpu->arch.msr & MSR_PR) != (old_msr & MSR_PR)) {
+       kvmppc_mmu_flush_segments(vcpu);
+       kvmppc_mmu_map_segment(vcpu, vcpu->arch.pc);
+   }
+}
+
+void kvmppc_inject_interrupt(struct kvm_vcpu *vcpu, int vec, u64 flags)
+{
+   vcpu->arch.srr0 = vcpu->arch.pc;
+   vcpu->arch.srr1 = vcpu->arch.msr | flags;
+   vcpu->arch.pc = to_book3s(vcpu)->hior + vec;
+   vcpu->arch.mmu.reset_msr(vcpu);
+}
+
+void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int vec)
+{
+   unsigned int prio;
+
+   vcpu->stat.queue_intr++;
+   switch (vec) {
+   case 0x100: prio = BOOK3S_IRQPRIO_SYSTEM_RESET;     break;
+   case 0x200: prio = BOOK3S_IRQPRIO_MACHINE_CHECK;    break;
+   case 0x300: prio = BOOK3S_IRQPRIO_DATA_STORAGE;     break;
+   case 0x380: prio = 

[PATCH 26/27] Use Little Endian for Dirty Bitmap

2009-10-21 Thread Alexander Graf
We currently use host endian long types to store information
in the dirty bitmap.

This works reasonably well on Little Endian targets, because the
u32 after the first contains the next 32 bits. On Big Endian this
breaks completely though, forcing us to be inventive here.

So Ben suggested always using Little Endian, which looks reasonable.

We only have dirty bitmap implemented in Little Endian targets so far
and since PowerPC would be the first Big Endian platform, we can just
as well switch to Little Endian always with little effort without
breaking existing targets.
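
To illustrate why a fixed little-endian layout sidesteps the problem, here is a small userspace sketch that addresses the bitmap byte by byte. The helper names (le_test_bit, le_set_bit, mark_dirty) are made up for this example and are not the kernel's generic_*_le_bit helpers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Little-endian bit numbering over a byte array: bit nr lives in byte
 * nr / 8, at position nr % 8. The layout is identical on any host
 * endianness because no multi-byte loads or stores are involved.
 */
static int le_test_bit(size_t nr, const uint8_t *bitmap)
{
    return (bitmap[nr / 8] >> (nr % 8)) & 1;
}

static void le_set_bit(size_t nr, uint8_t *bitmap)
{
    bitmap[nr / 8] |= (uint8_t)(1u << (nr % 8));
}

/* Same "avoid RMW" pattern as the patch: only write if still clear. */
static void mark_dirty(size_t rel_gfn, uint8_t *bitmap)
{
    if (!le_test_bit(rel_gfn, bitmap))
        le_set_bit(rel_gfn, bitmap);
}
```

With this layout, page 33 always ends up as bit 1 of byte 4, so a consumer reading the bitmap as u32 chunks in little-endian order sees the next 32 pages in each successive word.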

Signed-off-by: Alexander Graf ag...@suse.de
---
 virt/kvm/kvm_main.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 54a272f..c565e5b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -49,6 +49,7 @@
 #include <asm/io.h>
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
+#include <asm-generic/bitops/le.h>
 
 #ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
 #include "coalesced_mmio.h"
@@ -1071,8 +1072,8 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
        unsigned long rel_gfn = gfn - memslot->base_gfn;
 
        /* avoid RMW */
-       if (!test_bit(rel_gfn, memslot->dirty_bitmap))
-           set_bit(rel_gfn, memslot->dirty_bitmap);
+       if (!generic_test_le_bit(rel_gfn, memslot->dirty_bitmap))
+           generic___set_le_bit(rel_gfn, memslot->dirty_bitmap);
}
 }
 
-- 
1.6.0.2



[PATCH 25/27] Fix trace.h

2009-10-21 Thread Alexander Graf
It looks like the variable pc is already defined elsewhere. At least the
current code always failed to build for me, stating that pc is already defined.

Let's use _pc instead, because that doesn't collide.

Is this the right approach? Does it break on 440 too? If not, why not?

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/trace.h |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/trace.h b/arch/powerpc/kvm/trace.h
index 67f219d..a8e8400 100644
--- a/arch/powerpc/kvm/trace.h
+++ b/arch/powerpc/kvm/trace.h
@@ -12,8 +12,8 @@
  * Tracepoint for guest mode entry.
  */
 TRACE_EVENT(kvm_ppc_instr,
-   TP_PROTO(unsigned int inst, unsigned long pc, unsigned int emulate),
-   TP_ARGS(inst, pc, emulate),
+   TP_PROTO(unsigned int inst, unsigned long _pc, unsigned int emulate),
+   TP_ARGS(inst, _pc, emulate),
 
TP_STRUCT__entry(
__field(unsigned int,   inst)
@@ -23,7 +23,7 @@ TRACE_EVENT(kvm_ppc_instr,
 
TP_fast_assign(
        __entry->inst       = inst;
-       __entry->pc         = pc;
+       __entry->pc         = _pc;
        __entry->emulate    = emulate;
),
 
-- 
1.6.0.2



[PATCH 21/27] Export KVM symbols for module

2009-10-21 Thread Alexander Graf
To be able to keep KVM as a module, we need to export the SLB trampoline
addresses to the module, so it knows where to jump to.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_64_exports.c |   24 
 1 files changed, 24 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_exports.c

diff --git a/arch/powerpc/kvm/book3s_64_exports.c b/arch/powerpc/kvm/book3s_64_exports.c
new file mode 100644
index 000..5b2db38
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_exports.c
@@ -0,0 +1,24 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include <linux/module.h>
+#include <asm/kvm_book3s.h>
+
+EXPORT_SYMBOL_GPL(kvmppc_trampoline_enter);
+EXPORT_SYMBOL_GPL(kvmppc_trampoline_lowmem);
-- 
1.6.0.2



[PATCH 03/27] Add Book3s definitions

2009-10-21 Thread Alexander Graf
We need quite a bunch of new constants for KVM on Book3s,
so let's define them now.

These constants will be used in later patches.

Signed-off-by: Alexander Graf ag...@suse.de

---

v3 -> v4:

  - remove old kernel compat code
---
 arch/powerpc/include/asm/kvm_asm.h |   39 
 1 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_asm.h b/arch/powerpc/include/asm/kvm_asm.h
index 56bfae5..19ddb35 100644
--- a/arch/powerpc/include/asm/kvm_asm.h
+++ b/arch/powerpc/include/asm/kvm_asm.h
@@ -49,6 +49,45 @@
 #define BOOKE_INTERRUPT_SPE_FP_ROUND 34
 #define BOOKE_INTERRUPT_PERFORMANCE_MONITOR 35
 
+/* book3s */
+
+#define BOOK3S_INTERRUPT_SYSTEM_RESET      0x100
+#define BOOK3S_INTERRUPT_MACHINE_CHECK     0x200
+#define BOOK3S_INTERRUPT_DATA_STORAGE      0x300
+#define BOOK3S_INTERRUPT_DATA_SEGMENT      0x380
+#define BOOK3S_INTERRUPT_INST_STORAGE      0x400
+#define BOOK3S_INTERRUPT_INST_SEGMENT      0x480
+#define BOOK3S_INTERRUPT_EXTERNAL          0x500
+#define BOOK3S_INTERRUPT_ALIGNMENT         0x600
+#define BOOK3S_INTERRUPT_PROGRAM           0x700
+#define BOOK3S_INTERRUPT_FP_UNAVAIL        0x800
+#define BOOK3S_INTERRUPT_DECREMENTER       0x900
+#define BOOK3S_INTERRUPT_SYSCALL           0xc00
+#define BOOK3S_INTERRUPT_TRACE             0xd00
+#define BOOK3S_INTERRUPT_PERFMON           0xf00
+#define BOOK3S_INTERRUPT_ALTIVEC           0xf20
+#define BOOK3S_INTERRUPT_VSX               0xf40
+
+#define BOOK3S_IRQPRIO_SYSTEM_RESET        0
+#define BOOK3S_IRQPRIO_DATA_SEGMENT        1
+#define BOOK3S_IRQPRIO_INST_SEGMENT        2
+#define BOOK3S_IRQPRIO_DATA_STORAGE        3
+#define BOOK3S_IRQPRIO_INST_STORAGE        4
+#define BOOK3S_IRQPRIO_ALIGNMENT           5
+#define BOOK3S_IRQPRIO_PROGRAM             6
+#define BOOK3S_IRQPRIO_FP_UNAVAIL          7
+#define BOOK3S_IRQPRIO_ALTIVEC             8
+#define BOOK3S_IRQPRIO_VSX                 9
+#define BOOK3S_IRQPRIO_SYSCALL             10
+#define BOOK3S_IRQPRIO_MACHINE_CHECK       11
+#define BOOK3S_IRQPRIO_DEBUG               12
+#define BOOK3S_IRQPRIO_EXTERNAL            13
+#define BOOK3S_IRQPRIO_DECREMENTER         14
+#define BOOK3S_IRQPRIO_PERFORMANCE_MONITOR 15
+#define BOOK3S_IRQPRIO_MAX                 16
+
+#define BOOK3S_HFLAG_DCBZ32                0x1
+
 #define RESUME_FLAG_NV          (1<<0)  /* Reload guest nonvolatile state? */
 #define RESUME_FLAG_HOST        (1<<1)  /* Resume host? */
 
-- 
1.6.0.2



[PATCH 22/27] Add fields to PACA

2009-10-21 Thread Alexander Graf
For KVM we need to store some information in the PACA, so we
need to extend it.

This patch adds KVM SLB shadow related entries to the PACA and
a field that indicates if we're inside a guest.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/paca.h |9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 7d8514c..5e9b4ef 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -129,6 +129,15 @@ struct paca_struct {
    u64 system_time;    /* accumulated system TB ticks */
u64 startpurr;  /* PURR/TB value snapshot */
u64 startspurr; /* SPURR value snapshot */
+
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+   struct  {
+   u64 esid;
+   u64 vsid;
+   } kvm_slb[64];  /* guest SLB */
+   u8 kvm_slb_max; /* highest used guest slb entry */
+   u8 kvm_in_guest;    /* are we inside the guest? */
+#endif
 };
 
 extern struct paca_struct paca[];
-- 
1.6.0.2



[PATCH 20/27] Split init_new_context and destroy_context

2009-10-21 Thread Alexander Graf
For KVM we need to allocate a new context id, but don't really care about
all the mm context around it.

So let's split the alloc and destroy functions for the context id, so we can
grab one without allocating an mm context.
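
The shape of the split can be shown with a toy userspace allocator. Everything here is an assumption made for illustration: the fixed-size pool replaces the kernel's IDR and locking, and the names (alloc_context_id, free_context_id, struct mm_sketch) are not the ones the patch introduces.

```c
#include <stdbool.h>

#define NO_CONTEXT   0
#define MAX_CONTEXT  7          /* tiny pool, for illustration only */

static bool id_in_use[MAX_CONTEXT + 1];

struct mm_sketch { int context_id; };

/* Low-level half: hand out a free id, or -1 if the pool is exhausted. */
static int alloc_context_id(void)
{
    for (int id = NO_CONTEXT + 1; id <= MAX_CONTEXT; id++) {
        if (!id_in_use[id]) {
            id_in_use[id] = true;
            return id;
        }
    }
    return -1;
}

/* Low-level half: give an id back to the pool. */
static void free_context_id(int id)
{
    if (id > NO_CONTEXT && id <= MAX_CONTEXT)
        id_in_use[id] = false;
}

/* High-level half: wraps the id allocator and fills in the mm. */
static int init_context(struct mm_sketch *mm)
{
    int id = alloc_context_id();

    if (id < 0)
        return id;
    mm->context_id = id;
    return 0;
}

static void destroy_context_sketch(struct mm_sketch *mm)
{
    free_context_id(mm->context_id);
    mm->context_id = NO_CONTEXT;
}
```

A caller such as KVM can then use only the low-level pair and never touch an mm, while existing mm users keep their old entry points.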

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/mmu_context.h |5 +
 arch/powerpc/mm/mmu_context_hash64.c   |   24 +---
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index b34e94d..66b35d0 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -23,6 +23,11 @@ extern void switch_slb(struct task_struct *tsk, struct mm_struct *mm);
 extern void set_context(unsigned long id, pgd_t *pgd);
 
 #ifdef CONFIG_PPC_BOOK3S_64
+extern int __init_new_context(void);
+extern void __destroy_context(int context_id);
+#endif
+
+#ifdef CONFIG_PPC_BOOK3S_64
 static inline void mmu_context_init(void) { }
 #else
 extern void mmu_context_init(void);
diff --git a/arch/powerpc/mm/mmu_context_hash64.c b/arch/powerpc/mm/mmu_context_hash64.c
index dbeb86a..b9e4cc2 100644
--- a/arch/powerpc/mm/mmu_context_hash64.c
+++ b/arch/powerpc/mm/mmu_context_hash64.c
@@ -18,6 +18,7 @@
 #include <linux/mm.h>
 #include <linux/spinlock.h>
 #include <linux/idr.h>
+#include <linux/module.h>
 
 #include <asm/mmu_context.h>
 
@@ -32,7 +33,7 @@ static DEFINE_IDR(mmu_context_idr);
 #define NO_CONTEXT 0
 #define MAX_CONTEXT    ((1UL << 19) - 1)
 
-int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+int __init_new_context(void)
 {
int index;
int err;
@@ -57,6 +58,18 @@ again:
return -ENOMEM;
}
 
+   return index;
+}
+EXPORT_SYMBOL_GPL(__init_new_context);
+
+int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+{
+   int index;
+
+   index = __init_new_context();
+   if (index < 0)
+   return index;
+
/* The old code would re-promote on fork, we don't do that
 * when using slices as it could cause problem promoting slices
 * that have been forced down to 4K
@@ -68,11 +81,16 @@ again:
return 0;
 }
 
-void destroy_context(struct mm_struct *mm)
+void __destroy_context(int context_id)
 {
    spin_lock(&mmu_context_lock);
-   idr_remove(&mmu_context_idr, mm->context.id);
+   idr_remove(&mmu_context_idr, context_id);
    spin_unlock(&mmu_context_lock);
+}
+EXPORT_SYMBOL_GPL(__destroy_context);
 
+void destroy_context(struct mm_struct *mm)
+{
+   __destroy_context(mm->context.id);
    mm->context.id = NO_CONTEXT;
 }
-- 
1.6.0.2



[PATCH 18/27] Add Book3s_64 offsets to asm-offsets.c

2009-10-21 Thread Alexander Graf
We need to access some VCPU fields from assembly code. In order to get
the proper offsets, we have to define them in asm-offsets.c.
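
For readers unfamiliar with the mechanism, the idea behind these definitions is that offsetof() turns a C struct member into a plain number that assembly can use as a load/store displacement. The sketch below uses a made-up, much-reduced layout (struct vcpu_sketch is not the real struct kvm_vcpu) and an enum instead of the kernel's DEFINE() machinery.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much-reduced vcpu layout for illustration. */
struct vcpu_arch_sketch {
    uint64_t host_r2;
    uint64_t host_msr;
    uint64_t shadow_msr;
};

struct vcpu_sketch {
    int cpu;
    struct vcpu_arch_sketch arch;
};

/*
 * Byte offsets of nested members from the start of the outer struct.
 * Assembly could use them as displacements off a register that holds
 * the vcpu pointer.
 */
enum {
    VCPU_HOST_R2    = offsetof(struct vcpu_sketch, arch.host_r2),
    VCPU_HOST_MSR   = offsetof(struct vcpu_sketch, arch.host_msr),
    VCPU_SHADOW_MSR = offsetof(struct vcpu_sketch, arch.shadow_msr)
};
```

Because the compiler computes the values, the assembly stays correct when fields are added or reordered in the C struct.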

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kernel/asm-offsets.c |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 0812b0f..aba3ea6 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -398,6 +398,19 @@ int main(void)
DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
+
+   /* book3s_64 */
+#ifdef CONFIG_PPC64
+   DEFINE(VCPU_FAULT_DSISR, offsetof(struct kvm_vcpu, arch.fault_dsisr));
+   DEFINE(VCPU_HOST_RETIP, offsetof(struct kvm_vcpu, arch.host_retip));
+   DEFINE(VCPU_HOST_R2, offsetof(struct kvm_vcpu, arch.host_r2));
+   DEFINE(VCPU_HOST_MSR, offsetof(struct kvm_vcpu, arch.host_msr));
+   DEFINE(VCPU_SHADOW_MSR, offsetof(struct kvm_vcpu, arch.shadow_msr));
+   DEFINE(VCPU_TRAMPOLINE_LOWMEM, offsetof(struct kvm_vcpu, arch.trampoline_lowmem));
+   DEFINE(VCPU_TRAMPOLINE_ENTER, offsetof(struct kvm_vcpu, arch.trampoline_enter));
+   DEFINE(VCPU_HIGHMEM_HANDLER, offsetof(struct kvm_vcpu, arch.highmem_handler));
+   DEFINE(VCPU_HFLAGS, offsetof(struct kvm_vcpu, arch.hflags));
+#endif
 #endif
 #ifdef CONFIG_44x
DEFINE(PGD_T_LOG2, PGD_T_LOG2);
-- 
1.6.0.2



[PATCH 15/27] Add mfdec emulation

2009-10-21 Thread Alexander Graf
We support setting the DEC to a certain value right now. Doing that basically
triggers the CPU local timer.

But there's also an mfdec instruction that enables the OS to read the decrementer.

This is required at least by all desktop and server PowerPC Linux kernels. It
can't really hurt to allow embedded ones to do it as well though.
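
The arithmetic behind the emulated read is small enough to sketch in plain C. The names below (struct dec_state, emulate_mtdec, emulate_mfdec) are illustrative only, and the timebase is passed in as a parameter where the kernel would call mftb().

```c
#include <stdint.h>

/* Hypothetical per-vcpu decrementer state. */
struct dec_state {
    uint32_t dec;        /* value the guest last wrote to DEC */
    uint64_t tb_at_set;  /* timebase reading taken at that write */
};

/* Guest writes DEC: remember the value and when it was written. */
static void emulate_mtdec(struct dec_state *s, uint32_t val, uint64_t tb_now)
{
    s->dec = val;
    s->tb_at_set = tb_now;
}

/*
 * Guest reads DEC: the decrementer counts down at the timebase rate, so
 * the current value is the written value minus the ticks elapsed since.
 * Unsigned arithmetic wraps, which mimics the register going "negative".
 */
static uint32_t emulate_mfdec(const struct dec_state *s, uint64_t tb_now)
{
    uint64_t elapsed = tb_now - s->tb_at_set;

    return (uint32_t)(s->dec - elapsed);
}
```

For example, a guest that wrote 1000 at timebase 5000 and reads back at timebase 5250 sees 750.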

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/emulate.c |   13 -
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index 7737146..50d411d 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -66,12 +66,14 @@
 
 void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
 {
+   unsigned long nr_jiffies;
+
    if (vcpu->arch.tcr & TCR_DIE) {
        /* The decrementer ticks at the same rate as the timebase, so
         * that's how we convert the guest DEC value to the number of
         * host ticks. */
-       unsigned long nr_jiffies;
 
+       vcpu->arch.dec_jiffies = mftb();
        nr_jiffies = vcpu->arch.dec / tb_ticks_per_jiffy;
        mod_timer(&vcpu->arch.dec_timer,
                  get_jiffies_64() + nr_jiffies);
@@ -211,6 +213,15 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
/* Note: SPRG4-7 are user-readable, so we don't get
 * a trap. */
 
+   case SPRN_DEC:
+   {
+               u64 jd = mftb() - vcpu->arch.dec_jiffies;
+               vcpu->arch.gpr[rt] = vcpu->arch.dec - jd;
+#ifdef DEBUG_EMUL
+               printk(KERN_INFO "mfDEC: %x - %llx = %lx\n", vcpu->arch.dec, jd, vcpu->arch.gpr[rt]);
+#endif
+   break;
+   }
default:
                emulated = kvmppc_core_emulate_mfspr(vcpu, sprn, rt);
if (emulated == EMULATE_FAIL) {
-- 
1.6.0.2



[PATCH 17/27] Make head_64.S aware of KVM real mode code

2009-10-21 Thread Alexander Graf
We need to run some KVM trampoline code in real mode. Unfortunately, real mode
only covers 8MB on Cell so we need to squeeze ourselves as low as possible.

Also, we need to trap interrupts to get us back from guest state to host state
without telling Linux about it.

This patch adds interrupt traps and includes the KVM code that requires real
mode in the real mode parts of Linux.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/exception-64s.h |2 ++
 arch/powerpc/kernel/exceptions-64s.S |8 
 arch/powerpc/kernel/head_64.S|7 +++
 3 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index a98653b..57c4000 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -147,6 +147,7 @@
.globl label##_pSeries; \
 label##_pSeries:   \
HMT_MEDIUM; \
+   DO_KVM  n;  \
mtspr   SPRN_SPRG_SCRATCH0,r13; /* save r13 */  \
EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, label##_common)
 
@@ -170,6 +171,7 @@ label##_pSeries:\
.globl label##_pSeries; \
 label##_pSeries:   \
HMT_MEDIUM; \
+   DO_KVM  n;  \
mtspr   SPRN_SPRG_SCRATCH0,r13; /* save r13 */  \
mfspr   r13,SPRN_SPRG_PACA; /* get paca address into r13 */ \
std r9,PACA_EXGEN+EX_R9(r13);   /* save r9, r10 */  \
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 1808876..fc3ead0 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -41,6 +41,7 @@ __start_interrupts:
. = 0x200
 _machine_check_pSeries:
HMT_MEDIUM
+   DO_KVM  0x200
mtspr   SPRN_SPRG_SCRATCH0,r13  /* save r13 */
EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common)
 
@@ -48,6 +49,7 @@ _machine_check_pSeries:
.globl data_access_pSeries
 data_access_pSeries:
HMT_MEDIUM
+   DO_KVM  0x300
mtspr   SPRN_SPRG_SCRATCH0,r13
 BEGIN_FTR_SECTION
mfspr   r13,SPRN_SPRG_PACA
@@ -77,6 +79,7 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_SLB)
.globl data_access_slb_pSeries
 data_access_slb_pSeries:
HMT_MEDIUM
+   DO_KVM  0x380
mtspr   SPRN_SPRG_SCRATCH0,r13
mfspr   r13,SPRN_SPRG_PACA  /* get paca address into r13 */
std r3,PACA_EXSLB+EX_R3(r13)
@@ -115,6 +118,7 @@ data_access_slb_pSeries:
.globl instruction_access_slb_pSeries
 instruction_access_slb_pSeries:
HMT_MEDIUM
+   DO_KVM  0x480
mtspr   SPRN_SPRG_SCRATCH0,r13
mfspr   r13,SPRN_SPRG_PACA  /* get paca address into r13 */
std r3,PACA_EXSLB+EX_R3(r13)
@@ -154,6 +158,7 @@ instruction_access_slb_pSeries:
.globl  system_call_pSeries
 system_call_pSeries:
HMT_MEDIUM
+   DO_KVM  0xc00
 BEGIN_FTR_SECTION
cmpdi   r0,0x1ebe
    beq-    1f
@@ -186,12 +191,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
 * trickery is thus necessary
 */
. = 0xf00
+   DO_KVM  0xf00
b   performance_monitor_pSeries
 
. = 0xf20
+   DO_KVM  0xf20
b   altivec_unavailable_pSeries
 
. = 0xf40
+   DO_KVM  0xf40
b   vsx_unavailable_pSeries
 
 #ifdef CONFIG_CBE_RAS
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index c38afdb..9258074 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -37,6 +37,7 @@
 #include <asm/firmware.h>
 #include <asm/page_64.h>
 #include <asm/irqflags.h>
+#include <asm/kvm_book3s_64_asm.h>
 
 /* The physical memory is layed out such that the secondary processor
  * spin code sits at 0x...0x00ff. On server, the vectors follow
@@ -165,6 +166,12 @@ exception_marker:
 #include "exceptions-64s.S"
 #endif
 
+/* KVM trampoline code needs to be close to the interrupt handlers */
+
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+#include "../kvm/book3s_64_rmhandlers.S"
+#endif
+
 _GLOBAL(generic_secondary_thread_init)
mr  r24,r3
 
-- 
1.6.0.2



[PATCH 13/27] Add book3s_32 guest MMU

2009-10-21 Thread Alexander Graf
This patch adds an implementation for a G3/G4 MMU, so we can run G3 and
G4 guests in KVM on Book3s_64.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_32_mmu.c |  354 ++
 1 files changed, 354 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_32_mmu.c

diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
new file mode 100644
index 000..134c186
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -0,0 +1,354 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+#include <linux/highmem.h>
+
+#include <asm/tlbflush.h>
+#include <asm/kvm_ppc.h>
+#include <asm/kvm_book3s.h>
+
+// #define DEBUG_MMU
+// #define DEBUG_MMU_PTE
+// #define DEBUG_MMU_PTE_IP 0xfff14c40
+
+static int kvmppc_mmu_book3s_32_xlate_bat(struct kvm_vcpu *vcpu, gva_t eaddr, struct kvmppc_pte *pte, bool data);
+
+static struct kvmppc_sr *kvmppc_mmu_book3s_32_find_sr(
+   struct kvmppc_vcpu_book3s *vcpu_book3s, gva_t eaddr)
+{
+   return &vcpu_book3s->sr[(eaddr >> 28) & 0xf];
+}
+
+static u64 kvmppc_mmu_book3s_32_ea_to_vp(struct kvm_vcpu *vcpu, gva_t eaddr, bool data)
+{
+   struct kvmppc_sr *sre = kvmppc_mmu_book3s_32_find_sr(to_book3s(vcpu), eaddr);
+   struct kvmppc_pte pte;
+
+   if (!kvmppc_mmu_book3s_32_xlate_bat(vcpu, eaddr, &pte, data))
+       return pte.vpage;
+
+   return (((u64)eaddr >> 12) & 0x) | (((u64)sre->vsid) << 16);
+}
+
+static void kvmppc_mmu_book3s_32_reset_msr(struct kvm_vcpu *vcpu)
+{
+   kvmppc_set_msr(vcpu, 0);
+}
+
+static hva_t kvmppc_mmu_book3s_32_get_pteg(struct kvmppc_vcpu_book3s *vcpu_book3s,
+ struct kvmppc_sr *sre, gva_t eaddr,
+ bool primary)
+{
+   u32 page, hash, pteg, htabmask;
+   hva_t r;
+
+   page = (eaddr & 0x0FFF) >> 12;
+   htabmask = ((vcpu_book3s->sdr1 & 0x1FF) << 16) | 0xFFC0;
+
+   hash = ((sre->vsid ^ page) << 6);
+   if (!primary)
+       hash = ~hash;
+   hash &= htabmask;
+
+   pteg = (vcpu_book3s->sdr1 & 0x) | hash;
+
+#ifdef DEBUG_MMU
+   printk(KERN_INFO "MMU: pc=0x%lx eaddr=0x%lx sdr1=0x%llx pteg=0x%x vsid=0x%x\n",
+       vcpu_book3s->vcpu.arch.pc, eaddr, vcpu_book3s->sdr1, pteg, sre->vsid);
+#endif
+
+   r = gfn_to_hva(vcpu_book3s->vcpu.kvm, pteg >> PAGE_SHIFT);
+   if (kvm_is_error_hva(r))
+       return r;
+   return r | (pteg & ~PAGE_MASK);
+}
+
+static u32 kvmppc_mmu_book3s_32_get_ptem(struct kvmppc_sr *sre, gva_t eaddr,
+   bool primary)
+{
+   return ((eaddr & 0x0fff) >> 22) | (sre->vsid << 7) |
+          (primary ? 0 : 0x40) | 0x8000;
+}
+
+static int kvmppc_mmu_book3s_32_xlate_bat(struct kvm_vcpu *vcpu, gva_t eaddr, struct kvmppc_pte *pte, bool data)
+{
+   struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+   struct kvmppc_bat *bat;
+   int i;
+
+   for (i = 0; i < 8; i++) {
+       if (data)
+           bat = &vcpu_book3s->dbat[i];
+       else
+           bat = &vcpu_book3s->ibat[i];
+
+       if (vcpu->arch.msr & MSR_PR) {
+           if (!bat->vp)
+               continue;
+       } else {
+           if (!bat->vs)
+               continue;
+       }
+
+#ifdef DEBUG_MMU_PTE
+#ifdef DEBUG_MMU_PTE_IP
+       if (vcpu->arch.pc == DEBUG_MMU_PTE_IP)
+#endif
+       {
+           printk(KERN_INFO "%cBAT %02d: 0x%lx - 0x%x (0x%x)\n", data ? 'd' : 'i', i, eaddr, bat->bepi, bat->bepi_mask);
+       }
+#endif
+       if ((eaddr & bat->bepi_mask) == bat->bepi) {
+           pte->raddr = bat->brpn | (eaddr & ~bat->bepi_mask);
+           pte->vpage = (eaddr >> 12) | VSID_BAT;
+           pte->may_read = bat->pp;
+           pte->may_write = bat->pp > 1;
+           pte->may_execute = true;
+           if (!pte->may_read) {
+               printk(KERN_INFO "BAT is not readable!\n");
+               continue;
+           }
+   

[PATCH 14/27] Add book3s_64 specific opcode emulation

2009-10-21 Thread Alexander Graf
There are generic parts of PowerPC that can be shared across all
implementations and specific parts that only apply to BookE or desktop PPCs.

This patch adds emulation for desktop specific opcodes that don't apply
to BookE CPUs.
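
For anyone following along, here is a minimal standalone sketch of how the
dispatch in this patch keys off the instruction word. The sketch_* names are
mine and only stand in for the get_op()/get_xop() helpers the patch uses; I am
assuming the usual PowerPC layout (primary opcode in the top 6 bits, extended
opcode in the 10 bits just above the lowest bit).

```c
#include <stdint.h>

/* Hypothetical stand-in for get_op(): primary opcode, top 6 bits. */
static inline uint32_t sketch_get_op(uint32_t inst)
{
	return inst >> 26;
}

/* Hypothetical stand-in for get_xop(): extended opcode, bits 1..10. */
static inline uint32_t sketch_get_xop(uint32_t inst)
{
	return (inst >> 1) & 0x3ff;
}

/* Returns 1 if inst carries the given primary/extended opcode pair. */
static inline int sketch_is_op(uint32_t inst, uint32_t op, uint32_t xop)
{
	return sketch_get_op(inst) == op && sketch_get_xop(inst) == xop;
}
```

Fed the dcbz encoding 0x7c0007ec, this yields primary opcode 31 and extended
opcode 1014, which lines up with the comment next to OP_31_XOP_DCBZ about
patching 1014 to 1010.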

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_64_emulate.c |  338 ++
 1 files changed, 338 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_emulate.c

diff --git a/arch/powerpc/kvm/book3s_64_emulate.c b/arch/powerpc/kvm/book3s_64_emulate.c
new file mode 100644
index 000..60cd64a
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_emulate.c
@@ -0,0 +1,338 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include <asm/kvm_ppc.h>
+#include <asm/disassemble.h>
+#include <asm/kvm_book3s.h>
+#include <asm/reg.h>
+
+#define OP_19_XOP_RFID 18
+#define OP_19_XOP_RFI  50
+
+#define OP_31_XOP_MFMSR    83
+#define OP_31_XOP_MTMSR    146
+#define OP_31_XOP_MTMSRD   178
+#define OP_31_XOP_MTSRIN   242
+#define OP_31_XOP_TLBIEL   274
+#define OP_31_XOP_TLBIE    306
+#define OP_31_XOP_SLBMTE   402
+#define OP_31_XOP_SLBIE    434
+#define OP_31_XOP_SLBIA    498
+#define OP_31_XOP_MFSRIN   659
+#define OP_31_XOP_SLBMFEV  851
+#define OP_31_XOP_EIOIO    854
+#define OP_31_XOP_SLBMFEE  915
+// DCBZ is actually 1014, but we patch it to 1010 so we get a trap
+#define OP_31_XOP_DCBZ 1010
+
+int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
+   unsigned int inst, int *advance)
+{
+   int emulated = EMULATE_DONE;
+
+   switch (get_op(inst)) {
+   case 19:
+       switch (get_xop(inst)) {
+       case OP_19_XOP_RFID:
+       case OP_19_XOP_RFI:
+           vcpu->arch.pc = vcpu->arch.srr0;
+           kvmppc_set_msr(vcpu, vcpu->arch.srr1);
+           *advance = 0;
+           break;
+
+       default:
+           emulated = EMULATE_FAIL;
+           break;
+       }
+       break;
+   case 31:
+       switch (get_xop(inst)) {
+       case OP_31_XOP_MFMSR:
+           vcpu->arch.gpr[get_rt(inst)] = vcpu->arch.msr;
+           break;
+       case OP_31_XOP_MTMSRD:
+       {
+           ulong rs = vcpu->arch.gpr[get_rs(inst)];
+           if (inst & 0x1) {
+               vcpu->arch.msr &= ~(MSR_RI | MSR_EE);
+               vcpu->arch.msr |= rs & (MSR_RI | MSR_EE);
+           } else
+               kvmppc_set_msr(vcpu, rs);
+           break;
+       }
+       case OP_31_XOP_MTMSR:
+           kvmppc_set_msr(vcpu, vcpu->arch.gpr[get_rs(inst)]);
+           break;
+       case OP_31_XOP_MFSRIN:
+       {
+           int srnum;
+
+           srnum = (vcpu->arch.gpr[get_rb(inst)] >> 28) & 0xf;
+           if (vcpu->arch.mmu.mfsrin) {
+               u32 sr;
+               sr = vcpu->arch.mmu.mfsrin(vcpu, srnum);
+               vcpu->arch.gpr[get_rt(inst)] = sr;
+           }
+           break;
+       }
+       case OP_31_XOP_MTSRIN:
+           vcpu->arch.mmu.mtsrin(vcpu,
+               (vcpu->arch.gpr[get_rb(inst)] >> 28) & 0xf,
+               vcpu->arch.gpr[get_rs(inst)]);
+           break;
+       case OP_31_XOP_TLBIE:
+       case OP_31_XOP_TLBIEL:
+       {
+           bool large = (inst & 0x0020) ? true : false;
+           ulong addr = vcpu->arch.gpr[get_rb(inst)];
+           vcpu->arch.mmu.tlbie(vcpu, addr, large);
+           break;
+       }
+       case OP_31_XOP_EIOIO:
+           break;
+       case OP_31_XOP_SLBMTE:
+           if (!vcpu->arch.mmu.slbmte)
+               return EMULATE_FAIL;
+
+

[PATCH 11/27] Add book3s_64 Host MMU handling

2009-10-21 Thread Alexander Graf
We designed the Book3S port of KVM to be as modular as possible. Most
of the code could easily be used on a Book3S_32 host as well.

The main difference between 32 and 64 bit cores is the MMU. To keep
things well separated, we treat the book3s_64 MMU as one possible compile
option.

This patch adds all the MMU helpers the rest of the code needs in
order to modify the host's MMU, like setting PTEs and segments.
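
To illustrate the flush helpers in this file, here is a minimal sketch of the
masked match they use to decide which shadow PTEs to drop. The struct and
function names are hypothetical and heavily simplified: the real code also
invalidates the host HPTE and releases the pfn, which this sketch leaves out.

```c
#include <stdint.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for a shadow PTE cache entry. */
struct sketch_hpte {
	uint64_t host_va;	/* 0 means the slot is unused */
	uint64_t eaddr;		/* guest effective address */
};

/*
 * Drop every used entry whose guest address matches guest_ea under
 * ea_mask and return how many were dropped. A mask of 0 matches all
 * used entries, i.e. a complete flush.
 */
static size_t sketch_pte_flush(struct sketch_hpte *cache, size_t n,
			       uint64_t guest_ea, uint64_t ea_mask)
{
	size_t i, dropped = 0;

	guest_ea &= ea_mask;
	for (i = 0; i < n; i++) {
		if (!cache[i].host_va)
			continue;
		if ((cache[i].eaddr & ea_mask) == guest_ea) {
			cache[i].host_va = 0;	/* mark the slot free */
			dropped++;
		}
	}
	return dropped;
}
```

Passing ~0xfffULL as the mask flushes a single 4K page; passing 0 flushes
everything, which is the case where the patch also resets the cache offset.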

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_64_mmu_host.c |  412 +
 1 files changed, 412 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_mmu_host.c

diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c
new file mode 100644
index 000..507f770
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -0,0 +1,412 @@
+/*
+ * Copyright (C) 2009 SUSE Linux Products GmbH. All rights reserved.
+ *
+ * Authors:
+ * Alexander Graf ag...@suse.de
+ * Kevin Wolf m...@kevin-wolf.de
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+
+#include <linux/kvm_host.h>
+
+#include <asm/kvm_ppc.h>
+#include <asm/kvm_book3s.h>
+#include <asm/mmu-hash64.h>
+#include <asm/machdep.h>
+#include <asm/mmu_context.h>
+#include <asm/hw_irq.h>
+
+#define PTE_SIZE 12
+#define VSID_ALL 0
+
+// #define DEBUG_MMU
+// #define DEBUG_SLB
+
+void kvmppc_mmu_pte_flush(struct kvm_vcpu *vcpu, u64 guest_ea, u64 ea_mask)
+{
+   int i;
+
+#ifdef DEBUG_MMU
+   printk(KERN_INFO "KVM: Flushing %d Shadow PTEs: 0x%llx & 0x%llx\n",
+          vcpu->arch.hpte_cache_offset, guest_ea, ea_mask);
+#endif
+   BUG_ON(vcpu->arch.hpte_cache_offset > HPTEG_CACHE_NUM);
+   guest_ea &= ea_mask;
+   for (i=0; i<vcpu->arch.hpte_cache_offset; i++) {
+       struct hpte_cache *pte;
+
+       pte = &vcpu->arch.hpte_cache[i];
+       if (!pte->host_va)
+           continue;
+
+       if ((pte->pte.eaddr & ea_mask) == guest_ea) {
+#ifdef DEBUG_MMU
+           printk(KERN_INFO "KVM: Flushing SPT %d: 0x%llx (0x%llx) -> 0x%llx\n",
+                  i, pte->pte.eaddr, pte->pte.vpage, pte->host_va);
+#endif
+           ppc_md.hpte_invalidate(pte->slot, pte->host_va,
+                                  MMU_PAGE_4K, MMU_SEGSIZE_256M,
+                                  false);
+           pte->host_va = 0;
+           kvm_release_pfn_dirty(pte->pfn);
+       }
+   }
+
+   /* Doing a complete flush - start from scratch */
+   if (!ea_mask)
+       vcpu->arch.hpte_cache_offset = 0;
+}
+
+void kvmppc_mmu_pte_vflush(struct kvm_vcpu *vcpu, u64 guest_vp, u64 vp_mask)
+{
+   int i;
+
+#ifdef DEBUG_MMU
+   printk(KERN_INFO "KVM: Flushing %d Shadow vPTEs: 0x%llx & 0x%llx\n",
+          vcpu->arch.hpte_cache_offset, guest_vp, vp_mask);
+#endif
+   BUG_ON(vcpu->arch.hpte_cache_offset > HPTEG_CACHE_NUM);
+   guest_vp &= vp_mask;
+   for (i=0; i<vcpu->arch.hpte_cache_offset; i++) {
+       struct hpte_cache *pte;
+
+       pte = &vcpu->arch.hpte_cache[i];
+       if (!pte->host_va)
+           continue;
+
+       if ((pte->pte.vpage & vp_mask) == guest_vp) {
+#ifdef DEBUG_MMU
+           printk(KERN_INFO "KVM: Flushing SPT %d: 0x%llx (0x%llx) -> 0x%llx\n",
+                  i, pte->pte.eaddr, pte->pte.vpage, pte->host_va);
+#endif
+           ppc_md.hpte_invalidate(pte->slot, pte->host_va,
+                                  MMU_PAGE_4K, MMU_SEGSIZE_256M,
+                                  false);
+           pte->host_va = 0;
+           kvm_release_pfn_dirty(pte->pfn);
+       }
+   }
+}
+
+void kvmppc_mmu_pte_pflush(struct kvm_vcpu *vcpu, u64 pa_start, u64 pa_end)
+{
+   int i;
+
+#ifdef DEBUG_MMU
+   printk(KERN_INFO "KVM: Flushing %d Shadow pPTEs: 0x%llx & 0x%llx\n",
+          vcpu->arch.hpte_cache_offset, guest_pa, pa_mask);
+#endif
+   BUG_ON(vcpu->arch.hpte_cache_offset > HPTEG_CACHE_NUM);
+
+   for (i=0; i<vcpu->arch.hpte_cache_offset; i++) {
+       struct hpte_cache *pte;
+
+       pte = &vcpu->arch.hpte_cache[i];
+       if (!pte->host_va)
+           continue;
+
+       if ((pte->pte.raddr >= pa_start) && (pte->pte.raddr < pa_end)) {
+#ifdef DEBUG_MMU
+   printk(KERN_INFO KVM: Flushing SPT %d: 

[PATCH 02/27] Pass PVR in sregs

2009-10-21 Thread Alexander Graf
Right now sregs is unused on PPC, so we can use it for initialization
of the CPU.

KVM on BookE always virtualizes the host CPU. On Book3s we go a step further
and take the PVR from userspace that tells us what kind of CPU we are supposed
to virtualize, because we support Book3s_32 and Book3s_64 guests.

In order to get that information, we use the sregs ioctl, because we don't
want to reset the guest CPU on every normal register set.
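
As a quick sanity check on the layout this patch introduces, here is a small
userspace sketch. The struct name is made up and uint32_t stands in for __u32;
the only point is that the 32-bit PVR plus the padding add up to a fixed 1024
bytes, which I assume is meant to leave room for later fields.

```c
#include <stdint.h>
#include <string.h>

/* Userspace mirror of the struct in this patch (uint32_t for __u32). */
struct sketch_kvm_sregs {
	uint32_t pvr;
	char pad[1020];
};

/* 4 bytes of PVR + 1020 bytes of padding = 1024 bytes in total. */
_Static_assert(sizeof(struct sketch_kvm_sregs) == 1024,
	       "sregs sketch should be 1024 bytes");

/* Hypothetical helper: zero the padding and fill in the PVR. */
static inline void sketch_sregs_init(struct sketch_kvm_sregs *sregs,
				     uint32_t pvr)
{
	memset(sregs, 0, sizeof(*sregs));
	sregs->pvr = pvr;
}
```

The PVR value passed in is whatever userspace wants the guest CPU to look
like; the sketch does not interpret it.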

Signed-off-by: Alexander Graf ag...@suse.de

---

v4 -> v5

  - make PVR 32 bits
---
 arch/powerpc/include/asm/kvm.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h
index bb2de6a..c9ca97f 100644
--- a/arch/powerpc/include/asm/kvm.h
+++ b/arch/powerpc/include/asm/kvm.h
@@ -46,6 +46,8 @@ struct kvm_regs {
 };
 
 struct kvm_sregs {
+   __u32 pvr;
+   char pad[1020];
 };
 
 struct kvm_fpu {
-- 
1.6.0.2



[PATCH 19/27] Export symbols for KVM module

2009-10-21 Thread Alexander Graf
We want to be able to build KVM as a module. To make that possible, we
need some more exports from core Linux parts.

This patch exports all functions and variables that are required for KVM.

Signed-off-by: Alexander Graf ag...@suse.de

---

v3 -> v4:

  - don't export switch_slb
  - don't export init_context
  - don't export mm_alloc
---
 arch/powerpc/kernel/ppc_ksyms.c |3 ++-
 arch/powerpc/kernel/time.c  |1 +
 arch/powerpc/mm/hash_utils_64.c |2 ++
 3 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
index c8b27bb..baf778c 100644
--- a/arch/powerpc/kernel/ppc_ksyms.c
+++ b/arch/powerpc/kernel/ppc_ksyms.c
@@ -163,11 +163,12 @@ EXPORT_SYMBOL(screen_info);
 #ifdef CONFIG_PPC32
 EXPORT_SYMBOL(timer_interrupt);
 EXPORT_SYMBOL(irq_desc);
-EXPORT_SYMBOL(tb_ticks_per_jiffy);
 EXPORT_SYMBOL(cacheable_memcpy);
 EXPORT_SYMBOL(cacheable_memzero);
 #endif
 
+EXPORT_SYMBOL(tb_ticks_per_jiffy);
+
 #ifdef CONFIG_PPC32
 EXPORT_SYMBOL(switch_mmu_context);
 #endif
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 92dc844..e05f6af 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -268,6 +268,7 @@ void account_system_vtime(struct task_struct *tsk)
per_cpu(cputime_scaled_last_delta, smp_processor_id()) = deltascaled;
local_irq_restore(flags);
 }
+EXPORT_SYMBOL_GPL(account_system_vtime);
 
 /*
  * Transfer the user and system times accumulated in the paca
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 1ade7eb..2b2a4aa 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -92,6 +92,7 @@ struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
 struct hash_pte *htab_address;
 unsigned long htab_size_bytes;
 unsigned long htab_hash_mask;
+EXPORT_SYMBOL_GPL(htab_hash_mask);
 int mmu_linear_psize = MMU_PAGE_4K;
 int mmu_virtual_psize = MMU_PAGE_4K;
 int mmu_vmalloc_psize = MMU_PAGE_4K;
@@ -102,6 +103,7 @@ int mmu_io_psize = MMU_PAGE_4K;
 int mmu_kernel_ssize = MMU_SEGSIZE_256M;
 int mmu_highuser_ssize = MMU_SEGSIZE_256M;
 u16 mmu_slb_size = 64;
+EXPORT_SYMBOL_GPL(mmu_slb_size);
 #ifdef CONFIG_HUGETLB_PAGE
 unsigned int HPAGE_SHIFT;
 #endif
-- 
1.6.0.2



[PATCH 12/27] Add book3s_64 guest MMU

2009-10-21 Thread Alexander Graf
To be able to run a guest, we also need to implement a guest MMU.

This patch adds MMU handling for Book3s_64 guests.
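
The first step of a guest translation is the SLB lookup, so here is a minimal
sketch of that step in isolation. Everything named sketch_* is hypothetical;
in particular I am assuming 256MB segments use a shift of 28 and 1TB segments
a shift of 40, in place of the GET_ESID()/GET_ESID_1T() macros the patch
calls.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed segment shifts: 256MB (2^28) and 1TB (2^40) segments. */
#define SKETCH_SID_SHIFT	28
#define SKETCH_SID_SHIFT_1T	40

/* Cut-down stand-in for an SLB entry; field order is the sketch's own. */
struct sketch_slb {
	uint64_t esid;
	bool valid;
	bool large;	/* same flag the patch checks before using the 1T ESID */
};

/* Return the first valid entry covering eaddr, or NULL if none does. */
static struct sketch_slb *sketch_find_slbe(struct sketch_slb *slb, size_t nr,
					   uint64_t eaddr)
{
	uint64_t esid = eaddr >> SKETCH_SID_SHIFT;
	uint64_t esid_1t = eaddr >> SKETCH_SID_SHIFT_1T;
	size_t i;

	for (i = 0; i < nr; i++) {
		uint64_t cmp_esid = slb[i].large ? esid_1t : esid;

		if (!slb[i].valid)
			continue;
		if (slb[i].esid == cmp_esid)
			return &slb[i];
	}
	return NULL;
}
```

A NULL result corresponds to the case where the patch's ea_to_vp helper
returns 0 because no segment covers the address.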

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_64_mmu.c |  469 ++
 1 files changed, 469 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_mmu.c

diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c
new file mode 100644
index 000..be9c846
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -0,0 +1,469 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+#include <linux/highmem.h>
+
+#include <asm/tlbflush.h>
+#include <asm/kvm_ppc.h>
+#include <asm/kvm_book3s.h>
+
+// #define DEBUG_MMU
+
+static void kvmppc_mmu_book3s_64_reset_msr(struct kvm_vcpu *vcpu)
+{
+   kvmppc_set_msr(vcpu, MSR_SF);
+}
+
+static struct kvmppc_slb *kvmppc_mmu_book3s_64_find_slbe(struct kvmppc_vcpu_book3s *vcpu_book3s,
+                                                         gva_t eaddr)
+{
+   int i;
+   u64 esid = GET_ESID(eaddr);
+   u64 esid_1t = GET_ESID_1T(eaddr);
+
+   for (i = 0; i < vcpu_book3s->slb_nr; i++) {
+       u64 cmp_esid = esid;
+
+       if (!vcpu_book3s->slb[i].valid)
+           continue;
+
+       if (vcpu_book3s->slb[i].large)
+           cmp_esid = esid_1t;
+
+       if (vcpu_book3s->slb[i].esid == cmp_esid)
+           return &vcpu_book3s->slb[i];
+   }
+
+#ifdef DEBUG_MMU
+   printk(KERN_ERR "KVM: No SLB entry found for 0x%lx [%llx | %llx]\n",
+          eaddr, esid, esid_1t);
+   for (i = 0; i < vcpu_book3s->slb_nr; i++) {
+       if (vcpu_book3s->slb[i].vsid)
+           printk(KERN_ERR "  %d: %c%c %llx %llx\n", i,
+                  vcpu_book3s->slb[i].valid ? 'v' : ' ',
+                  vcpu_book3s->slb[i].large ? 'l' : ' ',
+                  vcpu_book3s->slb[i].esid,
+                  vcpu_book3s->slb[i].vsid);
+   }
+#endif
+
+   return NULL;
+}
+
+static u64 kvmppc_mmu_book3s_64_ea_to_vp(struct kvm_vcpu *vcpu, gva_t eaddr, bool data)
+{
+   struct kvmppc_slb *slb = kvmppc_mmu_book3s_64_find_slbe(to_book3s(vcpu), eaddr);
+
+   if (!slb)
+       return 0;
+
+   if (slb->large)
+       return (((u64)eaddr >> 12) & 0xfff) | (((u64)slb->vsid) << 28);
+
+   return (((u64)eaddr >> 12) & 0x) | (((u64)slb->vsid) << 16);
+}
+
+static int kvmppc_mmu_book3s_64_get_pagesize(struct kvmppc_slb *slbe)
+{
+   return slbe-large ? 24 : 12;
+}
+
+static u32 kvmppc_mmu_book3s_64_get_page(struct kvmppc_slb *slbe, gva_t eaddr)
+{
+   int p = kvmppc_mmu_book3s_64_get_pagesize(slbe);
+   return ((eaddr & 0xfff) >> p);
+}
+
+static hva_t kvmppc_mmu_book3s_64_get_pteg(struct kvmppc_vcpu_book3s *vcpu_book3s,
+                                           struct kvmppc_slb *slbe, gva_t eaddr,
+                                           bool second)
+{
+   u64 hash, pteg, htabsize;
+   u32 page;
+   hva_t r;
+
+   page = kvmppc_mmu_book3s_64_get_page(slbe, eaddr);
+   htabsize = ((1 << ((vcpu_book3s->sdr1 & 0x1f) + 11)) - 1);
+
+   hash = slbe->vsid ^ page;
+   if (second)
+       hash = ~hash;
+   hash &= ((1ULL << 39ULL) - 1ULL);
+   hash &= htabsize;
+   hash <<= 7ULL;
+
+   pteg = vcpu_book3s->sdr1 & 0xfffcULL;
+   pteg |= hash;
+
+#ifdef DEBUG_MMU
+   printk(KERN_INFO "MMU: page=0x%x sdr1=0x%llx pteg=0x%llx vsid=0x%llx\n",
+          page, vcpu_book3s->sdr1, pteg, slbe->vsid);
+#endif
+
+   r = gfn_to_hva(vcpu_book3s->vcpu.kvm, pteg >> PAGE_SHIFT);
+   if (kvm_is_error_hva(r))
+       return r;
+   return r | (pteg & ~PAGE_MASK);
+}
+
+static u64 kvmppc_mmu_book3s_64_get_avpn(struct kvmppc_slb *slbe, gva_t eaddr)
+{
+   int p = kvmppc_mmu_book3s_64_get_pagesize(slbe);
+   u64 avpn;
+
+   avpn = kvmppc_mmu_book3s_64_get_page(slbe, eaddr);
+   avpn |= slbe->vsid << (28 - p);
+
+   if (p < 24)
+       avpn >>= ((80 - p) - 56) - 8;
+   else
+       avpn <<= 8;
+
+   return avpn;
+}
+
+static int 

[PATCH 07/27] Add book3s_64 highmem asm code

2009-10-21 Thread Alexander Graf
This is the entry / exit code. In order to switch between host and guest
context, we need to switch register state and call the exit code handler on
exit.

This assembly file does exactly that. To finally enter the guest it calls
into book3s_64_slb.S. On exit it gets jumped to from book3s_64_slb.S too.
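
The DISABLE_INTERRUPTS macro in this file uses a rotate-and-mask pair that is
not obvious at first sight, so here is a C rendition of it. This is only my
reading of the two instructions (rldicl r0,r0,48,1 followed by
rotldi r0,r0,16), under the assumption that MSR_EE is bit 0x8000; the function
names are made up.

```c
#include <stdint.h>

/* Rotate a 64-bit value left by n bits (n taken modulo 64). */
static inline uint64_t sketch_rotl64(uint64_t v, unsigned int n)
{
	n &= 63;
	return n ? (v << n) | (v >> (64 - n)) : v;
}

/*
 * rldicl rD,rS,48,1: rotate left by 48, then clear the top bit.
 * rotldi rD,rD,16:   rotate left by another 16, i.e. back to where
 *                    we started, with one bit now cleared.
 */
static inline uint64_t sketch_clear_ee(uint64_t msr)
{
	uint64_t t = sketch_rotl64(msr, 48) & ~(1ULL << 63);

	return sketch_rotl64(t, 16);
}
```

The net effect is msr & ~0x8000 without needing a second register for the
mask. As far as I understand it, the following mtmsrd r0,1 then only updates
the EE and RI bits.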

Signed-off-by: Alexander Graf ag...@suse.de

---

v3 -> v4:

  - header rename fix
---
 arch/powerpc/include/asm/kvm_ppc.h  |1 +
 arch/powerpc/kvm/book3s_64_interrupts.S |  392 +++
 2 files changed, 393 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_interrupts.S

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 2c6ee34..269ee46 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -39,6 +39,7 @@ enum emulation_result {
 extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
 extern char kvmppc_handlers_start[];
 extern unsigned long kvmppc_handler_len;
+extern void kvmppc_handler_highmem(void);
 
 extern void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu);
 extern int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
diff --git a/arch/powerpc/kvm/book3s_64_interrupts.S b/arch/powerpc/kvm/book3s_64_interrupts.S
new file mode 100644
index 000..7b55d80
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_interrupts.S
@@ -0,0 +1,392 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include <asm/ppc_asm.h>
+#include <asm/kvm_asm.h>
+#include <asm/reg.h>
+#include <asm/page.h>
+#include <asm/asm-offsets.h>
+#include <asm/exception-64s.h>
+
+#define KVMPPC_HANDLE_EXIT .kvmppc_handle_exit
+#define ULONG_SIZE 8
+#define VCPU_GPR(n) (VCPU_GPRS + (n * ULONG_SIZE))
+
+.macro mfpaca tmp_reg, src_reg, offset, vcpu_reg
+   ld  \tmp_reg, (PACA_EXMC+\offset)(r13)
+   std \tmp_reg, VCPU_GPR(\src_reg)(\vcpu_reg)
+.endm
+
+.macro DISABLE_INTERRUPTS
+   mfmsr   r0
+   rldicl  r0,r0,48,1
+   rotldi  r0,r0,16
+   mtmsrd  r0,1
+.endm
+
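+/*****************************************************************************
+ *                                                                           *
+ *     Guest entry / exit code that is in kernel module memory (highmem)     *
+ *                                                                           *
+ ****************************************************************************/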
+
+/* Registers:
+ *  r3: kvm_run pointer
+ *  r4: vcpu pointer
+ */
+_GLOBAL(__kvmppc_vcpu_entry)
+
+kvm_start_entry:
+   /* Write correct stack frame */
+   mflr    r0
+   std r0,16(r1)
+
+   /* Save host state to the stack */
+   stdu    r1, -SWITCH_FRAME_SIZE(r1)
+
+   /* Save r3 (kvm_run) and r4 (vcpu) */
+   SAVE_2GPRS(3, r1)
+
+   /* Save non-volatile registers (r14 - r31) */
+   SAVE_NVGPRS(r1)
+
+   /* Save LR */
+   mflr    r14
+   std r14, _LINK(r1)
+
+/* XXX optimize non-volatile loading away */
+kvm_start_lightweight:
+
+   DISABLE_INTERRUPTS
+
+   /* Save R1/R2 in the PACA */
+   std r1, PACAR1(r13)
+   std r2, (PACA_EXMC+EX_SRR0)(r13)
+   ld  r3, VCPU_HIGHMEM_HANDLER(r4)
+   std r3, PACASAVEDMSR(r13)
+
+   /* Load non-volatile guest state from the vcpu */
+   ld  r14, VCPU_GPR(r14)(r4)
+   ld  r15, VCPU_GPR(r15)(r4)
+   ld  r16, VCPU_GPR(r16)(r4)
+   ld  r17, VCPU_GPR(r17)(r4)
+   ld  r18, VCPU_GPR(r18)(r4)
+   ld  r19, VCPU_GPR(r19)(r4)
+   ld  r20, VCPU_GPR(r20)(r4)
+   ld  r21, VCPU_GPR(r21)(r4)
+   ld  r22, VCPU_GPR(r22)(r4)
+   ld  r23, VCPU_GPR(r23)(r4)
+   ld  r24, VCPU_GPR(r24)(r4)
+   ld  r25, VCPU_GPR(r25)(r4)
+   ld  r26, VCPU_GPR(r26)(r4)
+   ld  r27, VCPU_GPR(r27)(r4)
+   ld  r28, VCPU_GPR(r28)(r4)
+   ld  r29, VCPU_GPR(r29)(r4)
+   ld  r30, VCPU_GPR(r30)(r4)
+   ld  r31, VCPU_GPR(r31)(r4)
+
+   ld  r9, VCPU_PC(r4)             /* r9 = vcpu->arch.pc */
+   ld  r10, VCPU_SHADOW_MSR(r4)    /* r10 = vcpu->arch.shadow_msr */
+
+   ld  r3, VCPU_TRAMPOLINE_ENTER(r4)
+   mtsrr0  r3
+
+   

[PATCH 08/27] Add SLB switching code for entry/exit

2009-10-21 Thread Alexander Graf
This is the really low level of guest entry/exit code.

Book3s_64 has an SLB, which stores all ESID -> VSID mappings we're
currently aware of.

The segments in the guest differ from the ones on the host, so we need
to switch the SLB to tell the MMU that we're in a new context.

So we store a shadow of the guest's SLB in the PACA, switch to that on
entry and only restore bolted entries on exit, leaving the rest to the
Linux SLB fault handler.

That way we get a really clean way of switching the SLB.
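
For readers who do not want to parse the assembly, the "fill SLB with our
shadow" loop boils down to something like the sketch below. Names are
hypothetical, and the valid bit value 0x08000000 is my reading of the
rldicl. r0, r10, 37, 63 test in the patch. Since plain C cannot issue slbmte,
the sketch just collects the entries that would be loaded.

```c
#include <stdint.h>
#include <stddef.h>

/* Assumed position of the valid bit in the ESID word of an entry. */
#define SKETCH_SLB_ESID_V	0x08000000ULL

/* One shadow SLB entry: ESID word first, VSID word second. */
struct sketch_slb_shadow {
	uint64_t esid;
	uint64_t vsid;
};

/*
 * Walk the shadow array and copy every valid entry to out, in order.
 * Returns the number of entries copied. In the real code each of
 * these would be handed to slbmte instead of being copied.
 */
static size_t sketch_collect_valid(const struct sketch_slb_shadow *shadow,
				   size_t max, struct sketch_slb_shadow *out)
{
	size_t i, n = 0;

	for (i = 0; i < max; i++) {
		if (!(shadow[i].esid & SKETCH_SLB_ESID_V))
			continue;	/* invalid entry, skip it */
		out[n++] = shadow[i];
	}
	return n;
}
```

Invalid slots are skipped rather than written, which matches the
slb_loop_enter_skip branch in the assembly.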

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_64_slb.S |  277 ++
 1 files changed, 277 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_slb.S

diff --git a/arch/powerpc/kvm/book3s_64_slb.S b/arch/powerpc/kvm/book3s_64_slb.S
new file mode 100644
index 000..00a8367
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_slb.S
@@ -0,0 +1,277 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+/**
+ **
+ *   Entry code   *
+ **
+ */
+
+.global kvmppc_handler_trampoline_enter
+kvmppc_handler_trampoline_enter:
+
+   /* Required state:
+*
+* MSR = ~IR|DR
+* R13 = PACA
+* R9 = guest IP
+* R10 = guest MSR
+* R11 = free
+* R12 = free
+* PACA[PACA_EXMC + EX_R9] = guest R9
+* PACA[PACA_EXMC + EX_R10] = guest R10
+* PACA[PACA_EXMC + EX_R11] = guest R11
+* PACA[PACA_EXMC + EX_R12] = guest R12
+* PACA[PACA_EXMC + EX_R13] = guest R13
+* PACA[PACA_EXMC + EX_CCR] = guest CR
+* PACA[PACA_EXMC + EX_R3] = guest XER
+*/
+
+   mtsrr0  r9
+   mtsrr1  r10
+
+   mtspr   SPRN_SPRG_SCRATCH0, r0
+
+   /* Remove LPAR shadow entries */
+
+#if SLB_NUM_BOLTED == 3
+
+   ld  r12, PACA_SLBSHADOWPTR(r13)
+   ld  r10, 0x10(r12)
+   ld  r11, 0x18(r12)
+   /* Invalid? Skip. */
+   rldicl. r0, r10, 37, 63
+   beq slb_entry_skip_1
+   xoris   r9, r10, slb_esi...@h
+   std r9, 0x10(r12)
+slb_entry_skip_1:
+   ld  r9, 0x20(r12)
+   /* Invalid? Skip. */
+   rldicl. r0, r9, 37, 63
+   beq slb_entry_skip_2
+   xoris   r9, r9, slb_esi...@h
+   std r9, 0x20(r12)
+slb_entry_skip_2:
+   ld  r9, 0x30(r12)
+   /* Invalid? Skip. */
+   rldicl. r0, r9, 37, 63
+   beq slb_entry_skip_3
+   xoris   r9, r9, slb_esi...@h
+   std r9, 0x30(r12)
+slb_entry_skip_3:
+   
+#else
+#error unknown number of bolted entries
+#endif
+
+   /* Flush SLB */
+
+   slbia
+
+   /* r0 = esid & ESID_MASK */
+   rldicr  r10, r10, 0, 35
+   /* r0 |= CLASS_BIT(VSID) */
+   rldic   r12, r11, 56 - 36, 36
+   or  r10, r10, r12
+   slbie   r10
+
+   isync
+
+   /* Fill SLB with our shadow */
+
+   lbz r12, PACA_KVM_SLB_MAX(r13)
+   mulli   r12, r12, 16
+   addi    r12, r12, PACA_KVM_SLB
+   add r12, r12, r13
+
+   /* for (r11 = kvm_slb; r11 < kvm_slb + kvm_slb_size; r11+=slb_entry) */
+   li  r11, PACA_KVM_SLB
+   add r11, r11, r13
+
+slb_loop_enter:
+
+   ld  r10, 0(r11)
+
+   rldicl. r0, r10, 37, 63
+   beq slb_loop_enter_skip
+
+   ld  r9, 8(r11)
+   slbmte  r9, r10
+
+slb_loop_enter_skip:
+   addi    r11, r11, 16
+   cmpd    cr0, r11, r12
+   blt slb_loop_enter
+
+slb_do_enter:
+
+   /* Enter guest */
+
+   mfspr   r0, SPRN_SPRG_SCRATCH0
+
+   ld  r9, (PACA_EXMC+EX_R9)(r13)
+   ld  r10, (PACA_EXMC+EX_R10)(r13)
+   ld  r12, (PACA_EXMC+EX_R12)(r13)
+
+   lwz r11, (PACA_EXMC+EX_CCR)(r13)
+   mtcr    r11
+
+   ld  r11, (PACA_EXMC+EX_R3)(r13)
+   mtxer   r11
+
+   ld  r11, (PACA_EXMC+EX_R11)(r13)
+   ld  r13, (PACA_EXMC+EX_R13)(r13)
+
+   RFI
+kvmppc_handler_trampoline_enter_end:
+

[PATCH 04/27] Add Book3s fields to vcpu structs

2009-10-21 Thread Alexander Graf
We need to store more information than we currently have for vcpus
when running on Book3s.

So let's extend the internal struct definitions.

Signed-off-by: Alexander Graf ag...@suse.de

---

v3 -> v4:

  - use context_id instead of mm_context

v4 -> v5:

  - always include pvr in vcpu struct
---
 arch/powerpc/include/asm/kvm_host.h |   73 ++-
 1 files changed, 72 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index c9c930e..2cff5fe 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -37,6 +37,8 @@
 #define KVM_NR_PAGE_SIZES  1
 #define KVM_PAGES_PER_HPAGE(x) (1UL31)
 
+#define HPTEG_CACHE_NUM 1024
+
 struct kvm;
 struct kvm_run;
 struct kvm_vcpu;
@@ -63,6 +65,17 @@ struct kvm_vcpu_stat {
u32 dec_exits;
u32 ext_intr_exits;
u32 halt_wakeup;
+#ifdef CONFIG_PPC64
+   u32 pf_storage;
+   u32 pf_instruc;
+   u32 sp_storage;
+   u32 sp_instruc;
+   u32 queue_intr;
+   u32 ld;
+   u32 ld_slow;
+   u32 st;
+   u32 st_slow;
+#endif
 };
 
 enum kvm_exit_types {
@@ -109,9 +122,53 @@ struct kvmppc_exit_timing {
 struct kvm_arch {
 };
 
+struct kvmppc_pte {
+   u64 eaddr;
+   u64 vpage;
+   u64 raddr;
+   bool may_read;
+   bool may_write;
+   bool may_execute;
+};
+
+struct kvmppc_mmu {
+   /* book3s_64 only */
+   void (*slbmte)(struct kvm_vcpu *vcpu, u64 rb, u64 rs);
+   u64  (*slbmfee)(struct kvm_vcpu *vcpu, u64 slb_nr);
+   u64  (*slbmfev)(struct kvm_vcpu *vcpu, u64 slb_nr);
+   void (*slbie)(struct kvm_vcpu *vcpu, u64 slb_nr);
+   void (*slbia)(struct kvm_vcpu *vcpu);
+   /* book3s */
+   void (*mtsrin)(struct kvm_vcpu *vcpu, u32 srnum, ulong value);
+   u32  (*mfsrin)(struct kvm_vcpu *vcpu, u32 srnum);
+   int  (*xlate)(struct kvm_vcpu *vcpu, gva_t eaddr, struct kvmppc_pte *pte, bool data);
+   void (*reset_msr)(struct kvm_vcpu *vcpu);
+   void (*tlbie)(struct kvm_vcpu *vcpu, ulong addr, bool large);
+   int  (*esid_to_vsid)(struct kvm_vcpu *vcpu, u64 esid, u64 *vsid);
+   u64  (*ea_to_vp)(struct kvm_vcpu *vcpu, gva_t eaddr, bool data);
+   bool (*is_dcbz32)(struct kvm_vcpu *vcpu);
+};
+
+struct hpte_cache {
+   u64 host_va;
+   u64 pfn;
+   ulong slot;
+   struct kvmppc_pte pte;
+};
+
 struct kvm_vcpu_arch {
-   u32 host_stack;
+   ulong host_stack;
u32 host_pid;
+#ifdef CONFIG_PPC64
+   ulong host_msr;
+   ulong host_r2;
+   void *host_retip;
+   ulong trampoline_lowmem;
+   ulong trampoline_enter;
+   ulong highmem_handler;
+   ulong host_paca_phys;
+   struct kvmppc_mmu mmu;
+#endif
 
u64 fpr[32];
ulong gpr[32];
@@ -123,6 +180,10 @@ struct kvm_vcpu_arch {
ulong xer;
 
ulong msr;
+#ifdef CONFIG_PPC64
+   ulong shadow_msr;
+   ulong hflags;
+#endif
u32 mmucr;
ulong sprg0;
ulong sprg1;
@@ -149,6 +210,7 @@ struct kvm_vcpu_arch {
u32 ivor[64];
ulong ivpr;
u32 pir;
+   u32 pvr;
 
u32 shadow_pid;
u32 pid;
@@ -174,6 +236,9 @@ struct kvm_vcpu_arch {
 #endif
 
u32 last_inst;
+#ifdef CONFIG_PPC64
+   ulong fault_dsisr;
+#endif
ulong fault_dear;
ulong fault_esr;
gpa_t paddr_accessed;
@@ -186,7 +251,13 @@ struct kvm_vcpu_arch {
u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
 
struct timer_list dec_timer;
+   u64 dec_jiffies;
unsigned long pending_exceptions;
+
+#ifdef CONFIG_PPC64
+   struct hpte_cache hpte_cache[HPTEG_CACHE_NUM];
+   int hpte_cache_offset;
+#endif
 };
 
 #endif /* __POWERPC_KVM_HOST_H__ */
-- 
1.6.0.2



[PATCH 05/27] Add asm/kvm_book3s.h

2009-10-21 Thread Alexander Graf
This adds the book3s specific header file that contains structs that
are only valid in book3s specific code.
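
The to_book3s() helper in this header relies on the generic vcpu being
embedded inside the book3s struct. Here is a minimal userspace sketch of that
pattern. All sketch_* names are made up, and the macro is a simplified version
of the kernel's container_of() without its type checking.

```c
#include <stddef.h>

/* Simplified userspace rendition of container_of(), no type checking. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the generic vcpu. */
struct sketch_vcpu {
	int id;
};

/* Stand-in for the book3s wrapper that embeds the generic vcpu. */
struct sketch_vcpu_book3s {
	unsigned long sdr1;		/* placed first on purpose */
	struct sketch_vcpu vcpu;
};

/* Recover the wrapper from a pointer to the embedded member. */
static inline struct sketch_vcpu_book3s *
sketch_to_book3s(struct sketch_vcpu *vcpu)
{
	return sketch_container_of(vcpu, struct sketch_vcpu_book3s, vcpu);
}
```

Because the conversion is pure pointer arithmetic, generic KVM code can keep
passing struct kvm_vcpu pointers around while book3s code gets at its own
fields with no extra lookup.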

Signed-off-by: Alexander Graf ag...@suse.de

---

v3 -> v4:

  - use context_id instead of mm_alloc
---
 arch/powerpc/include/asm/kvm_book3s.h |  136 +
 1 files changed, 136 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_book3s.h

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
new file mode 100644
index 000..c601133
--- /dev/null
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -0,0 +1,136 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#ifndef __ASM_KVM_BOOK3S_H__
+#define __ASM_KVM_BOOK3S_H__
+
+#include <linux/types.h>
+#include <linux/kvm_host.h>
+#include <asm/kvm_ppc.h>
+
+struct kvmppc_slb {
+   u64 esid;
+   u64 vsid;
+   u64 orige;
+   u64 origv;
+   bool valid;
+   bool Ks;
+   bool Kp;
+   bool nx;
+   bool large;
+   bool class;
+};
+
+struct kvmppc_sr {
+   u32 raw;
+   u32 vsid;
+   bool Ks;
+   bool Kp;
+   bool nx;
+};
+
+struct kvmppc_bat {
+   u32 bepi;
+   u32 bepi_mask;
+   bool vs;
+   bool vp;
+   u32 brpn;
+   u8 wimg;
+   u8 pp;
+};
+
+struct kvmppc_sid_map {
+   u64 guest_vsid;
+   u64 guest_esid;
+   u64 host_vsid;
+   bool valid;
+};
+
+#define SID_MAP_BITS    9
+#define SID_MAP_NUM     (1 << SID_MAP_BITS)
+#define SID_MAP_MASK    (SID_MAP_NUM - 1)
+
+struct kvmppc_vcpu_book3s {
+   struct kvm_vcpu vcpu;
+   struct kvmppc_sid_map sid_map[SID_MAP_NUM];
+   struct kvmppc_slb slb[64];
+   struct {
+   u64 esid;
+   u64 vsid;
+   } slb_shadow[64];
+   u8 slb_shadow_max;
+   struct kvmppc_sr sr[16];
+   struct kvmppc_bat ibat[8];
+   struct kvmppc_bat dbat[8];
+   u64 hid[6];
+   int slb_nr;
+   u64 sdr1;
+   u64 dsisr;
+   u64 hior;
+   u64 msr_mask;
+   u64 vsid_first;
+   u64 vsid_next;
+   u64 vsid_max;
+   int context_id;
+};
+
+#define CONTEXT_HOST   0
+#define CONTEXT_GUEST  1
+#define CONTEXT_GUEST_END  2
+
+#define VSID_REAL  0xfff0
+#define VSID_REAL_DR   0xffe0
+#define VSID_REAL_IR   0xffd0
+#define VSID_BAT   0xffc0
+#define VSID_PR    0x8000
+
+extern void kvmppc_mmu_pte_flush(struct kvm_vcpu *vcpu, u64 ea, u64 ea_mask);
+extern void kvmppc_mmu_pte_vflush(struct kvm_vcpu *vcpu, u64 vp, u64 vp_mask);
+extern void kvmppc_mmu_pte_pflush(struct kvm_vcpu *vcpu, u64 pa_start, u64 pa_end);
+extern void kvmppc_set_msr(struct kvm_vcpu *vcpu, u64 new_msr);
+extern void kvmppc_mmu_book3s_64_init(struct kvm_vcpu *vcpu);
+extern void kvmppc_mmu_book3s_32_init(struct kvm_vcpu *vcpu);
+extern int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte);
+extern int kvmppc_mmu_map_segment(struct kvm_vcpu *vcpu, ulong eaddr);
+extern void kvmppc_mmu_flush_segments(struct kvm_vcpu *vcpu);
+extern struct kvmppc_pte *kvmppc_mmu_find_pte(struct kvm_vcpu *vcpu, u64 ea, bool data);
+extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr, bool data);
+extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr);
+extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int vec);
+
+extern u32 kvmppc_trampoline_lowmem;
+extern u32 kvmppc_trampoline_enter;
+
+static inline struct kvmppc_vcpu_book3s *to_book3s(struct kvm_vcpu *vcpu)
+{
+   return container_of(vcpu, struct kvmppc_vcpu_book3s, vcpu);
+}
+
+static inline ulong dsisr(void)
+{
+   ulong r;
+   asm ( "mfdsisr %0 " : "=r" (r) );
+   return r;
+}
+
+extern void kvm_return_point(void);
+
+#define INS_DCBZ   0x7c0007ec
+
+#endif /* __ASM_KVM_BOOK3S_H__ */
-- 
1.6.0.2
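
A small aside on the SID map sizing in the header above. SID_MAP_NUM is a power of two, so SID_MAP_MASK can presumably be used to fold any hash value into a valid slot index with a single AND. The hash function itself is not part of this excerpt, so the sketch below (Python, purely for illustration) uses a made-up toy_hash() and only demonstrates the mask arithmetic.

```python
SID_MAP_BITS = 9
SID_MAP_NUM = 1 << SID_MAP_BITS     # 512 slots
SID_MAP_MASK = SID_MAP_NUM - 1      # 0x1ff


def toy_hash(guest_vsid):
    # Hypothetical placeholder: the real hash is not shown in this excerpt.
    return guest_vsid ^ (guest_vsid >> SID_MAP_BITS)


def sid_map_index(guest_vsid):
    # Masking with (power of two - 1) keeps the low SID_MAP_BITS bits,
    # so the result is always a valid index into a SID_MAP_NUM-sized array.
    return toy_hash(guest_vsid) & SID_MAP_MASK
```

Whatever the real hash looks like, the mask guarantees the index stays inside sid_map[SID_MAP_NUM].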

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5

2009-10-21 Thread Alexander Graf


On 21.10.2009, at 17:03, Alexander Graf wrote:


KVM for PowerPC only supports embedded cores at the moment.

While it makes sense to virtualize on small machines, it's even more fun
to do so on big boxes. So I figured we need KVM for PowerPC64 as well.

This patchset implements KVM support for Book3s_64 hosts and guest support
for Book3s_64 and G3/G4.

To really make use of this, you also need a recent version of qemu.

Don't want to apply patches? Get the git tree!

$ git clone git://csgraf.de/kvm
$ git checkout origin/ppc-v4


ppc-v5 of course. Though I'm still trying to get git to actually serve
the correct tree - sigh.


Alex



Re: [KVM PATCH 1/2] KVM: Directly inject interrupts via irqfd

2009-10-21 Thread Gleb Natapov
On Wed, Oct 21, 2009 at 10:34:53AM -0400, Gregory Haskins wrote:
 IRQFD currently uses a deferred workqueue item to execute the injection
 operation.  It was originally designed this way because kvm_set_irq()
 required the caller to hold the irq_lock mutex, and the eventfd callback
 is invoked from within a non-preemptible critical section.
 
 With the advent of lockless injection support in kvm_set_irq, the deferment
 mechanism is no longer technically needed. Since context switching to the
 workqueue is a source of interrupt latency, let's switch to a direct
 method.
 
kvm_set_irq is fully lockless only in the MSI case. IOAPIC/PIC still have mutexes.

 Signed-off-by: Gregory Haskins ghask...@novell.com
 ---
 
  virt/kvm/eventfd.c |   15 +++
  1 files changed, 3 insertions(+), 12 deletions(-)
 
 diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
 index 30f70fd..1a529d4 100644
 --- a/virt/kvm/eventfd.c
 +++ b/virt/kvm/eventfd.c
 @@ -49,16 +49,14 @@ struct _irqfd {
   poll_table                pt;
   wait_queue_head_t        *wqh;
   wait_queue_t              wait;
 - struct work_struct        inject;
   struct work_struct        shutdown;
  };
  
  static struct workqueue_struct *irqfd_cleanup_wq;
  
  static void
 -irqfd_inject(struct work_struct *work)
 +irqfd_inject(struct _irqfd *irqfd)
  {
 - struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
   struct kvm *kvm = irqfd->kvm;
  
   kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1);
 @@ -80,12 +78,6 @@ irqfd_shutdown(struct work_struct *work)
   remove_wait_queue(irqfd->wqh, &irqfd->wait);
  
   /*
 -  * We know no new events will be scheduled at this point, so block
 -  * until all previously outstanding events have completed
 -  */
 - flush_work(&irqfd->inject);
 -
 - /*
* It is now safe to release the object's resources
*/
   eventfd_ctx_put(irqfd->eventfd);
 @@ -126,7 +118,7 @@ irqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key)
  
   if (flags & POLLIN)
           /* An event has been signaled, inject an interrupt */
 -         schedule_work(&irqfd->inject);
 +         irqfd_inject(irqfd);
  
   if (flags & POLLHUP) {
   /* The eventfd is closing, detach from KVM */
 @@ -179,7 +171,6 @@ kvm_irqfd_assign(struct kvm *kvm, int fd, int gsi)
   irqfd->kvm = kvm;
   irqfd->gsi = gsi;
   INIT_LIST_HEAD(&irqfd->list);
 - INIT_WORK(&irqfd->inject, irqfd_inject);
   INIT_WORK(&irqfd->shutdown, irqfd_shutdown);
  
   file = eventfd_fget(fd);
 @@ -214,7 +205,7 @@ kvm_irqfd_assign(struct kvm *kvm, int fd, int gsi)
* before we registered, and trigger it as if we didn't miss it.
*/
   if (events & POLLIN)
 -         schedule_work(&irqfd->inject);
 +         irqfd_inject(irqfd);
  
   /*
* do not drop the file until the irqfd is fully initialized, otherwise
 

--
Gleb.


Re: [KVM PATCH 1/2] KVM: Directly inject interrupts via irqfd

2009-10-21 Thread Gregory Haskins
Gleb Natapov wrote:
 On Wed, Oct 21, 2009 at 10:34:53AM -0400, Gregory Haskins wrote:
 IRQFD currently uses a deferred workqueue item to execute the injection
 operation.  It was originally designed this way because kvm_set_irq()
 required the caller to hold the irq_lock mutex, and the eventfd callback
 is invoked from within a non-preemptible critical section.

 With the advent of lockless injection support in kvm_set_irq, the deferment
 mechanism is no longer technically needed. Since context switching to the
 workqueue is a source of interrupt latency, let's switch to a direct
 method.

 kvm_set_irq is fully lockless only in MSI case. IOAPIC/PIC has mutexes.

Right, but irqfd by design only works with MSI (or MSI like edge
triggers) anyway.  Legacy-type injections follow a different path.

In any case, I didn't change the locking (you did ;).  You recently
patched the irqfd code to remove the irq_lock, but we still had the
deferment mechanism in place to avoid the mutex_lock from within the
POLLIN callback.  Since the mutex_lock is now no longer acquired in this
path, the deferment technique is not needed either.  It's only adding
overhead for no purpose.  So I am simply cleaning that up to improve
interrupt performance.

HTH,
-Greg
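
To make the difference concrete, here is a toy user-space analogy in Python. It is not kernel code and none of these class names exist in KVM. DeferredInjector mimics the old scheme, where the wakeup callback only queues work and a worker thread performs the injection. DirectInjector mimics the new scheme, where the callback injects in the caller's context, which is only acceptable if the injection path never sleeps.

```python
import queue
import threading


class DeferredInjector:
    """Old scheme: signal() only queues work; a worker thread injects."""

    def __init__(self, inject):
        self._inject = inject
        self._work = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def signal(self, gsi):
        self._work.put(gsi)

    def flush(self):
        # Rough analogue of flush_work(): wait for all queued items.
        self._work.join()

    def _run(self):
        while True:
            gsi = self._work.get()
            try:
                self._inject(gsi)
            finally:
                self._work.task_done()


class DirectInjector:
    """New scheme: signal() injects immediately in the caller's context."""

    def __init__(self, inject):
        self._inject = inject

    def signal(self, gsi):
        self._inject(gsi)

    def flush(self):
        pass  # nothing is ever pending


def record_injections(injector_cls, gsis):
    """Signal each gsi and return (gsi, injecting thread id) pairs."""
    seen = []
    injector = injector_cls(
        lambda gsi: seen.append((gsi, threading.get_ident())))
    for gsi in gsis:
        injector.signal(gsi)
    injector.flush()
    return seen
```

In the deferred variant every injection pays for a handoff to another thread, and teardown needs a flush. In the direct variant both disappear, which is the cleanup the patch is after.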






Re: [KVM PATCH 1/2] KVM: Directly inject interrupts via irqfd

2009-10-21 Thread Gleb Natapov
On Wed, Oct 21, 2009 at 11:34:51AM -0400, Gregory Haskins wrote:
 Gleb Natapov wrote:
  On Wed, Oct 21, 2009 at 10:34:53AM -0400, Gregory Haskins wrote:
  IRQFD currently uses a deferred workqueue item to execute the injection
  operation.  It was originally designed this way because kvm_set_irq()
  required the caller to hold the irq_lock mutex, and the eventfd callback
  is invoked from within a non-preemptible critical section.
 
  With the advent of lockless injection support in kvm_set_irq, the deferment
  mechanism is no longer technically needed. Since context switching to the
  workqueue is a source of interrupt latency, let's switch to a direct
  method.
 
  kvm_set_irq is fully lockless only in MSI case. IOAPIC/PIC has mutexes.
 
 Right, but irqfd by design only works with MSI (or MSI like edge
 triggers) anyway.  Legacy-type injections follow a different path.
 
Ah, if this is the case and it will stay that way, then the change looks OK
to me.

 In any case, I didn't change the locking (you did ;).  You recently
 patched the irqfd code to remove the irq_lock, but we still had the
 deferment mechanism in place to avoid the mutex_lock from within the
 POLLIN callback.  Since the mutex_lock is now no longer acquired in this
 path, the deferment technique is not needed either.  It's only adding
 overhead for no purpose.  So I am simply cleaning that up to improve
 interrupt performance.
 
 HTH,
 -Greg
 
 



--
Gleb.


Re: [KVM PATCH 1/2] KVM: Directly inject interrupts via irqfd

2009-10-21 Thread Gregory Haskins
Gleb Natapov wrote:
 On Wed, Oct 21, 2009 at 11:34:51AM -0400, Gregory Haskins wrote:
 Gleb Natapov wrote:
 On Wed, Oct 21, 2009 at 10:34:53AM -0400, Gregory Haskins wrote:
 IRQFD currently uses a deferred workqueue item to execute the injection
 operation.  It was originally designed this way because kvm_set_irq()
 required the caller to hold the irq_lock mutex, and the eventfd callback
 is invoked from within a non-preemptible critical section.

 With the advent of lockless injection support in kvm_set_irq, the deferment
 mechanism is no longer technically needed. Since context switching to the
 workqueue is a source of interrupt latency, let's switch to a direct
 method.

 kvm_set_irq is fully lockless only in MSI case. IOAPIC/PIC has mutexes.
 Right, but irqfd by design only works with MSI (or MSI like edge
 triggers) anyway.  Legacy-type injections follow a different path.

 Ah, if this is the case and it will stay that way, then the change looks OK
 to me.


I believe Avi, Michael, et al. were in agreement with me on that design
choice.  I believe the reason is that there is no good way to do EOI/ACK
feedback within the constraints of an eventfd pipe which would be
required for the legacy pin-type interrupts.  Therefore, we won't even
bother trying.  High-performance subsystems will use irqfd/msi, and
legacy emulation can use the existing injection code (which includes the
necessary feedback for ack/eoi).

To that point, perhaps it should be better documented.  If we need a v2,
I will add a comment.

Kind Regards,
-Greg





Re: kvm88 compile errors with 2.6.31.1

2009-10-21 Thread Jorge Lucángeli Obes
2009/10/21 Michael Tokarev m...@tls.msk.ru:
 Jorge Lucángeli Obes wrote:
 []

 See this thread:

 http://www.mail-archive.com/kvm@vger.kernel.org/msg22775.html

 I believe the patches have already been applied, but there have not
 been any releases since then.

 qemu-kvm-0.11.0 has been out for a long time.

It's true, I was actually referring to the kvm-XX snapshots. Does
qemu-kvm-0.11.0 include Jan's patches? I figured they would be on the
kvm-kmod side.

Cheers,
Jorge


Re: [Autotest] [PATCH] Test 802.1Q vlan of nic

2009-10-21 Thread Dor Laor

On 10/21/2009 03:46 PM, Uri Lublin wrote:

On 10/21/2009 12:37 PM, Amos Kong wrote:

On Tue, Oct 20, 2009 at 09:19:50AM -0400, Michael Goldish wrote:

- Dor Laor dl...@redhat.com wrote:

On 10/15/2009 11:48 AM, Amos Kong wrote:

For the sake of safety maybe we should start both VMs with -snapshot.
Dor, what do you think?  Is it safe to start 2 VMs with the same disk
image when only one of them uses -snapshot?


Setting up the second VM with -snapshot is enough. The image can then
only be read and written by the first VM.



Actually, I agree with Michael. If both VMs use the same disk image, it
is safer to set up both VMs with -snapshot. When the first VM writes to
the disk-image the second VM may be affected.


That's a must. If only one VM uses -snapshot, its base image will still be
written by the other VM and the snapshot will become obsolete.




kvm-88 build error on CentOS 5.3 x86_64, kernel 2.6.18-164.2.1.el5.plus

2009-10-21 Thread Jon .
Hi all,

I am attempting to build kvm-88 on CentOS 5.3 and encounter the error
below while running make. Is the kernel included with CentOS too old?
Has anyone been able to successfully build and run kvm-88 on CentOS
5.3?

Thanks!

  CC [M]  /home/jon/src/kvm-88/kvm/kernel/x86/svm.o
In file included from command line:1:
/home/jon/src/kvm-88/kvm/kernel/x86/external-module-compat.h:12: error: redefinition of typedef ‘phys_addr_t’
include/asm/types.h:50: error: previous declaration of ‘phys_addr_t’ was here
In file included from /home/jon/src/kvm-88/kvm/kernel/x86/external-module-compat.h:16,
                 from command line:1:
/home/jon/src/kvm-88/kvm/kernel/x86/../external-module-compat-comm.h:609: error: static declaration of ‘get_user_pages_fast’ follows non-static declaration
include/linux/mm.h:873: error: previous declaration of ‘get_user_pages_fast’ was here
make[4]: *** [/home/jon/src/kvm-88/kvm/kernel/x86/svm.o] Error 1
make[3]: *** [/home/jon/src/kvm-88/kvm/kernel/x86] Error 2
make[2]: *** [_module_/home/jon/src/kvm-88/kvm/kernel] Error 2
make[1]: *** [all] Error 2
make: *** [kvm-kmod] Error 2


Re: vhost-net patches

2009-10-21 Thread Michael S. Tsirkin
On Wed, Oct 21, 2009 at 12:59:50PM -0700, Shirley Ma wrote:
 Hello Michael,
 
 I have set up guest kernel 2.6.32-rc5 with MSI configured. Here are the errors
 I have got:
 
 1. First, qemu complained extboot.bin not found, I copied the file from
 optionrom/ dir to pc-bios/ dir, this problem is gone.
 
 2. Second, when guest boot up, it has lots of errors as below. Without vhost
 support, I still saw same errors but the guest interface can communicate with
 host, but with vhost, it doesn't work.

There was a recent bugfix in qemu-kvm that I pushed.
Could you please verify that you have cec75e39151e49cc90c849eab5d0d729667c9e68 ?


 I am posting the errors from /var/log/messages here:
 
 virtio-pci :00:03.0: can't find IRQ for PCI INT A; probably buggy MP table
 virtio-pci :00:04.0: can't find IRQ for PCI INT A; probably buggy MP table
 virtio-pci :00:04.0: irq 24 for MSI/MSI-X
 virtio-pci :00:04.0: irq 25 for MSI/MSI-X
 IRQ handler type mismatch for IRQ 1
 current handler: i8042
 Pid: 335, comm: modprobe Not tainted 2.6.32-rc5 #3
 Call Trace:
 __setup_irq+0x24c/0x2ac
 request_threaded_irq+0x113/0x179
 ? vring_interrupt+0x0/0x2f
 vp_try_to_find_vqs+0x4a3/0x4e0 [virtio_pci]
 ? blk_done+0x0/0xa7 [virtio_blk]
 vp_find_vqs+0x1b/0x62 [virtio_pci]
 virtblk_probe+0xbd/0x3d0 [virtio_blk]
 ? sysfs_do_create_link+0xbb/0xfd
 ? blk_done+0x0/0xa7 [virtio_blk]
 ? add_status+0x1f/0x24
 virtio_dev_probe+0x91/0xb0
 driver_probe_device+0x79/0x105
 __driver_attach+0x43/0x5f
 bus_for_each_dev+0x3d/0x67
 driver_attach+0x14/0x16
 ? __driver_attach+0x0/0x5f
 bus_add_driver+0xa2/0x1c9
 driver_register+0x8b/0xeb
 ? init+0x0/0x24 [virtio_blk]
 register_virtio_driver+0x1f/0x22
 init+0x22/0x24 [virtio_blk]
 do_one_initcall+0x4c/0x13a
 sys_init_module+0xa7/0x1db
 syscall_call+0x7/0xb
 virtio-pci :00:04.0: irq 24 for MSI/MSI-X
 virtio-pci :00:04.0: irq 25 for MSI/MSI-X

This was recently reported without vhost; I have not been able to
reproduce it here yet.
And you say you do not see the above without vhost?

 vda: vda1 vda2
 EXT3-fs: INFO: recovery required on readonly filesystem.
 EXT3-fs: write access will be enabled during recovery.
 kjournald starting. Commit interval 5 seconds
 EXT3-fs: recovery complete.
 EXT3-fs: mounted filesystem with writeback data mode.
 udevd version 127 started
 virtio-pci :00:03.0: irq 26 for MSI/MSI-X
 virtio-pci :00:03.0: irq 27 for MSI/MSI-X
 virtio-pci :00:03.0: irq 28 for MSI/MSI-X
 IRQ handler type mismatch for IRQ 1
 current handler: i8042
 Pid: 440, comm: modprobe Not tainted 2.6.32-rc5 #3
 Call Trace:
 __setup_irq+0x24c/0x2ac
 request_threaded_irq+0x113/0x179
 ? vring_interrupt+0x0/0x2f
 vp_try_to_find_vqs+0x4a3/0x4e0 [virtio_pci]
 ? skb_recv_done+0x0/0x36 [virtio_net]
 vp_find_vqs+0x1b/0x62 [virtio_pci]
 virtnet_probe+0x265/0x347 [virtio_net]
 ? skb_recv_done+0x0/0x36 [virtio_net]
 ? skb_xmit_done+0x0/0x1e [virtio_net]
 ? add_status+0x1f/0x24
 virtio_dev_probe+0x91/0xb0
 driver_probe_device+0x79/0x105
 __driver_attach+0x43/0x5f
 bus_for_each_dev+0x3d/0x67
 driver_attach+0x14/0x16
 ? __driver_attach+0x0/0x5f
 bus_add_driver+0xa2/0x1c9
 driver_register+0x8b/0xeb
 ? init+0x0/0xf [virtio_net]
 register_virtio_driver+0x1f/0x22
 init+0xd/0xf [virtio_net]
 do_one_initcall+0x4c/0x13a
 sys_init_module+0xa7/0x1db
 syscall_call+0x7/0xb
 virtio-pci :00:03.0: irq 26 for MSI/MSI-X
 virtio-pci :00:03.0: irq 27 for MSI/MSI-X
 
 3. The guest interface is up, and cat /proc/interrupts outputs:
 
 24: 0 PCI-MSI-edge virtio1-config
 25: 2571 PCI-MSI-edge virtio1-virtqueues
 26: 0 PCI-MSI-edge virtio0-config
 27: 0 PCI-MSI-edge virtio0-virtqueues
 
 Thanks
 Shirley


RE: Q: Stopped VM still using host cpu CPU ?

2009-10-21 Thread Daniel Schwager
Hi Avi,

so, setup with
opcontrol --deinit; modprobe oprofile timer=1; opcontrol --start
   
 
   Use 'opreport -l'.  Make sure your qemu isn't stripped.

All VM's are in paused state:

top - 22:08:15 up 2 days, 12:18,  8 users,  load average: 0.12, 0.19, 0.14
Tasks: 185 total,   1 running, 182 sleeping,   2 stopped,   0 zombie
Cpu(s):  0.5%us,  1.7%sy,  0.0%ni, 97.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8196468k total,  4166468k used,  403k free,    76044k buffers
Swap: 39780320k total,    70232k used, 39710088k free,   475876k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9587 root  20   0 1145m 1.0g 1944 S  6.6 13.1   0:28.78 qemu-system-x86
 9525 root  20   0 1143m 1.0g 1900 S  6.3 13.1   0:27.26 qemu-system-x86
 9305 root  20   0 1143m 1.0g 1900 S  5.0 13.1   0:22.86 qemu-system-x86


 opreport -l  --symbols | less
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        image name               app name                 symbol name
418814   98.5250  no-vmlinux               no-vmlinux               (no symbols)
1228      0.2889  qemu-system-x86_64       qemu-system-x86_64       main_loop_wait
888       0.2089  libpthread-2.8.so        libpthread-2.8.so        __read_nocancel
...

 opreport -l /opt/kvm-86-vnc-patch.oprofile/usr/bin/qemu-system-x86_64 --symbols | less

CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        image name               symbol name
1036     52.2441  qemu-system-x86_64       main_loop_wait
155       7.8164  qemu-system-x86_64       dynticks_rearm_timer
85        4.2864  [vdso] (tgid:9525 range:0x7fff565fe000-0x7fff565ff000)  (no symbols)
77        3.8830  qemu-system-x86_64       qemu_get_clock
63        3.1770  qemu-system-x86_64       .plt
59        2.9753  qemu-system-x86_64       host_alarm_handler
57        2.8744  qemu-system-x86_64       qemu_shutdown_requested
51        2.5719  [vdso] (tgid:9305 range:0x7fff3d9ff000-0x7fff3da0)  (no symbols)
49        2.4710  qemu-system-x86_64       sigfd_handler
47        2.3701  qemu-system-x86_64       tap_can_send
41        2.0676  [vdso] (tgid:9587 range:0x7abfe000-0x7abff000)  (no symbols)
38        1.9163  qemu-system-x86_64       kvm_mutex_lock
36        1.8154  qemu-system-x86_64       kvm_main_loop
31        1.5633  qemu-system-x86_64       get_clock
29        1.4624  qemu-system-x86_64       qemu_bh_poll
24        1.2103  qemu-system-x86_64       io_thread_wakeup
23        1.1599  qemu-system-x86_64       qemu_event_read
19        0.9581  qemu-system-x86_64       qemu_powerdown_requested
18        0.9077  qemu-system-x86_64       kvm_mutex_unlock
12        0.6051  qemu-system-x86_64       qemu_kvm_notify_work
12        0.6051  qemu-system-x86_64       qemu_reset_requested
11        0.5547  qemu-system-x86_64       e1000_can_receive
6         0.3026  qemu-system-x86_64       slirp_is_inited
3         0.1513  qemu-system-x86_64       qemu_notify_event
1         0.0504  qemu-system-x86_64       qemu_set_fd_handler2


I do not understand why only ~1036 kvm-samples produce a load of (in
sum) 18% ...
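
As a rough cross-check of these numbers, the small Python calculation below uses only values copied from the top and opreport output above. It suggests the qemu binary accounts for well under 1% of all samples, while 98.5% land in no-vmlinux. My reading (an assumption, not something the profile proves) is that no-vmlinux covers both idle time and kernel work done on behalf of the qemu processes, so the user-space listing alone cannot explain the load.

```python
# %CPU of the three paused qemu processes, copied from the top output.
top_cpu = [6.6, 6.3, 5.0]
total_cpu = sum(top_cpu)

# (samples, percent) pairs copied from the two opreport listings.
kernel_samples, kernel_pct = 418814, 98.5250          # no-vmlinux row
main_loop_samples, main_loop_pct = 1036, 52.2441      # main_loop_wait row

# Percent = samples / total, so total = samples / (percent / 100).
all_samples = kernel_samples / (kernel_pct / 100.0)
qemu_samples = main_loop_samples / (main_loop_pct / 100.0)

# Share of all samples that fall into the qemu binary listing.
qemu_share = 100.0 * qemu_samples / all_samples

print("combined %%CPU from top:      %.1f" % total_cpu)
print("samples across all images:   %d" % round(all_samples))
print("samples in the qemu listing: %d" % round(qemu_samples))
print("qemu share of all samples:   %.2f%%" % qemu_share)
```

Profiling the kernel side (with a vmlinux available) would be needed to see where the remaining time goes.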

I'm not able to convert vmlinuz to vmlinux (for profiling the kernel
...) on my old FC9 - sorry.
Because I cannot update my old FC9, I will move to CentOS later and
hopefully get the kernel profiling running.

Maybe this oprofile information will help you to track down the problem.
If not, I will try to come back soon with kernel profiling information
from CentOS.

regards
Danny




Re: kvm88 compile errors with 2.6.31.1

2009-10-21 Thread Michael Tokarev

Jorge Lucángeli Obes wrote:

2009/10/21 Michael Tokarev m...@tls.msk.ru:

Jorge Lucángeli Obes wrote:
[]

See this thread:

http://www.mail-archive.com/kvm@vger.kernel.org/msg22775.html

I believe the patches have already been applied, but there have not
been any releases since then.

qemu-kvm-0.11.0 has been out for a long time.


It's true, I was actually referring to the kvm-XX snapshots. Does
qemu-kvm-0.11.0 include Jan's patches? I figured they would be on the
kvm-kmod side.


2.6.31 kernel includes more recent bits than kvm-kmod-88.. ;)

/mjt


[PATCH 1/3] KVM test: Add new utility functions to kvm_utils

2009-10-21 Thread Lucas Meneghel Rodrigues
Some distributors ship CD and DVD files with SHA1 hash sums instead
of MD5 hash sums, so let's extend the kvm_utils functions to
evaluate and compare SHA1 hashes:

* sha1sum_file(): Calculate SHA1 sum for file
* unmap_url_cache(): Reimplementation of a function present on
autotest utils that downloads a file and caches it. The reason
I'm keeping it is that I want more testing before I move all
needed function definitions to the autotest API
* get_hash_from_file(): Extract hash string from a file containing
hashes
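
For readers on a current Python, a minimal sketch of the same chunked hashing idea follows. Note that it swaps in the standard hashlib module in place of the old md5 and sha modules that the patch imports, and hash_file() is a hypothetical name, not a function from kvm_utils. One behavioural difference: here size=0 hashes nothing, whereas the patch treats a false size as "whole file".

```python
import hashlib


def hash_file(filename, method="sha1", size=None, chunksize=4096):
    """
    Return the hex digest of the first `size` bytes of filename,
    or of the whole file when size is None.
    """
    digest = hashlib.new(method)
    remaining = size
    with open(filename, "rb") as f:
        while remaining is None or remaining > 0:
            # Never read past the requested size.
            if remaining is None:
                want = chunksize
            else:
                want = min(chunksize, remaining)
            data = f.read(want)
            if not data:
                break  # end of file reached before `size` bytes
            digest.update(data)
            if remaining is not None:
                remaining -= len(data)
    return digest.hexdigest()
```

With this shape, MD5 and SHA1 share one code path, e.g. hash_file(path, "md5", 1048576) for a first-megabyte check.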

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/kvm/kvm_utils.py |  106 -
 1 files changed, 104 insertions(+), 2 deletions(-)

diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py
index 53b664a..f1a6b4b 100644
--- a/client/tests/kvm/kvm_utils.py
+++ b/client/tests/kvm/kvm_utils.py
@@ -4,8 +4,8 @@ KVM test utility functions.
 @copyright: 2008-2009 Red Hat Inc.
 
 
-import md5, thread, subprocess, time, string, random, socket, os, signal, pty
-import select, re, logging, commands, cPickle
+import md5, sha, thread, subprocess, time, string, random, socket, os, signal
+import select, re, logging, commands, cPickle, pty
 from autotest_lib.client.bin import utils
 from autotest_lib.client.common_lib import error
 import kvm_subprocess
@@ -788,3 +788,105 @@ def md5sum_file(filename, size=None):
         size -= len(data)
     f.close()
     return o.hexdigest()
+
+
+def sha1sum_file(filename, size=None):
+    """
+    Calculate the sha1sum of filename.
+    If size is not None, limit to first size bytes.
+    Throw exception if something is wrong with filename.
+    Can be also implemented with bash one-liner (assuming size%1024==0):
+    dd if=filename bs=1024 count=size/1024 | sha1sum -
+
+    @param filename: Path of the file that will have its sha1sum calculated.
+    @param returns: sha1sum of the file.
+    """
+    chunksize = 4096
+    fsize = os.path.getsize(filename)
+    if not size or size>fsize:
+        size = fsize
+    f = open(filename, 'rb')
+    o = sha.new()
+    while size > 0:
+        if chunksize > size:
+            chunksize = size
+        data = f.read(chunksize)
+        if len(data) == 0:
+            logging.debug("Nothing left to read but size=%d" % size)
+            break
+        o.update(data)
+        size -= len(data)
+    f.close()
+    return o.hexdigest()
+
+
+def unmap_url_cache(cachedir, url, expected_hash, method="md5"):
+    """
+    Downloads a file from a URL to a cache directory. If the file is already
+    at the expected position and has the expected hash, let's not download it
+    again.
+
+    @param cachedir: Directory that might hold a copy of the file we want to
+            download.
+    @param url: URL for the file we want to download.
+    @param expected_hash: Hash string that we expect the file downloaded to
+            have.
+    @param method: Method used to calculate the hash string (md5, sha1).
+    """
+    # Let's convert cachedir to a canonical path, if it's not already
+    cachedir = os.path.realpath(cachedir)
+    if not os.path.isdir(cachedir):
+        try:
+            os.makedirs(cachedir)
+        except:
+            raise ValueError('Could not create cache directory %s' % cachedir)
+    file_from_url = os.path.basename(url)
+    file_local_path = os.path.join(cachedir, file_from_url)
+
+    file_hash = None
+    failure_counter = 0
+    while not file_hash == expected_hash:
+        if os.path.isfile(file_local_path):
+            if method == "md5":
+                file_hash = md5sum_file(file_local_path)
+            elif method == "sha1":
+                file_hash = sha1sum_file(file_local_path)
+
+            if file_hash == expected_hash:
+                # File is already at the expected position and ready to go
+                src = file_from_url
+            else:
+                # Let's download the package again, it's corrupted...
+                logging.error("Seems that file %s is corrupted, trying to "
+                              "download it again" % file_from_url)
+                src = url
+                failure_counter += 1
+        else:
+            # File is not there, let's download it
+            src = url
+        if failure_counter > 1:
+            raise EnvironmentError("Consistently failed to download the "
+                                   "package %s. Aborting further download "
+                                   "attempts. This might mean either the "
+                                   "network connection has problems or the "
+                                   "expected hash string that was determined "
+                                   "for this file is wrong" % file_from_url)
+        file_path = utils.unmap_url(cachedir, src, cachedir)
+
+    return file_path
+
+
+def get_hash_from_file(sha_path, dvd_basename):
+    """
+    Get the hash for a given DVD image from a hash file
+    (Hash files are usually named MD5SUM or 

[PATCH 2/3] KVM test: Daily DVD test control file

2009-10-21 Thread Lucas Meneghel Rodrigues
This control file is a proof of concept of a control file
that downloads an ISO DVD image produced on a daily basis
to the isos directory, gets the hash sums for the DVD
images and verifies the integrity of the download.

It makes use of the utility functions introduced in
previous patches.
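
The "gets the hash sums" step can be sketched roughly as below. This is not the get_hash_from_file() from patch 1/3, whose body is cut off in this archive. find_hash() is a hypothetical helper, and it assumes the hash file uses the usual md5sum/sha1sum layout of one "digest  filename" pair per line, with an optional "*" before the filename for binary mode.

```python
def find_hash(hash_file_text, basename):
    """
    Return the digest listed for basename in md5sum/sha1sum style text,
    or None if the file is not listed.
    """
    for line in hash_file_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank line or something that is not "digest name"
        digest, name = parts
        # Binary-mode entries are written as "*filename".
        name = name.strip().lstrip("*")
        if name == basename or name.endswith("/" + basename):
            return digest.lower()
    return None
```

The returned string can then be passed as the expected hash to a download-and-verify helper such as unmap_url_cache().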

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/kvm/control.daily_dvd |  249 
 1 files changed, 249 insertions(+), 0 deletions(-)
 create mode 100644 client/tests/kvm/control.daily_dvd

diff --git a/client/tests/kvm/control.daily_dvd b/client/tests/kvm/control.daily_dvd
new file mode 100644
index 000..07dade6
--- /dev/null
+++ b/client/tests/kvm/control.daily_dvd
@@ -0,0 +1,249 @@
+AUTHOR = """
+u...@redhat.com (Uri Lublin)
+dru...@redhat.com (Dror Russo)
+mgold...@redhat.com (Michael Goldish)
+dh...@redhat.com (David Huff)
+aerom...@redhat.com (Alexey Eromenko)
+mbu...@redhat.com (Mike Burns)
+"""
+TIME = 'SHORT'
+NAME = 'KVM test'
+TEST_TYPE = 'client'
+TEST_CLASS = 'Virtualization'
+TEST_CATEGORY = 'Functional'
+
+DOC = """
+Executes the KVM test framework on a given host. This module is separated in
+minor functions, that execute different tests for doing Quality Assurance on
+KVM (both kernelspace and userspace) code.
+
+For online docs, please refer to http://www.linux-kvm.org/page/KVM-Autotest
+
+This control file tests daily DVD/CDs for a given distro, adding MD5/SHA1
+checksums and executing unattended installs of that distro.
+"""
+
+
+import sys, os
+
+#-
+# set English environment (command output might be localized, need to be safe)
+#-
+os.environ['LANG'] = 'en_US.UTF-8'
+
+#-
+# Enable modules import from current directory (tests/kvm)
+#-
+pwd = os.path.join(os.environ['AUTODIR'],'tests/kvm')
+sys.path.append(pwd)
+
+# 
+# create required symlinks
+# 
+# When dispatching tests from autotest-server the links we need do not exist on
+# the host (the client). The following lines create those symlinks. Change
+# 'rootdir' here and/or mount appropriate directories in it.
+#
+# When dispatching tests on local host (client mode) one can either setup kvm
+# links, or same as server mode use rootdir and set all appropriate links and
+# mount-points there. For example, guest installation tests need to know where
+# to find the iso-files.
+#
+# We create the links only if not already exist, so if one already set up the
+# links for client/local run we do not touch the links.
+rootdir='/tmp/kvm_autotest_root'
+iso=os.path.join(rootdir, 'iso')
+images=os.path.join(rootdir, 'images')
+qemu=os.path.join(rootdir, 'qemu')
+qemu_img=os.path.join(rootdir, 'qemu-img')
+
+
+def link_if_not_exist(ldir, target, link_name):
+    t = target
+    l = os.path.join(ldir, link_name)
+    if not os.path.exists(l):
+        os.system('ln -s %s %s' % (t, l))
+
+# Create links only if not already exist
+link_if_not_exist(pwd, '../../', 'autotest')
+link_if_not_exist(pwd, iso, 'isos')
+link_if_not_exist(pwd, images, 'images')
+link_if_not_exist(pwd, qemu, 'qemu')
+link_if_not_exist(pwd, qemu_img, 'qemu-img')
+
+# 
+# Params that will be passed to the KVM install/build test
+# 
+params = {
+    "name": "build",
+    "shortname": "build",
+    "type": "build",
+    "mode": "release",
+    #"mode": "snapshot",
+    #"mode": "localtar",
+    #"mode": "localsrc",
+    #"mode": "git",
+    #"mode": "noinstall",
+    #"mode": "koji",
+
+    ## Are we going to load modules built by this test?
+    ## Defaults to 'yes', so if you are going to provide only userspace code to
+    ## be built by this test, please set load_modules to 'no', and make sure
+    ## the kvm and kvm-[vendor] module is already loaded by the time you start
+    ## it.
+    "load_modules": "no",
+
+    ## Install from a kvm release ("mode": "release"). You can optionally
+    ## specify a release tag. If you omit it, the test will get the latest
+    ## release tag available.
+    #"release_tag": '84',
+    "release_dir": 'http://downloads.sourceforge.net/project/kvm/',
+    # This is the place that contains the sourceforge project list of files
+    "release_listing": 'http://sourceforge.net/projects/kvm/files/',
+
+    ## Install from a kvm snapshot location ("mode": "snapshot"). You can
+    ## optionally specify a snapshot date. If you omit it, the test will get
+    ## yesterday's snapshot.
+    #"snapshot_date": '20090712'
+    #"snapshot_dir": 'http://foo.org/kvm-snapshots/',
+
+    ## Install from a tarball ("mode": "localtar")
+    #"tarball": "/tmp/kvm-84.tar.gz",
+
+    ## Install from a local source code dir ("mode": "localsrc")
+    #"srcdir": "/path/to/source-dir"
+
+

[PATCH 3/3] KVM test: Extend VM.create() method to support SHA1 check

2009-10-21 Thread Lucas Meneghel Rodrigues
Also, change variable names and messages to be more
generic.
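
The selection logic this patch extends (md5sum_1m first, then md5sum, then sha1sum) could also be written as a small table, which makes it harder to pair a parameter with the wrong hash function. The sketch below is only an illustration: HASH_CHECKS, pick_hash_check() and iso_matches() are hypothetical names, it uses hashlib in place of kvm_utils, and it reads the file in one go for brevity, which a real multi-gigabyte ISO check should not do.

```python
import hashlib

# (parameter name, hashlib algorithm, byte limit or None for the whole file)
HASH_CHECKS = [
    ("md5sum_1m", "md5", 1048576),
    ("md5sum", "md5", None),
    ("sha1sum", "sha1", None),
]


def pick_hash_check(params):
    """Return (key, algorithm, limit, expected) for the first hash set."""
    for key, algorithm, limit in HASH_CHECKS:
        expected = params.get(key)
        if expected:
            return key, algorithm, limit, expected
    return None


def iso_matches(iso_path, params):
    """True if no hash is configured or the configured hash matches."""
    check = pick_hash_check(params)
    if check is None:
        return True
    _, algorithm, limit, expected = check
    with open(iso_path, "rb") as f:
        data = f.read() if limit is None else f.read(limit)
    return hashlib.new(algorithm, data).hexdigest() == expected
```

Adding another hash type then only needs one more row in HASH_CHECKS.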

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/kvm/kvm_vm.py |   21 ++---
 1 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
index a8d96ca..3d604c4 100755
--- a/client/tests/kvm/kvm_vm.py
+++ b/client/tests/kvm/kvm_vm.py
@@ -339,20 +339,27 @@ class VM:
             if params.get("md5sum_1m"):
                 logging.debug("Comparing expected MD5 sum with MD5 sum of "
                               "first MB of ISO file...")
-                actual_md5sum = kvm_utils.md5sum_file(iso, 1048576)
-                expected_md5sum = params.get("md5sum_1m")
+                actual_hash = kvm_utils.md5sum_file(iso, 1048576)
+                expected_hash = params.get("md5sum_1m")
                 compare = True
             elif params.get("md5sum"):
                 logging.debug("Comparing expected MD5 sum with MD5 sum of ISO "
                               "file...")
-                actual_md5sum = kvm_utils.md5sum_file(iso)
-                expected_md5sum = params.get("md5sum")
+                actual_hash = kvm_utils.md5sum_file(iso)
+                expected_hash = params.get("md5sum")
+                compare = True
+            elif params.get("sha1sum"):
+                logging.debug("Comparing expected SHA1 sum with SHA1 sum of "
+                              "ISO file...")
+                actual_hash = kvm_utils.sha1sum_file(iso)
+                expected_hash = params.get("sha1sum")
                 compare = True
             if compare:
-                if actual_md5sum == expected_md5sum:
-                    logging.debug("MD5 sums match")
+                if actual_hash == expected_hash:
+                    logging.debug("Hashes match")
                 else:
-                    logging.error("Actual MD5 sum differs from expected one")
+                    logging.error("Actual hash %s differs from expected "
+                                  "one %s", actual_hash, expected_hash)
                     return False
 
         # Make sure the following code is not executed by more than one thread
-- 
1.6.2.5



[PATCH 1/2] KVM test: Adding Fedora nightly host (example)

2009-10-21 Thread Lucas Meneghel Rodrigues
This is an example Fedora nightly host definition. It lacks
CD hashes because they will be verified as the DVD isos are
downloaded.

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/kvm/kvm_tests.cfg.sample |   17 +
 1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample
index 296449d..964f8b7 100644
--- a/client/tests/kvm/kvm_tests.cfg.sample
+++ b/client/tests/kvm/kvm_tests.cfg.sample
@@ -340,6 +340,23 @@ variants:
 kernel_args = ks=floppy nicdelay=60
 unattended_file = unattended/Fedora-11.ks
 
+- nightly.32:
+no setup
+image_name = fedora-nightly-32
+unattended_install:
+tftp = images/tftpboot
+extra_params = -bootp /pxelinux.0 -boot n
+kernel_args = ks=floppy nicdelay=60
+unattended_file = unattended/Fedora-nightly.ks
+- nightly.64:
+no setup
+image_name = fedora-nightly-64
+unattended_install:
+tftp = images/tftpboot
+extra_params = -bootp /pxelinux.0 -boot n
+kernel_args = ks=floppy nicdelay=60
+unattended_file = unattended/Fedora-nightly.ks
+
 - DSL-4.2.5:
 no setup dbench bonnie linux_s3
 image_name = dsl-4.2.5
-- 
1.6.2.5



[PATCH 2/2] KVM test: Adding new Fedora-nightly.ks unattended file

2009-10-21 Thread Lucas Meneghel Rodrigues
Also make a small change to the Fedora-11.ks file: set up
fully automated disk partitioning.

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/kvm/unattended/Fedora-11.ks  |7 ++--
 client/tests/kvm/unattended/Fedora-nightly.ks |   39 +
 2 files changed, 42 insertions(+), 4 deletions(-)
 create mode 100644 client/tests/kvm/unattended/Fedora-nightly.ks

diff --git a/client/tests/kvm/unattended/Fedora-11.ks b/client/tests/kvm/unattended/Fedora-11.ks
index cfd30ea..23f1d8a 100644
--- a/client/tests/kvm/unattended/Fedora-11.ks
+++ b/client/tests/kvm/unattended/Fedora-11.ks
@@ -12,11 +12,9 @@ timezone --utc America/New_York
 firstboot --disable
 bootloader --location=mbr
 zerombr
-
 clearpart --all --initlabel
-part /boot  --fstype=ext3 --size=100
-part /  --fstype=ext3 --size=9000
-part swap   --fstype=swap --size=512
+autopart
+reboot
 
 %packages
 @admin-tools
@@ -39,3 +37,4 @@ client.connect(addr)
 client.sendto('done', addr)
 client.close()
 %end
+
diff --git a/client/tests/kvm/unattended/Fedora-nightly.ks b/client/tests/kvm/unattended/Fedora-nightly.ks
new file mode 100644
index 0000000..34191ff
--- /dev/null
+++ b/client/tests/kvm/unattended/Fedora-nightly.ks
@@ -0,0 +1,39 @@
+install
+cdrom
+text
+reboot
+lang en_US
+keyboard us
+key --skip
+network --bootproto dhcp
+rootpw 123456
+firewall --enabled --ssh
+selinux --enforcing
+timezone --utc America/New_York
+firstboot --disable
+bootloader --location=mbr
+zerombr
+clearpart --all --initlabel
+autopart
+reboot
+
+%packages
+@ admin-tools
+@ base
+@ core
+@ development-libs
+@ development-tools
+
+%post --interpreter /usr/bin/python
+import socket, os
+os.system('chkconfig sshd on')
+os.system('iptables -F')
+os.system('echo 0 > /selinux/enforce')
+port = 12323
+buf = 1024
+addr = ('10.0.2.2', port)
+client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+client.connect(addr)
+client.sendto('done', addr)
+client.close()
+
-- 
1.6.2.5
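
The %post script above connects to 10.0.2.2 (the host address as seen from a guest on QEMU user-mode networking) on port 12323 and sends 'done'. Something on the host has to be listening for that. The following is a minimal sketch of such a listener, not the code the test suite actually uses; make_listener and wait_for_install_done are hypothetical names.

```python
import socket


def make_listener(port=12323, host=""):
    """Create a TCP listening socket on the port the kickstart script uses."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(1)
    return sock


def wait_for_install_done(listen_sock, timeout=None):
    """Wait for one guest connection; return True if it sent 'done'."""
    listen_sock.settimeout(timeout)
    try:
        conn, _addr = listen_sock.accept()
    except socket.timeout:
        return False
    with conn:
        conn.settimeout(timeout)
        try:
            data = conn.recv(1024)
        except socket.timeout:
            return False
    return data.strip() == b"done"
```

A caller would create the listener before booting the guest, then call wait_for_install_done() with a timeout long enough for the whole installation.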



[ kvm-Bugs-2883570 ] KVM needs easier network bridging support

2009-10-21 Thread SourceForge.net
Bugs item #2883570, was opened at 2009-10-21 15:16
Message generated for change (Tracker Item Submitted) made by dmitryb77
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2883570&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: dmitryb77 (dmitryb77)
Assigned to: Nobody/Anonymous (nobody)
Summary: KVM needs easier network bridging support

Initial Comment:
Setup:
Triple core Phenom II, 64-bit Ubuntu Karmic Koala, Ralink wireless NIC.

Repro:
1. Attempt to create a bridged connection using a wireless NIC.
2. Spend hours reading half-baked howtos and fail.
3. Throw in the towel and use VirtualBox where bridging just works after you 
select it through a combo box.

Result:
Wireless bridging is darn near impossible with KVM (yet VirtualBox bridges just 
fine on the same machine). Wired bridging requires manual (and hence error 
prone) editing of configuration files. This is compounded by the fact that if 
you screw up a network config on the remote host, you lose access to it and 
have to go there (or use an expensive enterprise remote console thingamabob) 
to correct your error.

Expected result:
From my (fairly limited) understanding, VirtualBox implements bridging in 
its own code using a driver. KVM should consider doing the same thing. 
Currently, bridging is a major pain in the ass, and NAT is useless for most 
intents and purposes, since you can't connect to a NATted machine from the 
outside.



[PATCH] Enable 32bit dirty log pointers on 64bit host

2009-10-21 Thread Alexander Graf
From: Arnd Bergmann a...@arndb.de

With big endian userspace, we can't quite figure out if a pointer
is 32 bit (shifted << 32) or 64 bit when we read a 64 bit pointer.

This is what happens with dirty logging. To get the pointer interpreted
correctly, we thus need Arnd's patch to implement a compat layer for
the ioctl:

A better way to do this is to add a separate compat_ioctl() method that
converts this for you.

From: Arnd Bergmann a...@arndb.de
Signed-off-by: Arnd Bergmann a...@arndb.de
Acked-by: Alexander Graf ag...@suse.de

---

Changes from Arnd's example version:

  - s/log.log/log/ (Avi)
  - use sizeof(compat_log) (Avi)
  - compile fixes
---
 virt/kvm/kvm_main.c |   49 -
 1 files changed, 48 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index cac69c4..54a272f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -43,6 +43,7 @@
 #include <linux/swap.h>
 #include <linux/bitops.h>
 #include <linux/spinlock.h>
+#include <linux/compat.h>
 
 #include <asm/processor.h>
 #include <asm/io.h>
@@ -1542,6 +1543,52 @@ out:
return r;
 }
 
+#ifdef CONFIG_COMPAT
+struct compat_kvm_dirty_log {
+	__u32 slot;
+	__u32 padding1;
+	union {
+		compat_uptr_t dirty_bitmap; /* one bit per page */
+		__u64 padding2;
+	};
+};
+
+static long kvm_vm_compat_ioctl(struct file *filp,
+			   unsigned int ioctl, unsigned long arg)
+{
+	struct kvm *kvm = filp->private_data;
+	int r;
+
+	if (kvm->mm != current->mm)
+		return -EIO;
+	switch (ioctl) {
+	case KVM_GET_DIRTY_LOG: {
+		struct compat_kvm_dirty_log compat_log;
+		struct kvm_dirty_log log;
+
+		r = -EFAULT;
+		if (copy_from_user(&compat_log, (void __user *)arg,
+				   sizeof(compat_log)))
+			goto out;
+		log.slot = compat_log.slot;
+		log.padding1 = compat_log.padding1;
+		log.padding2 = compat_log.padding2;
+		log.dirty_bitmap = compat_ptr(compat_log.dirty_bitmap);
+
+		r = kvm_vm_ioctl_get_dirty_log(kvm, &log);
+		if (r)
+			goto out;
+		break;
+	}
+	default:
+		r = kvm_vm_ioctl(filp, ioctl, arg);
+	}
+
+out:
+	return r;
+}
+#endif
+
 static int kvm_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
struct page *page[1];
@@ -1576,7 +1623,7 @@ static int kvm_vm_mmap(struct file *file, struct vm_area_struct *vma)
 static struct file_operations kvm_vm_fops = {
.release= kvm_vm_release,
.unlocked_ioctl = kvm_vm_ioctl,
-   .compat_ioctl   = kvm_vm_ioctl,
+   .compat_ioctl   = kvm_vm_compat_ioctl,
.mmap   = kvm_vm_mmap,
 };
 
-- 
1.6.0.2
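
The ambiguity described in the commit message can be reproduced with plain byte packing. The sketch below is only an illustration: it lays out the 16-byte structure the way a 32-bit process would (pointer in the first four bytes of the union, the rest zero) and then reads the union both ways. The function names are made up for this example, and the byte order is passed in explicitly instead of being taken from the host.

```python
import struct


def _prefix(byteorder):
    # Standard sizes, no implicit alignment padding.
    return ">" if byteorder == "big" else "<"


def pack_compat_dirty_log(slot, bitmap_ptr, byteorder):
    """Layout written by 32-bit userspace: slot, padding1, 32-bit pointer,
    then four untouched (zero) bytes completing the 8-byte union."""
    return struct.pack(_prefix(byteorder) + "III4x", slot, 0, bitmap_ptr)


def read_union_as_u64(raw, byteorder):
    """What a 64-bit kernel gets when it reads the union as one 64-bit field."""
    _slot, _pad, value = struct.unpack(_prefix(byteorder) + "IIQ", raw)
    return value


def read_union_as_compat_ptr(raw, byteorder):
    """What a compat handler gets: only the leading 32-bit pointer field."""
    _slot, _pad, ptr = struct.unpack(_prefix(byteorder) + "III4x", raw)
    return ptr
```

On little endian the 64-bit read happens to return the pointer unchanged, which is why the problem only shows up with big endian userspace, where the same read yields the pointer shifted left by 32.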

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 09/27] Add interrupt handling code

2009-10-21 Thread Alexander Graf
Getting from host state to the guest is only half the story. We also need
to return to our host context and handle whatever happened to get us out of
the guest.

On PowerPC every guest exit is an interrupt. So all we need to do is trap
the host's interrupt handlers and get into our #VMEXIT code to handle it.

PowerPCs also have a register that can add an offset to the interrupt handlers'
addresses, which is what the booke KVM code uses. Unfortunately that is a
hypervisor resource and we also want to be able to run KVM when we're running
in an LPAR. So we have to hook into the Linux interrupt handlers.

Signed-off-by: Alexander Graf ag...@suse.de

---

v3 -> v4:

  - header rename fix
---
 arch/powerpc/kvm/book3s_64_rmhandlers.S |  131 +++
 1 files changed, 131 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_rmhandlers.S

diff --git a/arch/powerpc/kvm/book3s_64_rmhandlers.S b/arch/powerpc/kvm/book3s_64_rmhandlers.S
new file mode 100644
index 0000000..fb7dd2e
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_rmhandlers.S
@@ -0,0 +1,131 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
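+#include <asm/ppc_asm.h>
+#include <asm/kvm_asm.h>
+#include <asm/reg.h>
+#include <asm/page.h>
+#include <asm/asm-offsets.h>
+#include <asm/exception-64s.h>
+
+/*****************************************************************************
+ *                                                                           *
+ *        Real Mode handlers that need to be in low physical memory          *
+ *                                                                           *
+ ****************************************************************************/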
+
+
+.macro INTERRUPT_TRAMPOLINE intno
+
+.global kvmppc_trampoline_\intno
+kvmppc_trampoline_\intno:
+
+   mtspr   SPRN_SPRG_SCRATCH0, r13 /* Save r13 */
+
+   /*
+* First thing to do is to find out if we're coming
+* from a KVM guest or a Linux process.
+*
+* To distinguish, we check a magic byte in the PACA
+*/
+	mfspr	r13, SPRN_SPRG_PACA		/* r13 = PACA */
+	std	r12, (PACA_EXMC + EX_R12)(r13)
+	mfcr	r12
+	stw	r12, (PACA_EXMC + EX_CCR)(r13)
+	lbz	r12, PACA_KVM_IN_GUEST(r13)
+	cmpwi	r12, 0
+	bne	..kvmppc_handler_hasmagic_\intno
+	/* No KVM guest? Then jump back to the Linux handler! */
+	lwz	r12, (PACA_EXMC + EX_CCR)(r13)
+	mtcr	r12
+	ld	r12, (PACA_EXMC + EX_R12)(r13)
+	mfspr	r13, SPRN_SPRG_SCRATCH0		/* r13 = original r13 */
+	b	kvmppc_resume_\intno		/* Get back original handler */
+
+   /* Now we know we're handling a KVM guest */
+..kvmppc_handler_hasmagic_\intno:
+   /* Unset guest state */
+   li  r12, 0
+   stb r12, PACA_KVM_IN_GUEST(r13)
+
+   std r1, (PACA_EXMC+EX_R9)(r13)
+   std r10, (PACA_EXMC+EX_R10)(r13)
+   std r11, (PACA_EXMC+EX_R11)(r13)
+   std r2, (PACA_EXMC+EX_R13)(r13)
+
+   mfsrr0  r10
+   mfsrr1  r11
+
+   /* Restore R1/R2 so we can handle faults */
+   ld  r1, PACAR1(r13)
+   ld  r2, (PACA_EXMC+EX_SRR0)(r13)
+
+   /* Let's store which interrupt we're handling */
+   li  r12, \intno
+
+   /* Jump into the SLB exit code that goes to the highmem handler */
+   b   kvmppc_handler_trampoline_exit
+
+.endm
+
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_SYSTEM_RESET
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_MACHINE_CHECK
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_DATA_STORAGE
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_DATA_SEGMENT
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_INST_STORAGE
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_INST_SEGMENT
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_EXTERNAL
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_ALIGNMENT
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_PROGRAM
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_FP_UNAVAIL
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_DECREMENTER
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_SYSCALL
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_TRACE
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_PERFMON
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_ALTIVEC
+INTERRUPT_TRAMPOLINE   

[PATCH 27/27] Use hrtimers for the decrementer

2009-10-21 Thread Alexander Graf
Following S390's good example we should use hrtimers for the decrementer too!
This patch converts the timer from the old mechanism to hrtimers.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |6 --
 arch/powerpc/kvm/emulate.c  |   18 +++---
 arch/powerpc/kvm/powerpc.c  |   20 ++--
 3 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 2cff5fe..1201f62 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -21,7 +21,8 @@
 #define __POWERPC_KVM_HOST_H__
 
 #include <linux/mutex.h>
-#include <linux/timer.h>
+#include <linux/hrtimer.h>
+#include <linux/interrupt.h>
 #include <linux/types.h>
 #include <linux/kvm_types.h>
 #include <asm/kvm_asm.h>
@@ -250,7 +251,8 @@ struct kvm_vcpu_arch {
 
u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
 
-   struct timer_list dec_timer;
+   struct hrtimer dec_timer;
+   struct tasklet_struct tasklet;
u64 dec_jiffies;
unsigned long pending_exceptions;
 
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index 1ec5e07..4a9ac66 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -18,7 +18,7 @@
  */
 
 #include <linux/jiffies.h>
-#include <linux/timer.h>
+#include <linux/hrtimer.h>
 #include <linux/types.h>
 #include <linux/string.h>
 #include <linux/kvm_host.h>
@@ -79,12 +79,13 @@ static int kvmppc_dec_enabled(struct kvm_vcpu *vcpu)
 
 void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
 {
-	unsigned long nr_jiffies;
+	unsigned long dec_nsec;
 
+	pr_debug("mtDEC: %x\n", vcpu->arch.dec);
 #ifdef CONFIG_PPC64
 	/* POWER4+ triggers a dec interrupt if the value is < 0 */
 	if (vcpu->arch.dec & 0x80000000) {
-		del_timer(&vcpu->arch.dec_timer);
+		hrtimer_try_to_cancel(&vcpu->arch.dec_timer);
 		kvmppc_core_queue_dec(vcpu);
 		return;
 	}
@@ -94,12 +95,15 @@ void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
 * that's how we convert the guest DEC value to the number of
 * host ticks. */
 
+		hrtimer_try_to_cancel(&vcpu->arch.dec_timer);
+		dec_nsec = vcpu->arch.dec;
+		dec_nsec *= 1000;
+		dec_nsec /= tb_ticks_per_usec;
+		hrtimer_start(&vcpu->arch.dec_timer, ktime_set(0, dec_nsec),
+			      HRTIMER_MODE_REL);
 		vcpu->arch.dec_jiffies = get_tb();
-		nr_jiffies = vcpu->arch.dec / tb_ticks_per_jiffy;
-		mod_timer(&vcpu->arch.dec_timer,
-			  get_jiffies_64() + nr_jiffies);
 	} else {
-		del_timer(&vcpu->arch.dec_timer);
+		hrtimer_try_to_cancel(&vcpu->arch.dec_timer);
 	}
 }
 
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 4ae3490..4c582ed 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -23,6 +23,7 @@
 #include <linux/kvm_host.h>
 #include <linux/module.h>
 #include <linux/vmalloc.h>
+#include <linux/hrtimer.h>
 #include <linux/fs.h>
 #include <asm/cputable.h>
 #include <asm/uaccess.h>
@@ -209,10 +210,25 @@ static void kvmppc_decrementer_func(unsigned long data)
}
 }
 
+/*
+ * low level hrtimer wake routine. Because this runs in hardirq context
+ * we schedule a tasklet to do the real work.
+ */
+enum hrtimer_restart kvmppc_decrementer_wakeup(struct hrtimer *timer)
+{
+	struct kvm_vcpu *vcpu;
+
+	vcpu = container_of(timer, struct kvm_vcpu, arch.dec_timer);
+	tasklet_schedule(&vcpu->arch.tasklet);
+
+	return HRTIMER_NORESTART;
+}
+
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 {
-	setup_timer(&vcpu->arch.dec_timer, kvmppc_decrementer_func,
-		    (unsigned long)vcpu);
+	hrtimer_init(&vcpu->arch.dec_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
+	tasklet_init(&vcpu->arch.tasklet, kvmppc_decrementer_func, (ulong)vcpu);
+	vcpu->arch.dec_timer.function = kvmppc_decrementer_wakeup;
 
return 0;
 }
-- 
1.6.0.2
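
The unit conversion in kvmppc_emulate_dec() is easy to check by hand. Below is a small sketch of the two conversions, old and new, using integer arithmetic as the C code does. The function names and the example numbers (a 512 MHz timebase and HZ=250) are assumptions chosen for illustration, not values taken from the patch.

```python
def dec_to_nsec(dec, tb_ticks_per_usec):
    """New code: decrementer ticks to nanoseconds. Multiplying before
    dividing keeps sub-microsecond precision under integer division."""
    return dec * 1000 // tb_ticks_per_usec


def dec_to_jiffies(dec, tb_ticks_per_jiffy):
    """Old code: decrementer ticks to whole jiffies."""
    return dec // tb_ticks_per_jiffy


# Assumed example: 512 timebase ticks per microsecond, 250 jiffies per second.
TB_TICKS_PER_USEC = 512
TB_TICKS_PER_JIFFY = 512 * 1000000 // 250

short_timeout = 51200  # 100 microseconds worth of ticks
print(dec_to_nsec(short_timeout, TB_TICKS_PER_USEC))
print(dec_to_jiffies(short_timeout, TB_TICKS_PER_JIFFY))
```

With these numbers the hrtimer is armed for 100000 ns, while the jiffy-based conversion rounds the same timeout down to zero jiffies, which illustrates the resolution gained by the switch.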



[PATCH 26/27] Use Little Endian for Dirty Bitmap

2009-10-21 Thread Alexander Graf
We currently use host endian long types to store information
in the dirty bitmap.

This works reasonably well on Little Endian targets, because the
u32 after the first contains the next 32 bits. On Big Endian this
breaks completely though, forcing us to be inventive here.

So Ben suggested always using Little Endian, which looks reasonable.

So far the dirty bitmap is only implemented on Little Endian targets. Since
PowerPC would be the first Big Endian platform, we can switch to Little
Endian everywhere with little effort and without breaking existing
targets.

Signed-off-by: Alexander Graf ag...@suse.de
---
 virt/kvm/kvm_main.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 54a272f..c565e5b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -49,6 +49,7 @@
 #include <asm/io.h>
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
+#include <asm-generic/bitops/le.h>
 
 #ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
 #include "coalesced_mmio.h"
@@ -1071,8 +1072,8 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
 		unsigned long rel_gfn = gfn - memslot->base_gfn;
 
 		/* avoid RMW */
-		if (!test_bit(rel_gfn, memslot->dirty_bitmap))
-			set_bit(rel_gfn, memslot->dirty_bitmap);
+		if (!generic_test_le_bit(rel_gfn, memslot->dirty_bitmap))
+			generic___set_le_bit(rel_gfn, memslot->dirty_bitmap);
 	}
 }
 
-- 
1.6.0.2
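
Why host-endian longs break on Big Endian can be seen by looking at which byte a given bit lands in. The sketch below models the bitmap as a bytearray; it assumes 64-bit longs and takes the byte order as a parameter so both cases can be compared on any machine. The function names are invented for this example and are not kernel APIs.

```python
import struct

WORD_BITS = 64  # assumed size of unsigned long on the host


def set_bit_host_long(bitmap, nr, byteorder):
    """Host-endian numbering: bit nr is bit (nr % WORD_BITS) of the long at
    index (nr // WORD_BITS), stored in memory in the host's byte order."""
    fmt = (">" if byteorder == "big" else "<") + "Q"
    offset = (nr // WORD_BITS) * 8
    (word,) = struct.unpack_from(fmt, bitmap, offset)
    word |= 1 << (nr % WORD_BITS)
    struct.pack_into(fmt, bitmap, offset, word)


def set_bit_le(bitmap, nr):
    """Little-endian numbering: bit nr is bit (nr % 8) of byte (nr // 8),
    regardless of the host's byte order or word size."""
    bitmap[nr // 8] |= 1 << (nr % 8)


def le_bit_is_set(bitmap, nr):
    return bool(bitmap[nr // 8] & (1 << (nr % 8)))
```

With host-endian longs, page 0 ends up in byte 0 on a little endian host but in byte 7 on a big endian one, so a consumer that walks the buffer byte by byte or in 32-bit chunks sees different pages. The little-endian numbering gives one layout for every host.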



[PATCH 03/27] Add Book3s definitions

2009-10-21 Thread Alexander Graf
We need quite a bunch of new constants for KVM on Book3s,
so let's define them now.

These constants will be used in later patches.

Signed-off-by: Alexander Graf ag...@suse.de

---

v3 -> v4:

  - remove old kernel compat code
---
 arch/powerpc/include/asm/kvm_asm.h |   39 
 1 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_asm.h b/arch/powerpc/include/asm/kvm_asm.h
index 56bfae5..19ddb35 100644
--- a/arch/powerpc/include/asm/kvm_asm.h
+++ b/arch/powerpc/include/asm/kvm_asm.h
@@ -49,6 +49,45 @@
 #define BOOKE_INTERRUPT_SPE_FP_ROUND 34
 #define BOOKE_INTERRUPT_PERFORMANCE_MONITOR 35
 
+/* book3s */
+
+#define BOOK3S_INTERRUPT_SYSTEM_RESET   0x100
+#define BOOK3S_INTERRUPT_MACHINE_CHECK  0x200
+#define BOOK3S_INTERRUPT_DATA_STORAGE   0x300
+#define BOOK3S_INTERRUPT_DATA_SEGMENT   0x380
+#define BOOK3S_INTERRUPT_INST_STORAGE   0x400
+#define BOOK3S_INTERRUPT_INST_SEGMENT   0x480
+#define BOOK3S_INTERRUPT_EXTERNAL       0x500
+#define BOOK3S_INTERRUPT_ALIGNMENT      0x600
+#define BOOK3S_INTERRUPT_PROGRAM        0x700
+#define BOOK3S_INTERRUPT_FP_UNAVAIL     0x800
+#define BOOK3S_INTERRUPT_DECREMENTER    0x900
+#define BOOK3S_INTERRUPT_SYSCALL        0xc00
+#define BOOK3S_INTERRUPT_TRACE          0xd00
+#define BOOK3S_INTERRUPT_PERFMON        0xf00
+#define BOOK3S_INTERRUPT_ALTIVEC        0xf20
+#define BOOK3S_INTERRUPT_VSX            0xf40
+
+#define BOOK3S_IRQPRIO_SYSTEM_RESET         0
+#define BOOK3S_IRQPRIO_DATA_SEGMENT         1
+#define BOOK3S_IRQPRIO_INST_SEGMENT         2
+#define BOOK3S_IRQPRIO_DATA_STORAGE         3
+#define BOOK3S_IRQPRIO_INST_STORAGE         4
+#define BOOK3S_IRQPRIO_ALIGNMENT            5
+#define BOOK3S_IRQPRIO_PROGRAM              6
+#define BOOK3S_IRQPRIO_FP_UNAVAIL           7
+#define BOOK3S_IRQPRIO_ALTIVEC              8
+#define BOOK3S_IRQPRIO_VSX                  9
+#define BOOK3S_IRQPRIO_SYSCALL              10
+#define BOOK3S_IRQPRIO_MACHINE_CHECK        11
+#define BOOK3S_IRQPRIO_DEBUG                12
+#define BOOK3S_IRQPRIO_EXTERNAL             13
+#define BOOK3S_IRQPRIO_DECREMENTER          14
+#define BOOK3S_IRQPRIO_PERFORMANCE_MONITOR  15
+#define BOOK3S_IRQPRIO_MAX                  16
+
+#define BOOK3S_HFLAG_DCBZ32                 0x1
+
 #define RESUME_FLAG_NV          (1<<0)  /* Reload guest nonvolatile state? */
 #define RESUME_FLAG_HOST        (1<<1)  /* Resume host? */
 
-- 
1.6.0.2
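
This patch only defines the constants; how they are consumed is left to later patches in the series. As a rough illustration of one plausible use, the sketch below keeps pending interrupts as a bitmask indexed by the IRQPRIO values and delivers the lowest-numbered one first. Both the bitmask representation and the "lower number wins" rule are assumptions made for this example, and queue_irqprio and next_irqprio are hypothetical names.

```python
# A few of the priority values defined in the patch above.
BOOK3S_IRQPRIO_SYSTEM_RESET = 0
BOOK3S_IRQPRIO_EXTERNAL = 13
BOOK3S_IRQPRIO_DECREMENTER = 14
BOOK3S_IRQPRIO_MAX = 16


def queue_irqprio(pending, prio):
    """Mark an interrupt of the given priority as pending."""
    return pending | (1 << prio)


def dequeue_irqprio(pending, prio):
    """Clear a pending interrupt once it has been delivered."""
    return pending & ~(1 << prio)


def next_irqprio(pending):
    """Return the lowest-numbered pending priority, or None if idle.
    Assumption: a lower number means it is delivered first."""
    for prio in range(BOOK3S_IRQPRIO_MAX):
        if pending & (1 << prio):
            return prio
    return None
```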



[PATCH 20/27] Split init_new_context and destroy_context

2009-10-21 Thread Alexander Graf
For KVM we need to allocate a new context id, but don't really care about
all the mm context around it.

So let's split the alloc and destroy functions for the context id, so we can
grab one without allocating an mm context.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/mmu_context.h |5 +
 arch/powerpc/mm/mmu_context_hash64.c   |   24 +---
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index b34e94d..66b35d0 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -23,6 +23,11 @@ extern void switch_slb(struct task_struct *tsk, struct mm_struct *mm);
 extern void set_context(unsigned long id, pgd_t *pgd);
 
 #ifdef CONFIG_PPC_BOOK3S_64
+extern int __init_new_context(void);
+extern void __destroy_context(int context_id);
+#endif
+
+#ifdef CONFIG_PPC_BOOK3S_64
 static inline void mmu_context_init(void) { }
 #else
 extern void mmu_context_init(void);
diff --git a/arch/powerpc/mm/mmu_context_hash64.c b/arch/powerpc/mm/mmu_context_hash64.c
index dbeb86a..b9e4cc2 100644
--- a/arch/powerpc/mm/mmu_context_hash64.c
+++ b/arch/powerpc/mm/mmu_context_hash64.c
@@ -18,6 +18,7 @@
 #include <linux/mm.h>
 #include <linux/spinlock.h>
 #include <linux/idr.h>
+#include <linux/module.h>
 
 #include <asm/mmu_context.h>
 
@@ -32,7 +33,7 @@ static DEFINE_IDR(mmu_context_idr);
 #define NO_CONTEXT	0
 #define MAX_CONTEXT	((1UL << 19) - 1)
 
-int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+int __init_new_context(void)
 {
int index;
int err;
@@ -57,6 +58,18 @@ again:
return -ENOMEM;
}
 
+   return index;
+}
+EXPORT_SYMBOL_GPL(__init_new_context);
+
+int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+{
+	int index;
+
+	index = __init_new_context();
+	if (index < 0)
+		return index;
+
/* The old code would re-promote on fork, we don't do that
 * when using slices as it could cause problem promoting slices
 * that have been forced down to 4K
@@ -68,11 +81,16 @@ again:
return 0;
 }
 
-void destroy_context(struct mm_struct *mm)
+void __destroy_context(int context_id)
 {
 	spin_lock(&mmu_context_lock);
-	idr_remove(&mmu_context_idr, mm->context.id);
+	idr_remove(&mmu_context_idr, context_id);
 	spin_unlock(&mmu_context_lock);
+}
+EXPORT_SYMBOL_GPL(__destroy_context);
 
+void destroy_context(struct mm_struct *mm)
+{
+	__destroy_context(mm->context.id);
 	mm->context.id = NO_CONTEXT;
 }
-- 
1.6.0.2
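
The shape of this refactoring, independent of the kernel, is "separate the id allocator from the object that normally owns the id". The following sketch shows that split with a toy allocator; ContextIds and Mm are stand-ins invented for this example, and the linear search replaces the kernel's IDR only for brevity.

```python
import threading

NO_CONTEXT = 0
MAX_CONTEXT = (1 << 19) - 1


class ContextIds:
    """Bare id allocator: the role of __init_new_context/__destroy_context."""

    def __init__(self):
        self._lock = threading.Lock()
        self._used = set()

    def alloc(self):
        with self._lock:
            for index in range(1, MAX_CONTEXT + 1):
                if index not in self._used:
                    self._used.add(index)
                    return index
        raise MemoryError("no free context id")

    def free(self, context_id):
        with self._lock:
            self._used.discard(context_id)


class Mm:
    """Stand-in for mm_struct; only the context id matters here."""

    def __init__(self):
        self.context_id = NO_CONTEXT


def init_new_context(ids, mm):
    """Wrapper for regular processes: allocate an id and attach it to the mm."""
    mm.context_id = ids.alloc()


def destroy_context(ids, mm):
    """Wrapper for regular processes: release the id and reset the mm."""
    ids.free(mm.context_id)
    mm.context_id = NO_CONTEXT
```

A caller such as KVM can then use ids.alloc() and ids.free() directly to get a context id without creating any Mm object.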



[PATCH 17/27] Make head_64.S aware of KVM real mode code

2009-10-21 Thread Alexander Graf
We need to run some KVM trampoline code in real mode. Unfortunately, real mode
only covers 8MB on Cell so we need to squeeze ourselves as low as possible.

Also, we need to trap interrupts to get us back from guest state to host state
without telling Linux about it.

This patch adds interrupt traps and includes the KVM code that requires real
mode in the real mode parts of Linux.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/exception-64s.h |2 ++
 arch/powerpc/kernel/exceptions-64s.S |8 
 arch/powerpc/kernel/head_64.S|7 +++
 3 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index a98653b..57c4000 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -147,6 +147,7 @@
.globl label##_pSeries; \
 label##_pSeries:   \
HMT_MEDIUM; \
+   DO_KVM  n;  \
mtspr   SPRN_SPRG_SCRATCH0,r13; /* save r13 */  \
EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, label##_common)
 
@@ -170,6 +171,7 @@ label##_pSeries:\
.globl label##_pSeries; \
 label##_pSeries:   \
HMT_MEDIUM; \
+   DO_KVM  n;  \
mtspr   SPRN_SPRG_SCRATCH0,r13; /* save r13 */  \
mfspr   r13,SPRN_SPRG_PACA; /* get paca address into r13 */ \
std r9,PACA_EXGEN+EX_R9(r13);   /* save r9, r10 */  \
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 1808876..fc3ead0 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -41,6 +41,7 @@ __start_interrupts:
. = 0x200
 _machine_check_pSeries:
HMT_MEDIUM
+   DO_KVM  0x200
mtspr   SPRN_SPRG_SCRATCH0,r13  /* save r13 */
EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common)
 
@@ -48,6 +49,7 @@ _machine_check_pSeries:
.globl data_access_pSeries
 data_access_pSeries:
HMT_MEDIUM
+   DO_KVM  0x300
mtspr   SPRN_SPRG_SCRATCH0,r13
 BEGIN_FTR_SECTION
mfspr   r13,SPRN_SPRG_PACA
@@ -77,6 +79,7 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_SLB)
.globl data_access_slb_pSeries
 data_access_slb_pSeries:
HMT_MEDIUM
+   DO_KVM  0x380
mtspr   SPRN_SPRG_SCRATCH0,r13
mfspr   r13,SPRN_SPRG_PACA  /* get paca address into r13 */
std r3,PACA_EXSLB+EX_R3(r13)
@@ -115,6 +118,7 @@ data_access_slb_pSeries:
.globl instruction_access_slb_pSeries
 instruction_access_slb_pSeries:
HMT_MEDIUM
+   DO_KVM  0x480
mtspr   SPRN_SPRG_SCRATCH0,r13
mfspr   r13,SPRN_SPRG_PACA  /* get paca address into r13 */
std r3,PACA_EXSLB+EX_R3(r13)
@@ -154,6 +158,7 @@ instruction_access_slb_pSeries:
.globl  system_call_pSeries
 system_call_pSeries:
HMT_MEDIUM
+   DO_KVM  0xc00
 BEGIN_FTR_SECTION
cmpdi   r0,0x1ebe
 	beq-	1f
@@ -186,12 +191,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
 * trickery is thus necessary
 */
. = 0xf00
+   DO_KVM  0xf00
b   performance_monitor_pSeries
 
. = 0xf20
+   DO_KVM  0xf20
b   altivec_unavailable_pSeries
 
. = 0xf40
+   DO_KVM  0xf40
b   vsx_unavailable_pSeries
 
 #ifdef CONFIG_CBE_RAS
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index c38afdb..9258074 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -37,6 +37,7 @@
 #include <asm/firmware.h>
 #include <asm/page_64.h>
 #include <asm/irqflags.h>
+#include <asm/kvm_book3s_64_asm.h>
 
 /* The physical memory is layed out such that the secondary processor
  * spin code sits at 0x0000...0x00ff. On server, the vectors follow
@@ -165,6 +166,12 @@ exception_marker:
 #include "exceptions-64s.S"
 #endif
 
+/* KVM trampoline code needs to be close to the interrupt handlers */
+
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+#include "../kvm/book3s_64_rmhandlers.S"
+#endif
+
+
 _GLOBAL(generic_secondary_thread_init)
mr  r24,r3
 
-- 
1.6.0.2
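
The control flow that the DO_KVM hooks and the trampoline from patch 09/27 implement together can be modelled in a few lines. This is only a sketch of the decision being made: it ignores all register saving and restoring, represents the PACA as a dictionary, and uses the hypothetical names dispatch_interrupt, linux_handlers and kvm_exit.

```python
def dispatch_interrupt(intno, paca, linux_handlers, kvm_exit):
    """Route an interrupt to KVM if it arrived while a guest was running,
    otherwise fall through to the normal Linux handler for that vector."""
    if paca["kvm_in_guest"]:
        # Mirrors "Unset guest state": we are back in host context now.
        paca["kvm_in_guest"] = 0
        return kvm_exit(intno)
    return linux_handlers[intno]()


# Usage: one vector, entered once from guest context and once from the host.
paca = {"kvm_in_guest": 1}
linux_handlers = {0x900: lambda: "linux decrementer"}
print(dispatch_interrupt(0x900, paca, linux_handlers, lambda n: ("kvm", n)))
print(dispatch_interrupt(0x900, paca, linux_handlers, lambda n: ("kvm", n)))
```

The first call is handed to the KVM exit path with the vector number, and the second reaches the Linux handler because the guest flag was cleared on the way out.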



[PATCH 15/27] Add mfdec emulation

2009-10-21 Thread Alexander Graf
We support setting the DEC to a certain value right now. Doing that basically
triggers the CPU local timer.

But there's also an mfdec instruction that enables the OS to read the decrementer.

This is required at least by all desktop and server PowerPC Linux kernels. It
can't really hurt to allow embedded ones to do it as well though.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/emulate.c |   13 -
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index 7737146..50d411d 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -66,12 +66,14 @@
 
 void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
 {
+	unsigned long nr_jiffies;
+
 	if (vcpu->arch.tcr & TCR_DIE) {
 		/* The decrementer ticks at the same rate as the timebase, so
 		 * that's how we convert the guest DEC value to the number of
 		 * host ticks. */
-		unsigned long nr_jiffies;
 
+		vcpu->arch.dec_jiffies = mftb();
 		nr_jiffies = vcpu->arch.dec / tb_ticks_per_jiffy;
 		mod_timer(&vcpu->arch.dec_timer,
 			  get_jiffies_64() + nr_jiffies);
@@ -211,6 +213,15 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
/* Note: SPRG4-7 are user-readable, so we don't get
 * a trap. */
 
+		case SPRN_DEC:
+		{
+			u64 jd = mftb() - vcpu->arch.dec_jiffies;
+			vcpu->arch.gpr[rt] = vcpu->arch.dec - jd;
+#ifdef DEBUG_EMUL
+			printk(KERN_INFO "mfDEC: %x - %llx = %lx\n", vcpu->arch.dec, jd, vcpu->arch.gpr[rt]);
+#endif
+			break;
+		}
 		default:
 			emulated = kvmppc_core_emulate_mfspr(vcpu, sprn, rt);
 			if (emulated == EMULATE_FAIL) {
-- 
1.6.0.2
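
The emulation boils down to "value written minus timebase ticks elapsed since the write". The sketch below models that with an injectable timebase so it can be run anywhere. The Decrementer class is hypothetical, and the 32-bit mask on the result is an assumption of this example (the patch itself stores the plain difference).

```python
MASK32 = 0xFFFFFFFF


class Decrementer:
    def __init__(self, read_timebase):
        self._read_timebase = read_timebase  # stands in for mftb()
        self.dec = 0
        self.dec_stamp = 0  # plays the role of vcpu->arch.dec_jiffies

    def mtdec(self, value):
        """Guest writes DEC: remember the value and when it was written."""
        self.dec = value & MASK32
        self.dec_stamp = self._read_timebase()

    def mfdec(self):
        """Guest reads DEC: subtract the ticks elapsed since the write."""
        elapsed = self._read_timebase() - self.dec_stamp
        return (self.dec - elapsed) & MASK32
```

Because the decrementer ticks at the same rate as the timebase, no scaling is needed between the two; once more ticks have elapsed than were written, the value wraps below zero, which the mask turns into a large unsigned number.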



[PATCH 14/27] Add book3s_64 specific opcode emulation

2009-10-21 Thread Alexander Graf
There are generic parts of PowerPC that can be shared across all
implementations and specific parts that only apply to BookE or desktop PPCs.

This patch adds emulation for desktop specific opcodes that don't apply
to BookE CPUs.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_64_emulate.c |  338 ++
 1 files changed, 338 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_emulate.c

diff --git a/arch/powerpc/kvm/book3s_64_emulate.c b/arch/powerpc/kvm/book3s_64_emulate.c
new file mode 100644
index 0000000..60cd64a
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_emulate.c
@@ -0,0 +1,338 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include <asm/kvm_ppc.h>
+#include <asm/disassemble.h>
+#include <asm/kvm_book3s.h>
+#include <asm/reg.h>
+
+#define OP_19_XOP_RFID          18
+#define OP_19_XOP_RFI           50
+
+#define OP_31_XOP_MFMSR         83
+#define OP_31_XOP_MTMSR         146
+#define OP_31_XOP_MTMSRD        178
+#define OP_31_XOP_MTSRIN        242
+#define OP_31_XOP_TLBIEL        274
+#define OP_31_XOP_TLBIE         306
+#define OP_31_XOP_SLBMTE        402
+#define OP_31_XOP_SLBIE         434
+#define OP_31_XOP_SLBIA         498
+#define OP_31_XOP_MFSRIN        659
+#define OP_31_XOP_SLBMFEV       851
+#define OP_31_XOP_EIOIO         854
+#define OP_31_XOP_SLBMFEE       915
+// DCBZ is actually 1014, but we patch it to 1010 so we get a trap
+#define OP_31_XOP_DCBZ          1010
+
+int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
+   unsigned int inst, int *advance)
+{
+   int emulated = EMULATE_DONE;
+
+   switch (get_op(inst)) {
+   case 19:
+   switch (get_xop(inst)) {
+   case OP_19_XOP_RFID:
+   case OP_19_XOP_RFI:
+   vcpu-arch.pc = vcpu-arch.srr0;
+   kvmppc_set_msr(vcpu, vcpu-arch.srr1);
+   *advance = 0;
+   break;
+
+   default:
+   emulated = EMULATE_FAIL;
+   break;
+   }
+   break;
+   case 31:
+   switch (get_xop(inst)) {
+   case OP_31_XOP_MFMSR:
+   vcpu-arch.gpr[get_rt(inst)] = vcpu-arch.msr;
+   break;
+   case OP_31_XOP_MTMSRD:
+   {
+   ulong rs = vcpu->arch.gpr[get_rs(inst)];
+   if (inst & 0x1) {
+   vcpu->arch.msr &= ~(MSR_RI | MSR_EE);
+   vcpu->arch.msr |= rs & (MSR_RI | MSR_EE);
+   } else
+   kvmppc_set_msr(vcpu, rs);
+   break;
+   }
+   case OP_31_XOP_MTMSR:
+   kvmppc_set_msr(vcpu, vcpu->arch.gpr[get_rs(inst)]);
+   break;
+   case OP_31_XOP_MFSRIN:
+   {
+   int srnum;
+
+   srnum = (vcpu->arch.gpr[get_rb(inst)] >> 28) & 0xf;
+   if (vcpu->arch.mmu.mfsrin) {
+   u32 sr;
+   sr = vcpu->arch.mmu.mfsrin(vcpu, srnum);
+   vcpu->arch.gpr[get_rt(inst)] = sr;
+   }
+   break;
+   }
+   case OP_31_XOP_MTSRIN:
+   vcpu->arch.mmu.mtsrin(vcpu,
+   (vcpu->arch.gpr[get_rb(inst)] >> 28) & 0xf,
+   vcpu->arch.gpr[get_rs(inst)]);
+   break;
+   case OP_31_XOP_TLBIE:
+   case OP_31_XOP_TLBIEL:
+   {
+   bool large = (inst & 0x0020) ? true : false;
+   ulong addr = vcpu->arch.gpr[get_rb(inst)];
+   vcpu->arch.mmu.tlbie(vcpu, addr, large);
+   break;
+   }
+   case OP_31_XOP_EIOIO:
+   break;
+   case OP_31_XOP_SLBMTE:
+   if (!vcpu->arch.mmu.slbmte)
+   return EMULATE_FAIL;
+
+

[PATCH 13/27] Add book3s_32 guest MMU

2009-10-21 Thread Alexander Graf
This patch adds an implementation for a G3/G4 MMU, so we can run G3 and
G4 guests in KVM on Book3s_64.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_32_mmu.c |  354 ++
 1 files changed, 354 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_32_mmu.c

diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
new file mode 100644
index 000..134c186
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -0,0 +1,354 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+#include <linux/highmem.h>
+
+#include <asm/tlbflush.h>
+#include <asm/kvm_ppc.h>
+#include <asm/kvm_book3s.h>
+
+// #define DEBUG_MMU
+// #define DEBUG_MMU_PTE
+// #define DEBUG_MMU_PTE_IP 0xfff14c40
+
+static int kvmppc_mmu_book3s_32_xlate_bat(struct kvm_vcpu *vcpu, gva_t eaddr, struct kvmppc_pte *pte, bool data);
+
+static struct kvmppc_sr *kvmppc_mmu_book3s_32_find_sr(
+   struct kvmppc_vcpu_book3s *vcpu_book3s, gva_t eaddr)
+{
+   return &vcpu_book3s->sr[(eaddr >> 28) & 0xf];
+}
+
+static u64 kvmppc_mmu_book3s_32_ea_to_vp(struct kvm_vcpu *vcpu, gva_t eaddr, bool data)
+{
+   struct kvmppc_sr *sre = kvmppc_mmu_book3s_32_find_sr(to_book3s(vcpu), eaddr);
+   struct kvmppc_pte pte;
+
+   if (!kvmppc_mmu_book3s_32_xlate_bat(vcpu, eaddr, &pte, data))
+   return pte.vpage;
+
+   return (((u64)eaddr >> 12) & 0x) | (((u64)sre->vsid) << 16);
+}
+
+static void kvmppc_mmu_book3s_32_reset_msr(struct kvm_vcpu *vcpu)
+{
+   kvmppc_set_msr(vcpu, 0);
+}
+
+static hva_t kvmppc_mmu_book3s_32_get_pteg(struct kvmppc_vcpu_book3s *vcpu_book3s,
+ struct kvmppc_sr *sre, gva_t eaddr,
+ bool primary)
+{
+   u32 page, hash, pteg, htabmask;
+   hva_t r;
+
+   page = (eaddr & 0x0FFF) >> 12;
+   htabmask = ((vcpu_book3s->sdr1 & 0x1FF) << 16) | 0xFFC0;
+
+   hash = ((sre->vsid ^ page) << 6);
+   if (!primary)
+   hash = ~hash;
+   hash &= htabmask;
+
+   pteg = (vcpu_book3s->sdr1 & 0x) | hash;
+
+#ifdef DEBUG_MMU
+   printk(KERN_INFO "MMU: pc=0x%lx eaddr=0x%lx sdr1=0x%llx pteg=0x%x vsid=0x%x\n",
+   vcpu_book3s->vcpu.arch.pc, eaddr, vcpu_book3s->sdr1, pteg, sre->vsid);
+#endif
+
+   r = gfn_to_hva(vcpu_book3s->vcpu.kvm, pteg >> PAGE_SHIFT);
+   if (kvm_is_error_hva(r))
+   return r;
+   return r | (pteg & ~PAGE_MASK);
+}
+
+static u32 kvmppc_mmu_book3s_32_get_ptem(struct kvmppc_sr *sre, gva_t eaddr,
+   bool primary)
+{
+   return ((eaddr & 0x0fff) >> 22) | (sre->vsid << 7) |
+  (primary ? 0 : 0x40) | 0x8000;
+}
+
+static int kvmppc_mmu_book3s_32_xlate_bat(struct kvm_vcpu *vcpu, gva_t eaddr, struct kvmppc_pte *pte, bool data)
+{
+   struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+   struct kvmppc_bat *bat;
+   int i;
+
+   for (i = 0; i < 8; i++) {
+   if (data)
+   bat = &vcpu_book3s->dbat[i];
+   else
+   bat = &vcpu_book3s->ibat[i];
+
+   if (vcpu->arch.msr & MSR_PR) {
+   if (!bat->vp)
+   continue;
+   } else {
+   if (!bat->vs)
+   continue;
+   }
+
+#ifdef DEBUG_MMU_PTE
+#ifdef DEBUG_MMU_PTE_IP
+   if (vcpu->arch.pc == DEBUG_MMU_PTE_IP)
+#endif
+   {
+   printk(KERN_INFO "%cBAT %02d: 0x%lx - 0x%x (0x%x)\n", data ? 'd' : 'i', i, eaddr, bat->bepi, bat->bepi_mask);
+   }
+#endif
+   if ((eaddr & bat->bepi_mask) == bat->bepi) {
+   pte->raddr = bat->brpn | (eaddr & ~bat->bepi_mask);
+   pte->vpage = (eaddr >> 12) | VSID_BAT;
+   pte->may_read = bat->pp;
+   pte->may_write = bat->pp > 1;
+   pte->may_execute = true;
+   if (!pte->may_read) {
+   printk(KERN_INFO "BAT is not readable!\n");
+   continue;
+   }
+   

[PATCH 02/27] Pass PVR in sregs

2009-10-21 Thread Alexander Graf
Right now sregs is unused on PPC, so we can use it for initialization
of the CPU.

KVM on BookE always virtualizes the host CPU. On Book3s we go a step further
and take the PVR from userspace that tells us what kind of CPU we are supposed
to virtualize, because we support Book3s_32 and Book3s_64 guests.

In order to get that information, we use the sregs ioctl, because we don't
want to reset the guest CPU on every normal register set.
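
As a rough illustration of the layout described above (not part of the patch), the following userspace sketch mirrors the extended struct kvm_sregs with uint32_t standing in for __u32. The names sregs_sketch and sregs_sketch_set_pvr are hypothetical; real userspace would fill the kernel's structure and pass it to the sregs ioctl.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Userspace mirror of the patched struct kvm_sregs (assumption:
 * uint32_t stands in for the kernel's __u32). */
struct sregs_sketch {
	uint32_t pvr;      /* processor version register the guest should see */
	char pad[1020];    /* reserved space; keeps the struct at 1024 bytes */
};

/* Hypothetical helper: clear the structure and record the PVR that
 * userspace wants the guest CPU to report. */
static void sregs_sketch_set_pvr(struct sregs_sketch *sregs, uint32_t pvr)
{
	assert(sregs != NULL);
	memset(sregs, 0, sizeof(*sregs));
	sregs->pvr = pvr;
}
```

Because the PVR travels in its own structure, a normal register update never has to touch it, which matches the stated goal of not resetting the guest CPU on every register set.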

Signed-off-by: Alexander Graf ag...@suse.de

---

v4 -> v5

  - make PVR 32 bits
---
 arch/powerpc/include/asm/kvm.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h
index bb2de6a..c9ca97f 100644
--- a/arch/powerpc/include/asm/kvm.h
+++ b/arch/powerpc/include/asm/kvm.h
@@ -46,6 +46,8 @@ struct kvm_regs {
 };
 
 struct kvm_sregs {
+   __u32 pvr;
+   char pad[1020];
 };
 
 struct kvm_fpu {
-- 
1.6.0.2

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 24/27] Include Book3s_64 target in buildsystem

2009-10-21 Thread Alexander Graf
Now we have everything in place to be able to build KVM, so let's add it
as config option and in the Makefile.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/Kconfig  |   17 +
 arch/powerpc/kvm/Makefile |   27 +++
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index c299268..07703f7 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -21,6 +21,23 @@ config KVM
select PREEMPT_NOTIFIERS
select ANON_INODES
 
+config KVM_BOOK3S_64_HANDLER
+   bool
+
+config KVM_BOOK3S_64
+   tristate "KVM support for PowerPC book3s_64 processors"
+   depends on EXPERIMENTAL && PPC64
+   select KVM
+   select KVM_BOOK3S_64_HANDLER
+   ---help---
+ Support running unmodified book3s_64 and book3s_32 guest kernels
+ in virtual machines on book3s_64 host processors.
+
+ This module provides access to the hardware capabilities through
+ a character device node named /dev/kvm.
+
+ If unsure, say N.
+
 config KVM_440
bool "KVM support for PowerPC 440 processors"
depends on EXPERIMENTAL && 44x
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 37655fe..56484d6 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -12,26 +12,45 @@ CFLAGS_44x_tlb.o  := -I.
 CFLAGS_e500_tlb.o := -I.
 CFLAGS_emulate.o  := -I.
 
-kvm-objs := $(common-objs-y) powerpc.o emulate.o
+common-objs-y += powerpc.o emulate.o
 obj-$(CONFIG_KVM_EXIT_TIMING) += timing.o
-obj-$(CONFIG_KVM) += kvm.o
+obj-$(CONFIG_KVM_BOOK3S_64_HANDLER) += book3s_64_exports.o
 
 AFLAGS_booke_interrupts.o := -I$(obj)
 
 kvm-440-objs := \
+   $(common-objs-y) \
booke.o \
booke_emulate.o \
booke_interrupts.o \
44x.o \
44x_tlb.o \
44x_emulate.o
-obj-$(CONFIG_KVM_440) += kvm-440.o
+kvm-objs-$(CONFIG_KVM_440) := $(kvm-440-objs)
 
 kvm-e500-objs := \
+   $(common-objs-y) \
booke.o \
booke_emulate.o \
booke_interrupts.o \
e500.o \
e500_tlb.o \
e500_emulate.o
-obj-$(CONFIG_KVM_E500) += kvm-e500.o
+kvm-objs-$(CONFIG_KVM_E500) := $(kvm-e500-objs)
+
+kvm-book3s_64-objs := \
+   $(common-objs-y) \
+   book3s.o \
+   book3s_64_emulate.o \
+   book3s_64_interrupts.o \
+   book3s_64_mmu_host.o \
+   book3s_64_mmu.o \
+   book3s_32_mmu.o
+kvm-objs-$(CONFIG_KVM_BOOK3S_64) := $(kvm-book3s_64-objs)
+
+kvm-objs := $(kvm-objs-m) $(kvm-objs-y)
+
+obj-$(CONFIG_KVM_440) += kvm.o
+obj-$(CONFIG_KVM_E500) += kvm.o
+obj-$(CONFIG_KVM_BOOK3S_64) += kvm.o
+
-- 
1.6.0.2



[PATCH 16/27] Add desktop PowerPC specific emulation

2009-10-21 Thread Alexander Graf
A few opcodes behave differently on desktop and embedded PowerPC cores.
In order to reflect those differences, let's add some #ifdef code to emulate.c.

We could probably also handle them in the core specific emulation files, but I
would prefer to reuse as much code as possible.
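
To make the desktop/embedded split concrete, here is a minimal standalone sketch of the decrementer decision that the patch adds to kvmppc_emulate_dec(). It is an illustration under assumptions, not the kernel code: struct dec_plan and plan_dec() are hypothetical names, and the timer and interrupt calls are reduced to flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of what the emulation decides to do with DEC. */
struct dec_plan {
	bool fire_now;        /* queue a decrementer interrupt right away */
	bool arm_timer;       /* program the host timer instead */
	uint64_t nr_jiffies;  /* timer delay in host jiffies (if arm_timer) */
};

static struct dec_plan plan_dec(uint32_t dec, bool desktop_core,
				bool tcr_die, uint64_t tb_ticks_per_jiffy)
{
	struct dec_plan plan = { false, false, 0 };

	assert(tb_ticks_per_jiffy != 0);

	/* Desktop cores: a DEC value with the sign bit set counts as < 0
	 * and triggers the interrupt immediately. */
	if (desktop_core && (dec >> 31)) {
		plan.fire_now = true;
		return plan;
	}

	/* Desktop cores always have the decrementer enabled; embedded
	 * cores gate it on the TCR[DIE] bit. */
	if (desktop_core || tcr_die) {
		plan.arm_timer = true;
		/* DEC ticks at the timebase rate, so divide to get jiffies. */
		plan.nr_jiffies = dec / tb_ticks_per_jiffy;
	}
	return plan;
}
```

Keeping the difference inside one small predicate is the same design choice the patch makes with kvmppc_dec_enabled(): the shared path stays identical and only the enable check varies per core family.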

Signed-off-by: Alexander Graf ag...@suse.de

---

v4 -> v5:

  - use get_tb instead of mftb
  - make ppc32 and ppc64 emulation share more code
---
 arch/powerpc/kvm/emulate.c |   49 +++-
 1 files changed, 39 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index 50d411d..1ec5e07 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -32,6 +32,7 @@
 #include "trace.h"
 
 #define OP_TRAP 3
+#define OP_TRAP_64 2
 
 #define OP_31_XOP_LWZX  23
 #define OP_31_XOP_LBZX  87
@@ -64,16 +65,36 @@
 #define OP_STH  44
 #define OP_STHU 45
 
+#ifdef CONFIG_PPC64
+static int kvmppc_dec_enabled(struct kvm_vcpu *vcpu)
+{
+   return 1;
+}
+#else
+static int kvmppc_dec_enabled(struct kvm_vcpu *vcpu)
+{
+   return vcpu->arch.tcr & TCR_DIE;
+}
+#endif
+
 void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
 {
unsigned long nr_jiffies;
 
-   if (vcpu->arch.tcr & TCR_DIE) {
+#ifdef CONFIG_PPC64
+   /* POWER4+ triggers a dec interrupt if the value is < 0 */
+   if (vcpu->arch.dec & 0x8000) {
+   del_timer(&vcpu->arch.dec_timer);
+   kvmppc_core_queue_dec(vcpu);
+   return;
+   }
+#endif
+   if (kvmppc_dec_enabled(vcpu)) {
/* The decrementer ticks at the same rate as the timebase, so
 * that's how we convert the guest DEC value to the number of
 * host ticks. */
 
-   vcpu->arch.dec_jiffies = mftb();
+   vcpu->arch.dec_jiffies = get_tb();
nr_jiffies = vcpu->arch.dec / tb_ticks_per_jiffy;
mod_timer(&vcpu->arch.dec_timer,
  get_jiffies_64() + nr_jiffies);
@@ -113,9 +134,15 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
/* this default type might be overwritten by subcategories */
kvmppc_set_exit_type(vcpu, EMULATED_INST_EXITS);
 
+   pr_debug(KERN_INFO "Emulating opcode %d / %d\n", get_op(inst), get_xop(inst));
+
switch (get_op(inst)) {
case OP_TRAP:
+#ifdef CONFIG_PPC64
+   case OP_TRAP_64:
+#else
vcpu->arch.esr |= ESR_PTR;
+#endif
kvmppc_core_queue_program(vcpu);
advance = 0;
break;
@@ -190,17 +217,19 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
case SPRN_SRR1:
vcpu->arch.gpr[rt] = vcpu->arch.srr1; break;
case SPRN_PVR:
-   vcpu->arch.gpr[rt] = mfspr(SPRN_PVR); break;
+   vcpu->arch.gpr[rt] = vcpu->arch.pvr; break;
case SPRN_PIR:
-   vcpu->arch.gpr[rt] = mfspr(SPRN_PIR); break;
+   vcpu->arch.gpr[rt] = vcpu->vcpu_id; break;
+   case SPRN_MSSSR0:
+   vcpu-arch.gpr[rt] = 0; break;
 
/* Note: mftb and TBRL/TBWL are user-accessible, so
 * the guest can always access the real TB anyways.
 * In fact, we probably will never see these traps. */
case SPRN_TBWL:
-   vcpu->arch.gpr[rt] = mftbl(); break;
+   vcpu->arch.gpr[rt] = get_tb() >> 32; break;
case SPRN_TBWU:
-   vcpu->arch.gpr[rt] = mftbu(); break;
+   vcpu->arch.gpr[rt] = get_tb(); break;
 
case SPRN_SPRG0:
vcpu->arch.gpr[rt] = vcpu->arch.sprg0; break;
@@ -215,11 +244,9 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 
case SPRN_DEC:
{
-   u64 jd = mftb() - vcpu->arch.dec_jiffies;
+   u64 jd = get_tb() - vcpu->arch.dec_jiffies;
vcpu->arch.gpr[rt] = vcpu->arch.dec - jd;
-#ifdef DEBUG_EMUL
-   printk(KERN_INFO "mfDEC: %x - %llx = %lx\n", vcpu->arch.dec, jd, vcpu->arch.gpr[rt]);
-#endif
+   pr_debug(KERN_INFO "mfDEC: %x - %llx = %lx\n", vcpu->arch.dec, jd, vcpu->arch.gpr[rt]);
break;
}
default:
@@ -271,6 +298,8 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
case SPRN_TBWL: break;
case SPRN_TBWU: break;
 
+   case SPRN_MSSSR0: break;
+
   

[PATCH 04/27] Add Book3s fields to vcpu structs

2009-10-21 Thread Alexander Graf
We need to store more information than we currently have for vcpus
when running on Book3s.

So let's extend the internal struct definitions.

Signed-off-by: Alexander Graf ag...@suse.de

---

v3 -> v4:

  - use context_id instead of mm_context

v4 -> v5:

  - always include pvr in vcpu struct
---
 arch/powerpc/include/asm/kvm_host.h |   73 ++-
 1 files changed, 72 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index c9c930e..2cff5fe 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -37,6 +37,8 @@
 #define KVM_NR_PAGE_SIZES  1
 #define KVM_PAGES_PER_HPAGE(x) (1UL<<31)
 
+#define HPTEG_CACHE_NUM 1024
+
 struct kvm;
 struct kvm_run;
 struct kvm_vcpu;
@@ -63,6 +65,17 @@ struct kvm_vcpu_stat {
u32 dec_exits;
u32 ext_intr_exits;
u32 halt_wakeup;
+#ifdef CONFIG_PPC64
+   u32 pf_storage;
+   u32 pf_instruc;
+   u32 sp_storage;
+   u32 sp_instruc;
+   u32 queue_intr;
+   u32 ld;
+   u32 ld_slow;
+   u32 st;
+   u32 st_slow;
+#endif
 };
 
 enum kvm_exit_types {
@@ -109,9 +122,53 @@ struct kvmppc_exit_timing {
 struct kvm_arch {
 };
 
+struct kvmppc_pte {
+   u64 eaddr;
+   u64 vpage;
+   u64 raddr;
+   bool may_read;
+   bool may_write;
+   bool may_execute;
+};
+
+struct kvmppc_mmu {
+   /* book3s_64 only */
+   void (*slbmte)(struct kvm_vcpu *vcpu, u64 rb, u64 rs);
+   u64  (*slbmfee)(struct kvm_vcpu *vcpu, u64 slb_nr);
+   u64  (*slbmfev)(struct kvm_vcpu *vcpu, u64 slb_nr);
+   void (*slbie)(struct kvm_vcpu *vcpu, u64 slb_nr);
+   void (*slbia)(struct kvm_vcpu *vcpu);
+   /* book3s */
+   void (*mtsrin)(struct kvm_vcpu *vcpu, u32 srnum, ulong value);
+   u32  (*mfsrin)(struct kvm_vcpu *vcpu, u32 srnum);
+   int  (*xlate)(struct kvm_vcpu *vcpu, gva_t eaddr, struct kvmppc_pte *pte, bool data);
+   void (*reset_msr)(struct kvm_vcpu *vcpu);
+   void (*tlbie)(struct kvm_vcpu *vcpu, ulong addr, bool large);
+   int  (*esid_to_vsid)(struct kvm_vcpu *vcpu, u64 esid, u64 *vsid);
+   u64  (*ea_to_vp)(struct kvm_vcpu *vcpu, gva_t eaddr, bool data);
+   bool (*is_dcbz32)(struct kvm_vcpu *vcpu);
+};
+
+struct hpte_cache {
+   u64 host_va;
+   u64 pfn;
+   ulong slot;
+   struct kvmppc_pte pte;
+};
+
 struct kvm_vcpu_arch {
-   u32 host_stack;
+   ulong host_stack;
u32 host_pid;
+#ifdef CONFIG_PPC64
+   ulong host_msr;
+   ulong host_r2;
+   void *host_retip;
+   ulong trampoline_lowmem;
+   ulong trampoline_enter;
+   ulong highmem_handler;
+   ulong host_paca_phys;
+   struct kvmppc_mmu mmu;
+#endif
 
u64 fpr[32];
ulong gpr[32];
@@ -123,6 +180,10 @@ struct kvm_vcpu_arch {
ulong xer;
 
ulong msr;
+#ifdef CONFIG_PPC64
+   ulong shadow_msr;
+   ulong hflags;
+#endif
u32 mmucr;
ulong sprg0;
ulong sprg1;
@@ -149,6 +210,7 @@ struct kvm_vcpu_arch {
u32 ivor[64];
ulong ivpr;
u32 pir;
+   u32 pvr;
 
u32 shadow_pid;
u32 pid;
@@ -174,6 +236,9 @@ struct kvm_vcpu_arch {
 #endif
 
u32 last_inst;
+#ifdef CONFIG_PPC64
+   ulong fault_dsisr;
+#endif
ulong fault_dear;
ulong fault_esr;
gpa_t paddr_accessed;
@@ -186,7 +251,13 @@ struct kvm_vcpu_arch {
u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
 
struct timer_list dec_timer;
+   u64 dec_jiffies;
unsigned long pending_exceptions;
+
+#ifdef CONFIG_PPC64
+   struct hpte_cache hpte_cache[HPTEG_CACHE_NUM];
+   int hpte_cache_offset;
+#endif
 };
 
 #endif /* __POWERPC_KVM_HOST_H__ */
-- 
1.6.0.2



[PATCH 08/27] Add SLB switching code for entry/exit

2009-10-21 Thread Alexander Graf
This is the really low level of guest entry/exit code.

Book3s_64 has an SLB, which stores all ESID -> VSID mappings we're
currently aware of.

The segments in the guest differ from the ones on the host, so we need
to switch the SLB to tell the MMU that we're in a new context.

So we store a shadow of the guest's SLB in the PACA, switch to that on
entry and only restore bolted entries on exit, leaving the rest to the
Linux SLB fault handler.

That way we get a really clean way of switching the SLB.
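
The entry-side fill loop described above can be sketched in C roughly as follows. This is only an illustration: struct shadow_slb_entry and collect_valid_slb() are hypothetical names, and copying an entry to the output array stands in for issuing one slbmte. The valid-bit position is taken from the assembly's "rldicl. r0, rX, 37, 63" test, which isolates bit 27 of the ESID doubleword.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Valid bit of the ESID doubleword (bit 27, as tested by rldicl.). */
#define SLB_ESID_VALID (UINT64_C(1) << 27)

/* One 16-byte shadow SLB entry, laid out as the assembly reads it:
 * ESID doubleword at offset 0, VSID doubleword at offset 8. */
struct shadow_slb_entry {
	uint64_t esid;
	uint64_t vsid;
};

/* Walk the shadow SLB and copy every valid entry to 'out', in order.
 * Each copied entry corresponds to one slbmte in the real entry code.
 * Returns the number of entries that would be loaded. */
static size_t collect_valid_slb(const struct shadow_slb_entry *slb,
				size_t max, struct shadow_slb_entry *out)
{
	size_t loaded = 0;

	assert(slb != NULL && out != NULL);

	for (size_t i = 0; i < max; i++) {
		if (!(slb[i].esid & SLB_ESID_VALID))
			continue;	/* invalid entry: skip, like the beq */
		out[loaded++] = slb[i];
	}
	return loaded;
}
```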

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_64_slb.S |  277 ++
 1 files changed, 277 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_slb.S

diff --git a/arch/powerpc/kvm/book3s_64_slb.S b/arch/powerpc/kvm/book3s_64_slb.S
new file mode 100644
index 000..00a8367
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_slb.S
@@ -0,0 +1,277 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
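+/******************************************************************************
+ *                                                                            *
+ *                               Entry code                                   *
+ *                                                                            *
+ *****************************************************************************/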
+
+.global kvmppc_handler_trampoline_enter
+kvmppc_handler_trampoline_enter:
+
+   /* Required state:
+*
+* MSR = ~IR|DR
+* R13 = PACA
+* R9 = guest IP
+* R10 = guest MSR
+* R11 = free
+* R12 = free
+* PACA[PACA_EXMC + EX_R9] = guest R9
+* PACA[PACA_EXMC + EX_R10] = guest R10
+* PACA[PACA_EXMC + EX_R11] = guest R11
+* PACA[PACA_EXMC + EX_R12] = guest R12
+* PACA[PACA_EXMC + EX_R13] = guest R13
+* PACA[PACA_EXMC + EX_CCR] = guest CR
+* PACA[PACA_EXMC + EX_R3] = guest XER
+*/
+
+   mtsrr0  r9
+   mtsrr1  r10
+
+   mtspr   SPRN_SPRG_SCRATCH0, r0
+
+   /* Remove LPAR shadow entries */
+
+#if SLB_NUM_BOLTED == 3
+
+   ld  r12, PACA_SLBSHADOWPTR(r13)
+   ld  r10, 0x10(r12)
+   ld  r11, 0x18(r12)
+   /* Invalid? Skip. */
+   rldicl. r0, r10, 37, 63
+   beq slb_entry_skip_1
+   xoris   r9, r10, slb_esi...@h
+   std r9, 0x10(r12)
+slb_entry_skip_1:
+   ld  r9, 0x20(r12)
+   /* Invalid? Skip. */
+   rldicl. r0, r9, 37, 63
+   beq slb_entry_skip_2
+   xoris   r9, r9, slb_esi...@h
+   std r9, 0x20(r12)
+slb_entry_skip_2:
+   ld  r9, 0x30(r12)
+   /* Invalid? Skip. */
+   rldicl. r0, r9, 37, 63
+   beq slb_entry_skip_3
+   xoris   r9, r9, slb_esi...@h
+   std r9, 0x30(r12)
+slb_entry_skip_3:
+   
+#else
+#error unknown number of bolted entries
+#endif
+
+   /* Flush SLB */
+
+   slbia
+
+   /* r0 = esid & ESID_MASK */
+   rldicr  r10, r10, 0, 35
+   /* r0 |= CLASS_BIT(VSID) */
+   rldic   r12, r11, 56 - 36, 36
+   or  r10, r10, r12
+   slbie   r10
+
+   isync
+
+   /* Fill SLB with our shadow */
+
+   lbz r12, PACA_KVM_SLB_MAX(r13)
+   mulli   r12, r12, 16
+   addi    r12, r12, PACA_KVM_SLB
+   add r12, r12, r13
+
+   /* for (r11 = kvm_slb; r11 < kvm_slb + kvm_slb_size; r11+=slb_entry) */
+   li  r11, PACA_KVM_SLB
+   add r11, r11, r13
+
+slb_loop_enter:
+
+   ld  r10, 0(r11)
+
+   rldicl. r0, r10, 37, 63
+   beq slb_loop_enter_skip
+
+   ld  r9, 8(r11)
+   slbmte  r9, r10
+
+slb_loop_enter_skip:
+   addi    r11, r11, 16
+   cmpd    cr0, r11, r12
+   blt slb_loop_enter
+
+slb_do_enter:
+
+   /* Enter guest */
+
+   mfspr   r0, SPRN_SPRG_SCRATCH0
+
+   ld  r9, (PACA_EXMC+EX_R9)(r13)
+   ld  r10, (PACA_EXMC+EX_R10)(r13)
+   ld  r12, (PACA_EXMC+EX_R12)(r13)
+
+   lwz r11, (PACA_EXMC+EX_CCR)(r13)
+   mtcr    r11
+
+   ld  r11, (PACA_EXMC+EX_R3)(r13)
+   mtxer   r11
+
+   ld  r11, (PACA_EXMC+EX_R11)(r13)
+   ld  r13, (PACA_EXMC+EX_R13)(r13)
+
+   RFI
+kvmppc_handler_trampoline_enter_end:
+

Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5

2009-10-21 Thread Alexander Graf


On 21.10.2009, at 17:03, Alexander Graf wrote:

> KVM for PowerPC only supports embedded cores at the moment.
>
> While it makes sense to virtualize on small machines, it's even more fun
> to do so on big boxes. So I figured we need KVM for PowerPC64 as well.
>
> This patchset implements KVM support for Book3s_64 hosts and guest support
> for Book3s_64 and G3/G4.
>
> To really make use of this, you also need a recent version of qemu.
>
> Don't want to apply patches? Get the git tree!
>
> $ git clone git://csgraf.de/kvm
> $ git checkout origin/ppc-v4

ppc-v5 of course. Though I'm still trying to get git to actually
serve the correct tree - sigh.


Alex
