KVM Guest keymap issue
Dear list, I have a problem with a Windows XP guest that I connect to via VNC using the sl keymap (option -k sl). The problematic characters are s, c and z with caron: when I type them via VNC, they are not printed at all in the virtual system. I have checked the file /usr/share/kvm/keymaps/sl and it seems to contain different codes than I get when running showkey --ascii on the host machine (Ubuntu 12.04). I have tried changing KVM's keymap file 'sl' to the codes I get from showkey, but the characters are still not printed in the virtual system I am connected to via VNC. I am totally lost with this issue; thanks for your time and ideas. Best regards, Matej -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation
On Thu, 2013-09-19 at 23:19 -0500, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Friday, September 20, 2013 2:38 AM To: Bhushan Bharat-R65777 Cc: b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org; kvm@vger.kernel.org; kvm-...@vger.kernel.org; linuxppc-...@lists.ozlabs.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation This breaks when you have both E500_TLB_BITMAP and E500_TLB_TLB0 set. I do not see any case where we set both E500_TLB_BITMAP and E500_TLB_TLB0. This would happen if you have a guest TLB1 entry that is backed by some 4K pages and some larger pages (e.g. if the guest maps CCSR with one big TLB1 entry and there are varying I/O passthrough regions mapped). It's not common, but it's possible. Also we have not optimized that yet (keeping track of multiple shadow TLB0 entries for one guest TLB1 entry). This is about correctness, not optimization. We use these bit flags only for TLB1; if the size of the stlbe is 4K then we set E500_TLB_TLB0, otherwise we set E500_TLB_BITMAP. Although I think that E500_TLB_BITMAP should be set only if the stlbe size is less than the gtlbe size. Why? Even if there's only one bit set in the map, we need it to keep track of which entry was used. -Scott
[PULL 02/13] cpu: Move cpu state syncs up into cpu_dump_state()
From: James Hogan james.ho...@imgtec.com The x86 and ppc targets call cpu_synchronize_state() from their *_cpu_dump_state() callbacks to ensure that up to date state is dumped when KVM is enabled (for example when a KVM internal error occurs). Move this call up into the generic cpu_dump_state() function so that other KVM targets (namely MIPS) can take advantage of it. This requires kvm_cpu_synchronize_state() and cpu_synchronize_state() to be moved out of the #ifdef NEED_CPU_H in sysemu/kvm.h so that they're accessible to qom/cpu.c. Signed-off-by: James Hogan james.ho...@imgtec.com Cc: Andreas Färber afaer...@suse.de Cc: Alexander Graf ag...@suse.de Cc: Gleb Natapov g...@redhat.com Cc: qemu-...@nongnu.org Cc: kvm@vger.kernel.org Signed-off-by: Gleb Natapov g...@redhat.com --- include/sysemu/kvm.h | 20 ++-- qom/cpu.c | 1 + target-i386/helper.c | 2 -- target-ppc/translate.c | 2 -- 4 files changed, 11 insertions(+), 14 deletions(-) diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index 8e76685..3b0ef46 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -270,16 +270,6 @@ int kvm_check_extension(KVMState *s, unsigned int extension); uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function, uint32_t index, int reg); -void kvm_cpu_synchronize_state(CPUState *cpu); - -/* generic hooks - to be moved/refactored once there are more users */ - -static inline void cpu_synchronize_state(CPUState *cpu) -{ -if (kvm_enabled()) { -kvm_cpu_synchronize_state(cpu); -} -} #if !defined(CONFIG_USER_ONLY) int kvm_physical_memory_addr_from_host(KVMState *s, void *ram_addr, @@ -288,9 +278,19 @@ int kvm_physical_memory_addr_from_host(KVMState *s, void *ram_addr, #endif /* NEED_CPU_H */ +void kvm_cpu_synchronize_state(CPUState *cpu); void kvm_cpu_synchronize_post_reset(CPUState *cpu); void kvm_cpu_synchronize_post_init(CPUState *cpu); +/* generic hooks - to be moved/refactored once there are more users */ + +static inline void cpu_synchronize_state(CPUState 
*cpu) +{ +if (kvm_enabled()) { +kvm_cpu_synchronize_state(cpu); +} +} + static inline void cpu_synchronize_post_reset(CPUState *cpu) { if (kvm_enabled()) { diff --git a/qom/cpu.c b/qom/cpu.c index fa7ec6b..818fb26 100644 --- a/qom/cpu.c +++ b/qom/cpu.c @@ -162,6 +162,7 @@ void cpu_dump_state(CPUState *cpu, FILE *f, fprintf_function cpu_fprintf, CPUClass *cc = CPU_GET_CLASS(cpu); if (cc->dump_state) { +cpu_synchronize_state(cpu); cc->dump_state(cpu, f, cpu_fprintf, flags); } } diff --git a/target-i386/helper.c b/target-i386/helper.c index 7c58e27..0ad7c8e 100644 --- a/target-i386/helper.c +++ b/target-i386/helper.c @@ -188,8 +188,6 @@ void x86_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf, char cc_op_name[32]; static const char *seg_name[6] = { ES, CS, SS, DS, FS, GS }; -cpu_synchronize_state(cs); - eflags = cpu_compute_eflags(env); #ifdef TARGET_X86_64 if (env->hflags & HF_CS64_MASK) { diff --git a/target-ppc/translate.c b/target-ppc/translate.c index 2da7bc7..9c59f69 100644 --- a/target-ppc/translate.c +++ b/target-ppc/translate.c @@ -9536,8 +9536,6 @@ void ppc_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf, CPUPPCState *env = cpu->env; int i; -cpu_synchronize_state(cs); - cpu_fprintf(f, NIP TARGET_FMT_lxLR TARGET_FMT_lx CTR TARGET_FMT_lx XER TARGET_FMT_lx \n, env->nip, env->lr, env->ctr, cpu_read_xer(env)); -- 1.8.3.1
Re: question regarding early ssh with kvm
Hi Stefan Everything looks as expected. Even more weird: when I execute the script 01remote-ssh.sh as stated in http://roosbertl.blogspot.ch/2012/12/centos6-disk-encryption-with-remote.html I can connect with ssh to that machine, so it's not a KVM problem, I guess. Many thanks for your help. Btw.: when running on CentOS and not knowing how to get an initramfs interactive shell, simply append rdshell to the kernel params (grub) and hit Ctrl+C twice at the password prompt for the hdd/LUKS encryption. Regards, Oliver On 19.09.2013 13:40, Stefan Hajnoczi wrote: On Wed, Sep 18, 2013 at 09:44:48PM +0200, Oliver Zemann wrote: I am now able to print some messages like lsmod, ip addr show etc. I loaded virtio_net and virtio_pci, there are also a few more, but eth0 is still unknown to the system. Do I need any other module? Check that the virtio-net PCI adapter is present: $ grep 1af41000 /proc/bus/pci/devices The output should print many fields and end with virtio_pci (the driver that is bound to this device). If you get no output from this grep command then your QEMU command-line does not define a virtio-net PCI device. If you get output but the last field is empty or ? then you are missing virtio kernel modules. This can happen either because you didn't compile them or because the udev device aliases file hasn't been updated to autoload the right kernel module. Stefan
Re: [PATCH 0/3] KVM: Make kvm_lock non-raw
[Re: [PATCH 0/3] KVM: Make kvm_lock non-raw] On 16/09/2013 (Mon 18:12) Paul Gortmaker wrote: On 13-09-16 10:06 AM, Paolo Bonzini wrote: Paul Gortmaker reported a BUG on preempt-rt kernels, due to taking the mmu_lock within the raw kvm_lock in mmu_shrink_scan. He provided a patch that shrunk the kvm_lock critical section so that the mmu_lock critical section does not nest with it, but in the end there is no reason for the vm_list to be protected by a raw spinlock. Only manipulations of kvm_usage_count and the consequent hardware_enable/disable operations are not preemptable. This small series thus splits the kvm_lock in the raw part and the non-raw part. Paul, could you please provide your Tested-by? Sure, I'll go back and see if I can find what triggered it in the original report, and give the patches a spin on 3.4.x-rt (and probably 3.10.x-rt, since that is where rt-current is presently). Seems fine on 3.4-rt. On 3.10.10-rt7 it looks like there are other issues, probably not explicitly related to this patchset (see below). Paul. 
-- e1000e :00:19.0 eth1: removed PHC assign device 0:0:19.0 pci :00:19.0: irq 43 for MSI/MSI-X pci :00:19.0: irq 43 for MSI/MSI-X pci :00:19.0: irq 43 for MSI/MSI-X pci :00:19.0: irq 43 for MSI/MSI-X BUG: sleeping function called from invalid context at /home/paul/git/linux-rt/kernel/rtmutex.c:659 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/0 2 locks held by swapper/0/0: #0: (rcu_read_lock){.+.+.+}, at: [8100998a] kvm_set_irq_inatomic+0x2a/0x4a0 #1: (rcu_read_lock){.+.+.+}, at: [81038800] kvm_irq_delivery_to_apic_fast+0x60/0x3d0 irq event stamp: 6121390 hardirqs last enabled at (6121389): [819f9ae0] restore_args+0x0/0x30 hardirqs last disabled at (6121390): [819f9a2a] common_interrupt+0x6a/0x6f softirqs last enabled at (0): [ (null)] (null) softirqs last disabled at (0): [ (null)] (null) Preemption disabled at:[810ebb9a] cpu_startup_entry+0x1ba/0x430 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.10-rt7 #2 Hardware name: Dell Inc. OptiPlex 990/0VNP2H, BIOS A17 03/14/2013 8201c440 880223603cf0 819f177d 880223603d18 810c90d3 880214a50110 0001 0001 880223603d38 819f89a4 880214a50110 880214a50110 Call Trace: IRQ [819f177d] dump_stack+0x19/0x1b [810c90d3] __might_sleep+0x153/0x250 [819f89a4] rt_spin_lock+0x24/0x60 [810ccdd6] __wake_up+0x36/0x70 [81003bbb] kvm_vcpu_kick+0x3b/0xd0 [810371a2] __apic_accept_irq+0x2b2/0x3a0 [810385f7] kvm_apic_set_irq+0x27/0x30 [8103894e] kvm_irq_delivery_to_apic_fast+0x1ae/0x3d0 [81038800] ? kvm_irq_delivery_to_apic_fast+0x60/0x3d0 [81009a8b] kvm_set_irq_inatomic+0x12b/0x4a0 [8100998a] ? kvm_set_irq_inatomic+0x2a/0x4a0 [8100c5b3] kvm_assigned_dev_msi+0x23/0x40 [8113cb38] handle_irq_event_percpu+0x88/0x3d0 [810ebb7c] ? cpu_startup_entry+0x19c/0x430 [8113cec8] handle_irq_event+0x48/0x70 [8113f9b7] handle_edge_irq+0x77/0x120 [8104c6ae] handle_irq+0x1e/0x30 [81a035ca] do_IRQ+0x5a/0xd0 [819f9a2f] common_interrupt+0x6f/0x6f EOI [819f9ae0] ? retint_restore_args+0xe/0xe [810ebb7c] ? cpu_startup_entry+0x19c/0x430 [810ebb38] ? 
cpu_startup_entry+0x158/0x430 [819db767] rest_init+0x137/0x140 [819db635] ? rest_init+0x5/0x140 [822fde18] start_kernel+0x3af/0x3bc [822fd870] ? repair_env_string+0x5e/0x5e [822fd5a5] x86_64_start_reservations+0x2a/0x2c [822fd673] x86_64_start_kernel+0xcc/0xcf = [ INFO: inconsistent lock state ] 3.10.10-rt7 #2 Not tainted - inconsistent {HARDIRQ-ON-W} - {IN-HARDIRQ-W} usage. swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes: (((q-lock)-lock)-wait_lock){?.+.-.}, at: [819f7e98] rt_spin_lock_slowlock+0x48/0x370 {HARDIRQ-ON-W} state was registered at: [810fc94d] __lock_acquire+0x69d/0x20e0 [810feaee] lock_acquire+0x9e/0x1f0 [819f9090] _raw_spin_lock+0x40/0x80 [819f7e98] rt_spin_lock_slowlock+0x48/0x370 [819f89ac] rt_spin_lock+0x2c/0x60 [810ccdd6] __wake_up+0x36/0x70 [8109c5ce] run_timer_softirq+0x1be/0x390 [81092a09] do_current_softirqs+0x239/0x5b0 [81092db8] run_ksoftirqd+0x38/0x60 [810c5d7c] smpboot_thread_fn+0x22c/0x340 [810bbf4d] kthread+0xcd/0xe0 [81a019dc] ret_from_fork+0x7c/0xb0 irq event stamp: 6121390 hardirqs last enabled at (6121389): [819f9ae0] restore_args+0x0/0x30 hardirqs last disabled at (6121390): [819f9a2a] common_interrupt+0x6a/0x6f softirqs last enabled at (0): [ (null)] (null) softirqs last disabled at (0): [
RE: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation
-Original Message- From: Wood Scott-B07421 Sent: Friday, September 20, 2013 11:38 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org; kvm@vger.kernel.org; kvm-...@vger.kernel.org; linuxppc- d...@lists.ozlabs.org Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation On Fri, 2013-09-20 at 13:04 -0500, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Friday, September 20, 2013 9:48 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org; kvm@vger.kernel.org; kvm-...@vger.kernel.org; linuxppc- d...@lists.ozlabs.org Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation On Thu, 2013-09-19 at 23:19 -0500, Bhushan Bharat-R65777 wrote: We use these bit flags only for TLB1; if the size of the stlbe is 4K then we set E500_TLB_TLB0, otherwise we set E500_TLB_BITMAP. Although I think that E500_TLB_BITMAP should be set only if the stlbe size is less than the gtlbe size. Why? Even if there's only one bit set in the map, we need it to keep track of which entry was used. If there is one entry then will not it be simpler/faster to not look up the bitmap and guest-host array? A flag indicating it is a 1:1 map and this is the physical address. The difference would be negligible, and you'd have the added overhead (both runtime and complexity) of making this a special case. Maybe you are right, I will see if I can give it a try :) BTW I have already sent v6 of this patch. -Bharat -Scott
RE: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation
-Original Message- From: Wood Scott-B07421 Sent: Friday, September 20, 2013 9:48 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org; kvm@vger.kernel.org; kvm-...@vger.kernel.org; linuxppc- d...@lists.ozlabs.org Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation On Thu, 2013-09-19 at 23:19 -0500, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Friday, September 20, 2013 2:38 AM To: Bhushan Bharat-R65777 Cc: b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org; kvm@vger.kernel.org; kvm-...@vger.kernel.org; linuxppc-...@lists.ozlabs.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation This breaks when you have both E500_TLB_BITMAP and E500_TLB_TLB0 set. I do not see any case where we set both E500_TLB_BITMAP and E500_TLB_TLB0. This would happen if you have a guest TLB1 entry that is backed by some 4K pages and some larger pages (e.g. if the guest maps CCSR with one big TLB1 entry and there are varying I/O passthrough regions mapped). It's not common, but it's possible. Agree Also we have not optimized that yet (keeping track of multiple shadow TLB0 entries for one guest TLB1 entry). This is about correctness, not optimization. We use these bit flags only for TLB1; if the size of the stlbe is 4K then we set E500_TLB_TLB0, otherwise we set E500_TLB_BITMAP. Although I think that E500_TLB_BITMAP should be set only if the stlbe size is less than the gtlbe size. Why? Even if there's only one bit set in the map, we need it to keep track of which entry was used. If there is one entry then will not it be simpler/faster to not look up the bitmap and guest-host array? A flag indicating it is a 1:1 map and this is the physical address. -Bharat -Scott
Re: [PATCH 0/3] KVM: Make kvm_lock non-raw
On 13-09-20 02:04 PM, Jan Kiszka wrote: On 2013-09-20 19:51, Paul Gortmaker wrote: [Re: [PATCH 0/3] KVM: Make kvm_lock non-raw] On 16/09/2013 (Mon 18:12) Paul Gortmaker wrote: On 13-09-16 10:06 AM, Paolo Bonzini wrote: Paul Gortmaker reported a BUG on preempt-rt kernels, due to taking the mmu_lock within the raw kvm_lock in mmu_shrink_scan. He provided a patch that shrunk the kvm_lock critical section so that the mmu_lock critical section does not nest with it, but in the end there is no reason for the vm_list to be protected by a raw spinlock. Only manipulations of kvm_usage_count and the consequent hardware_enable/disable operations are not preemptable. This small series thus splits the kvm_lock in the raw part and the non-raw part. Paul, could you please provide your Tested-by? Sure, I'll go back and see if I can find what triggered it in the original report, and give the patches a spin on 3.4.x-rt (and probably 3.10.x-rt, since that is where rt-current is presently). Seems fine on 3.4-rt. On 3.10.10-rt7 it looks like there are other issues, probably not explicitly related to this patchset (see below). Paul. 
-- e1000e :00:19.0 eth1: removed PHC assign device 0:0:19.0 pci :00:19.0: irq 43 for MSI/MSI-X pci :00:19.0: irq 43 for MSI/MSI-X pci :00:19.0: irq 43 for MSI/MSI-X pci :00:19.0: irq 43 for MSI/MSI-X BUG: sleeping function called from invalid context at /home/paul/git/linux-rt/kernel/rtmutex.c:659 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/0 2 locks held by swapper/0/0: #0: (rcu_read_lock){.+.+.+}, at: [8100998a] kvm_set_irq_inatomic+0x2a/0x4a0 #1: (rcu_read_lock){.+.+.+}, at: [81038800] kvm_irq_delivery_to_apic_fast+0x60/0x3d0 irq event stamp: 6121390 hardirqs last enabled at (6121389): [819f9ae0] restore_args+0x0/0x30 hardirqs last disabled at (6121390): [819f9a2a] common_interrupt+0x6a/0x6f softirqs last enabled at (0): [ (null)] (null) softirqs last disabled at (0): [ (null)] (null) Preemption disabled at:[810ebb9a] cpu_startup_entry+0x1ba/0x430 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.10-rt7 #2 Hardware name: Dell Inc. OptiPlex 990/0VNP2H, BIOS A17 03/14/2013 8201c440 880223603cf0 819f177d 880223603d18 810c90d3 880214a50110 0001 0001 880223603d38 819f89a4 880214a50110 880214a50110 Call Trace: IRQ [819f177d] dump_stack+0x19/0x1b [810c90d3] __might_sleep+0x153/0x250 [819f89a4] rt_spin_lock+0x24/0x60 [810ccdd6] __wake_up+0x36/0x70 [81003bbb] kvm_vcpu_kick+0x3b/0xd0 -rt lacks an atomic waitqueue for triggering VCPU wakeups on MSIs from assigned devices directly from the host IRQ handler. We need to disable this fast-path in -rt or introduce such an abstraction (I did this once over 2.6.33-rt). Ah, right -- the simple wait queue support (currently -rt specific) would have to be used here. It is on the todo list to get that moved from -rt into mainline. Paul. -- IIRC, VFIO goes the slower patch via a kernel thread unconditionally, thus cannot trigger this. Only legacy device assignment is affected. Jan [810371a2] __apic_accept_irq+0x2b2/0x3a0 [810385f7] kvm_apic_set_irq+0x27/0x30 [8103894e] kvm_irq_delivery_to_apic_fast+0x1ae/0x3d0 [81038800] ? 
kvm_irq_delivery_to_apic_fast+0x60/0x3d0 [81009a8b] kvm_set_irq_inatomic+0x12b/0x4a0 [8100998a] ? kvm_set_irq_inatomic+0x2a/0x4a0 [8100c5b3] kvm_assigned_dev_msi+0x23/0x40 [8113cb38] handle_irq_event_percpu+0x88/0x3d0 [810ebb7c] ? cpu_startup_entry+0x19c/0x430 [8113cec8] handle_irq_event+0x48/0x70 [8113f9b7] handle_edge_irq+0x77/0x120 [8104c6ae] handle_irq+0x1e/0x30 [81a035ca] do_IRQ+0x5a/0xd0 [819f9a2f] common_interrupt+0x6f/0x6f EOI [819f9ae0] ? retint_restore_args+0xe/0xe [810ebb7c] ? cpu_startup_entry+0x19c/0x430 [810ebb38] ? cpu_startup_entry+0x158/0x430 [819db767] rest_init+0x137/0x140 [819db635] ? rest_init+0x5/0x140 [822fde18] start_kernel+0x3af/0x3bc [822fd870] ? repair_env_string+0x5e/0x5e [822fd5a5] x86_64_start_reservations+0x2a/0x2c [822fd673] x86_64_start_kernel+0xcc/0xcf = [ INFO: inconsistent lock state ] 3.10.10-rt7 #2 Not tainted - inconsistent {HARDIRQ-ON-W} - {IN-HARDIRQ-W} usage. swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes: (((q-lock)-lock)-wait_lock){?.+.-.}, at: [819f7e98] rt_spin_lock_slowlock+0x48/0x370 {HARDIRQ-ON-W} state was registered at: [810fc94d] __lock_acquire+0x69d/0x20e0 [810feaee] lock_acquire+0x9e/0x1f0 [819f9090] _raw_spin_lock+0x40/0x80
Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation
On Fri, 2013-09-20 at 13:04 -0500, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Friday, September 20, 2013 9:48 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org; kvm@vger.kernel.org; kvm-...@vger.kernel.org; linuxppc- d...@lists.ozlabs.org Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation On Thu, 2013-09-19 at 23:19 -0500, Bhushan Bharat-R65777 wrote: We use these bit flags only for TLB1; if the size of the stlbe is 4K then we set E500_TLB_TLB0, otherwise we set E500_TLB_BITMAP. Although I think that E500_TLB_BITMAP should be set only if the stlbe size is less than the gtlbe size. Why? Even if there's only one bit set in the map, we need it to keep track of which entry was used. If there is one entry then will not it be simpler/faster to not look up the bitmap and guest-host array? A flag indicating it is a 1:1 map and this is the physical address. The difference would be negligible, and you'd have the added overhead (both runtime and complexity) of making this a special case. -Scott
Re: [PATCH 0/3] KVM: Make kvm_lock non-raw
On 2013-09-20 20:18, Paul Gortmaker wrote: On 13-09-20 02:04 PM, Jan Kiszka wrote: On 2013-09-20 19:51, Paul Gortmaker wrote: [Re: [PATCH 0/3] KVM: Make kvm_lock non-raw] On 16/09/2013 (Mon 18:12) Paul Gortmaker wrote: On 13-09-16 10:06 AM, Paolo Bonzini wrote: Paul Gortmaker reported a BUG on preempt-rt kernels, due to taking the mmu_lock within the raw kvm_lock in mmu_shrink_scan. He provided a patch that shrunk the kvm_lock critical section so that the mmu_lock critical section does not nest with it, but in the end there is no reason for the vm_list to be protected by a raw spinlock. Only manipulations of kvm_usage_count and the consequent hardware_enable/disable operations are not preemptable. This small series thus splits the kvm_lock in the raw part and the non-raw part. Paul, could you please provide your Tested-by? Sure, I'll go back and see if I can find what triggered it in the original report, and give the patches a spin on 3.4.x-rt (and probably 3.10.x-rt, since that is where rt-current is presently). Seems fine on 3.4-rt. On 3.10.10-rt7 it looks like there are other issues, probably not explicitly related to this patchset (see below). Paul. 
-- e1000e :00:19.0 eth1: removed PHC assign device 0:0:19.0 pci :00:19.0: irq 43 for MSI/MSI-X pci :00:19.0: irq 43 for MSI/MSI-X pci :00:19.0: irq 43 for MSI/MSI-X pci :00:19.0: irq 43 for MSI/MSI-X BUG: sleeping function called from invalid context at /home/paul/git/linux-rt/kernel/rtmutex.c:659 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/0 2 locks held by swapper/0/0: #0: (rcu_read_lock){.+.+.+}, at: [8100998a] kvm_set_irq_inatomic+0x2a/0x4a0 #1: (rcu_read_lock){.+.+.+}, at: [81038800] kvm_irq_delivery_to_apic_fast+0x60/0x3d0 irq event stamp: 6121390 hardirqs last enabled at (6121389): [819f9ae0] restore_args+0x0/0x30 hardirqs last disabled at (6121390): [819f9a2a] common_interrupt+0x6a/0x6f softirqs last enabled at (0): [ (null)] (null) softirqs last disabled at (0): [ (null)] (null) Preemption disabled at:[810ebb9a] cpu_startup_entry+0x1ba/0x430 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.10-rt7 #2 Hardware name: Dell Inc. OptiPlex 990/0VNP2H, BIOS A17 03/14/2013 8201c440 880223603cf0 819f177d 880223603d18 810c90d3 880214a50110 0001 0001 880223603d38 819f89a4 880214a50110 880214a50110 Call Trace: IRQ [819f177d] dump_stack+0x19/0x1b [810c90d3] __might_sleep+0x153/0x250 [819f89a4] rt_spin_lock+0x24/0x60 [810ccdd6] __wake_up+0x36/0x70 [81003bbb] kvm_vcpu_kick+0x3b/0xd0 -rt lacks an atomic waitqueue for triggering VCPU wakeups on MSIs from assigned devices directly from the host IRQ handler. We need to disable this fast-path in -rt or introduce such an abstraction (I did this once over 2.6.33-rt). Ah, right -- the simple wait queue support (currently -rt specific) would have to be used here. It is on the todo list to get that moved from -rt into mainline. Oh, it's there in -rt already - perfect! If there is a good reason for upstream, kvm can switch of course. 
Jan -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux
Re: [PATCH 0/3] KVM: Make kvm_lock non-raw
On 2013-09-20 19:51, Paul Gortmaker wrote: [Re: [PATCH 0/3] KVM: Make kvm_lock non-raw] On 16/09/2013 (Mon 18:12) Paul Gortmaker wrote: On 13-09-16 10:06 AM, Paolo Bonzini wrote: Paul Gortmaker reported a BUG on preempt-rt kernels, due to taking the mmu_lock within the raw kvm_lock in mmu_shrink_scan. He provided a patch that shrunk the kvm_lock critical section so that the mmu_lock critical section does not nest with it, but in the end there is no reason for the vm_list to be protected by a raw spinlock. Only manipulations of kvm_usage_count and the consequent hardware_enable/disable operations are not preemptable. This small series thus splits the kvm_lock in the raw part and the non-raw part. Paul, could you please provide your Tested-by? Sure, I'll go back and see if I can find what triggered it in the original report, and give the patches a spin on 3.4.x-rt (and probably 3.10.x-rt, since that is where rt-current is presently). Seems fine on 3.4-rt. On 3.10.10-rt7 it looks like there are other issues, probably not explicitly related to this patchset (see below). Paul. 
-- e1000e :00:19.0 eth1: removed PHC assign device 0:0:19.0 pci :00:19.0: irq 43 for MSI/MSI-X pci :00:19.0: irq 43 for MSI/MSI-X pci :00:19.0: irq 43 for MSI/MSI-X pci :00:19.0: irq 43 for MSI/MSI-X BUG: sleeping function called from invalid context at /home/paul/git/linux-rt/kernel/rtmutex.c:659 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/0 2 locks held by swapper/0/0: #0: (rcu_read_lock){.+.+.+}, at: [8100998a] kvm_set_irq_inatomic+0x2a/0x4a0 #1: (rcu_read_lock){.+.+.+}, at: [81038800] kvm_irq_delivery_to_apic_fast+0x60/0x3d0 irq event stamp: 6121390 hardirqs last enabled at (6121389): [819f9ae0] restore_args+0x0/0x30 hardirqs last disabled at (6121390): [819f9a2a] common_interrupt+0x6a/0x6f softirqs last enabled at (0): [ (null)] (null) softirqs last disabled at (0): [ (null)] (null) Preemption disabled at:[810ebb9a] cpu_startup_entry+0x1ba/0x430 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.10-rt7 #2 Hardware name: Dell Inc. OptiPlex 990/0VNP2H, BIOS A17 03/14/2013 8201c440 880223603cf0 819f177d 880223603d18 810c90d3 880214a50110 0001 0001 880223603d38 819f89a4 880214a50110 880214a50110 Call Trace: IRQ [819f177d] dump_stack+0x19/0x1b [810c90d3] __might_sleep+0x153/0x250 [819f89a4] rt_spin_lock+0x24/0x60 [810ccdd6] __wake_up+0x36/0x70 [81003bbb] kvm_vcpu_kick+0x3b/0xd0 -rt lacks an atomic waitqueue for triggering VCPU wakeups on MSIs from assigned devices directly from the host IRQ handler. We need to disable this fast-path in -rt or introduce such an abstraction (I did this once over 2.6.33-rt). IIRC, VFIO goes the slower patch via a kernel thread unconditionally, thus cannot trigger this. Only legacy device assignment is affected. Jan [810371a2] __apic_accept_irq+0x2b2/0x3a0 [810385f7] kvm_apic_set_irq+0x27/0x30 [8103894e] kvm_irq_delivery_to_apic_fast+0x1ae/0x3d0 [81038800] ? kvm_irq_delivery_to_apic_fast+0x60/0x3d0 [81009a8b] kvm_set_irq_inatomic+0x12b/0x4a0 [8100998a] ? 
kvm_set_irq_inatomic+0x2a/0x4a0 [8100c5b3] kvm_assigned_dev_msi+0x23/0x40 [8113cb38] handle_irq_event_percpu+0x88/0x3d0 [810ebb7c] ? cpu_startup_entry+0x19c/0x430 [8113cec8] handle_irq_event+0x48/0x70 [8113f9b7] handle_edge_irq+0x77/0x120 [8104c6ae] handle_irq+0x1e/0x30 [81a035ca] do_IRQ+0x5a/0xd0 [819f9a2f] common_interrupt+0x6f/0x6f EOI [819f9ae0] ? retint_restore_args+0xe/0xe [810ebb7c] ? cpu_startup_entry+0x19c/0x430 [810ebb38] ? cpu_startup_entry+0x158/0x430 [819db767] rest_init+0x137/0x140 [819db635] ? rest_init+0x5/0x140 [822fde18] start_kernel+0x3af/0x3bc [822fd870] ? repair_env_string+0x5e/0x5e [822fd5a5] x86_64_start_reservations+0x2a/0x2c [822fd673] x86_64_start_kernel+0xcc/0xcf = [ INFO: inconsistent lock state ] 3.10.10-rt7 #2 Not tainted - inconsistent {HARDIRQ-ON-W} - {IN-HARDIRQ-W} usage. swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes: (((q-lock)-lock)-wait_lock){?.+.-.}, at: [819f7e98] rt_spin_lock_slowlock+0x48/0x370 {HARDIRQ-ON-W} state was registered at: [810fc94d] __lock_acquire+0x69d/0x20e0 [810feaee] lock_acquire+0x9e/0x1f0 [819f9090] _raw_spin_lock+0x40/0x80 [819f7e98] rt_spin_lock_slowlock+0x48/0x370 [819f89ac] rt_spin_lock+0x2c/0x60 [810ccdd6] __wake_up+0x36/0x70 [8109c5ce] run_timer_softirq+0x1be/0x390 [81092a09]
Re: [PATCH 5/6 v6] kvm: booke: clear host tlb reference flag on guest tlb invalidation
On Fri, 2013-09-20 at 09:55 +0530, Bharat Bhushan wrote: On booke, struct tlbe_ref contains host tlb mapping information (pfn: for guest-pfn to pfn, flags: attributes associated with this mapping) for a guest tlb entry. So when a guest creates a TLB entry, struct tlbe_ref is set to point to a valid pfn and the attributes are set in the flags field of the above said structure. When a guest TLB entry is invalidated, the flags field of the corresponding struct tlbe_ref is updated to indicate that it is no longer valid, and we also selectively clear some other attribute bits: for example, if E500_TLB_BITMAP was set then we clear E500_TLB_BITMAP, and if E500_TLB_TLB0 is set then we clear that. Ideally we should clear the complete flags, as this entry is invalid and has nothing to be re-used. The other part of the problem is that when we use the same entry again, we also do not clear it (we started OR-ing into it, etc.). So far this has worked because the selective clearing mentioned above happens to clear exactly the flags that were set during TLB mapping. But the problem starts when we add more attributes, because then we would need to selectively clear them too, which should not be needed. With this patch we do both - Clear flags when invalidating; - Clear flags when reusing the same entry later Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v5-v6 - Fix flag clearing comment The changes between v5 and v6 are not just about comments... arch/powerpc/kvm/e500_mmu_host.c | 16 1 files changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..7370e1c 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -230,15 +230,15 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel, ref->flags &= ~(E500_TLB_TLB0 | E500_TLB_VALID); } - /* Already invalidated in between */ - if (!(ref->flags & E500_TLB_VALID)) - return; - - /* Guest tlbe is backed by at most one host tlbe per shadow pid. 
*/ - kvmppc_e500_tlbil_one(vcpu_e500, gtlbe); + /* + * Check whether TLB entry is already invalidated in between + * Guest tlbe is backed by at most one host tlbe per shadow pid. + */ + if (ref->flags & E500_TLB_VALID) + kvmppc_e500_tlbil_one(vcpu_e500, gtlbe); I'd phrase this combined comment as "If it's still valid, it's a TLB0 entry, and thus backed by at most one host tlbe per shadow pid." Otherwise looks good. -Scott
[PATCH] KVM: PPC: Book3S HV: Fix typo in saving DSCR
This fixes a typo in the code that saves the guest DSCR (Data Stream Control Register) into the kvm_vcpu_arch struct on guest exit. The effect of the typo was that the DSCR value was saved in the wrong place, so changes to the DSCR by the guest didn't persist across guest exit and entry, and some host kernel memory got corrupted. Cc: sta...@vger.kernel.org [v3.1+] Signed-off-by: Paul Mackerras pau...@samba.org --- Please send this upstream to Linus for inclusion in 3.12. arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index 8e0f28f..852e694 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -1190,7 +1190,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206) BEGIN_FTR_SECTION mfspr r8, SPRN_DSCR ld r7, HSTATE_DSCR(r13) - std r8, VCPU_DSCR(r7) + std r8, VCPU_DSCR(r9) mtspr SPRN_DSCR, r7 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206) -- 1.8.4.rc3
Re: [PATCH 04/18] KVM: PPC: Book3S HV: Support POWER6 compatibility mode on POWER7
On Fri, Sep 20, 2013 at 02:52:40PM +1000, Paul Mackerras wrote: @@ -536,6 +536,9 @@ struct kvm_get_htab_header { #define KVM_REG_PPC_LPCR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xb5) #define KVM_REG_PPC_PPR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xb6) +/* Architecture compatibility level */ +#define KVM_REG_PPC_ARCH_COMPAT (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xb6) Clearly this was meant to be b7, not b6. Will repost. Paul.
[PATCH v2 04/18] KVM: PPC: Book3S HV: Support POWER6 compatibility mode on POWER7
This enables us to use the Processor Compatibility Register (PCR) on POWER7 to put the processor into architecture 2.05 compatibility mode when running a guest. In this mode the new instructions and registers that were introduced on POWER7 are disabled in user mode. This includes all the VSX facilities plus several other instructions such as ldbrx, stdbrx, popcntw, popcntd, etc. To select this mode, we have a new register accessible through the set/get_one_reg interface, called KVM_REG_PPC_ARCH_COMPAT. Setting this to zero gives the full set of capabilities of the processor. Setting it to one of the logical PVR values defined in PAPR puts the vcpu into the compatibility mode for the corresponding architecture level. The supported values are: 0x0f000002 Architecture 2.05 (POWER6) 0x0f000003 Architecture 2.06 (POWER7) 0x0f100003 Architecture 2.06+ (POWER7+) Since the PCR is per-core, the architecture compatibility level and the corresponding PCR value are stored in the struct kvmppc_vcore, and are therefore shared between all vcpus in a virtual core. Signed-off-by: Paul Mackerras pau...@samba.org --- v2: Use correct value for one_reg identifier Documentation/virtual/kvm/api.txt | 1 + arch/powerpc/include/asm/kvm_host.h | 2 ++ arch/powerpc/include/asm/reg.h | 11 +++ arch/powerpc/include/uapi/asm/kvm.h | 3 +++ arch/powerpc/kernel/asm-offsets.c | 1 + arch/powerpc/kvm/book3s_hv.c | 35 + arch/powerpc/kvm/book3s_hv_rmhandlers.S | 16 +-- 7 files changed, 67 insertions(+), 2 deletions(-) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 34a32b6..f1f300f 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1837,6 +1837,7 @@ registers, find a list below: PPC | KVM_REG_PPC_VRSAVE | 32 PPC | KVM_REG_PPC_LPCR | 64 PPC | KVM_REG_PPC_PPR | 64 + PPC | KVM_REG_PPC_ARCH_COMPAT | 32 PPC | KVM_REG_PPC_TM_GPR0 | 64 ... 
PPC | KVM_REG_PPC_TM_GPR31 | 64 diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 8bd730c..82daa12 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -296,6 +296,8 @@ struct kvmppc_vcore { struct kvm_vcpu *runner; u64 tb_offset; /* guest timebase - host timebase */ ulong lpcr; + u32 arch_compat; + ulong pcr; }; #define VCORE_ENTRY_COUNT(vc) ((vc)->entry_exit_count & 0xff) diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index ed98ebf..1afa20c 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -305,6 +305,10 @@ #define LPID_RSVD 0x3ff /* Reserved LPID for partn switching */ #define SPRN_HMER 0x150 /* Hardware m? error recovery */ #define SPRN_HMEER 0x151 /* Hardware m? enable error recovery */ +#define SPRN_PCR 0x152 /* Processor compatibility register */ +#define PCR_VEC_DIS (1ul << (63-0)) /* Vec. disable (pre POWER8) */ +#define PCR_VSX_DIS (1ul << (63-1)) /* VSX disable (pre POWER8) */ +#define PCR_ARCH_205 0x2 /* Architecture 2.05 */ #define SPRN_HEIR 0x153 /* Hypervisor Emulated Instruction Register */ #define SPRN_TLBINDEXR 0x154 /* P7 TLB control register */ #define SPRN_TLBVPNR 0x155 /* P7 TLB control register */ @@ -1096,6 +1100,13 @@ #define PVR_BE 0x0070 #define PVR_PA6T 0x0090 +/* Logical PVR values defined in PAPR, representing architecture levels */ +#define PVR_ARCH_204 0x0f000001 +#define PVR_ARCH_205 0x0f000002 +#define PVR_ARCH_206 0x0f000003 +#define PVR_ARCH_206p 0x0f100003 +#define PVR_ARCH_207 0x0f000004 + /* Macros for setting and retrieving special purpose registers */ #ifndef __ASSEMBLY__ #define mfmsr() ({unsigned long rval; \ diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h index fab6bc1..62c4323 100644 --- a/arch/powerpc/include/uapi/asm/kvm.h +++ b/arch/powerpc/include/uapi/asm/kvm.h @@ -536,6 +536,9 @@ struct kvm_get_htab_header { #define KVM_REG_PPC_LPCR (KVM_REG_PPC | KVM_REG_SIZE_U32 
| 0xb5) #define KVM_REG_PPC_PPR(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xb6) +/* Architecture compatibility level */ +#define KVM_REG_PPC_ARCH_COMPAT(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xb7) + /* Transactional Memory checkpointed state: * This is all GPRs, all VSX regs and a subset of SPRs */ diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 830193b..7f717f2 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -523,6 +523,7 @@ int main(void) DEFINE(VCORE_NAPPING_THREADS, offsetof(struct kvmppc_vcore, napping_threads)); DEFINE(VCORE_TB_OFFSET,
RE: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation
-Original Message- From: Wood Scott-B07421 Sent: Friday, September 20, 2013 11:38 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org; k...@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-d...@lists.ozlabs.org Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation On Fri, 2013-09-20 at 13:04 -0500, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Friday, September 20, 2013 9:48 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org; k...@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-d...@lists.ozlabs.org Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation On Thu, 2013-09-19 at 23:19 -0500, Bhushan Bharat-R65777 wrote: We use these bit flags only for TLB1, and if the size of the stlbe is 4K then we set E500_TLB_TLB0, otherwise we set E500_TLB_BITMAP. Although I think that E500_TLB_BITMAP should be set only if the stlbe size is less than the gtlbe size. Why? Even if there's only one bit set in the map, we need it to keep track of which entry was used. If there is only one entry, would it not be simpler/faster to skip the bitmap and guest-to-host array lookup? A flag could indicate that it is a 1:1 map and that this is the physical address. The difference would be negligible, and you'd have added overhead (both runtime and complexity) of making this a special case. Maybe you are right, I will see if I can give it a try :) BTW I have already sent v6 of this patch. -Bharat -Scott
RE: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation
-Original Message- From: Wood Scott-B07421 Sent: Friday, September 20, 2013 9:48 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org; k...@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-d...@lists.ozlabs.org Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation On Thu, 2013-09-19 at 23:19 -0500, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Friday, September 20, 2013 2:38 AM To: Bhushan Bharat-R65777 Cc: b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org; k...@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-...@lists.ozlabs.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation This breaks when you have both E500_TLB_BITMAP and E500_TLB_TLB0 set. I do not see any case where we set both E500_TLB_BITMAP and E500_TLB_TLB0. This would happen if you have a guest TLB1 entry that is backed by some 4K pages and some larger pages (e.g. if the guest maps CCSR with one big TLB1 and there are varying I/O passthrough regions mapped). It's not common, but it's possible. Agreed. Also we have not optimized that yet (keeping track of multiple shadow TLB0 entries for one guest TLB1 entry) This is about correctness, not optimization. We use these bit flags only for TLB1, and if the size of the stlbe is 4K then we set E500_TLB_TLB0, otherwise we set E500_TLB_BITMAP. Although I think that E500_TLB_BITMAP should be set only if the stlbe size is less than the gtlbe size. Why? Even if there's only one bit set in the map, we need it to keep track of which entry was used. If there is only one entry, would it not be simpler/faster to skip the bitmap and guest-to-host array lookup? A flag could indicate that it is a 1:1 map and that this is the physical address. -Bharat -Scott
Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation
On Fri, 2013-09-20 at 13:04 -0500, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Friday, September 20, 2013 9:48 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org; k...@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-d...@lists.ozlabs.org Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation On Thu, 2013-09-19 at 23:19 -0500, Bhushan Bharat-R65777 wrote: We use these bit flags only for TLB1, and if the size of the stlbe is 4K then we set E500_TLB_TLB0, otherwise we set E500_TLB_BITMAP. Although I think that E500_TLB_BITMAP should be set only if the stlbe size is less than the gtlbe size. Why? Even if there's only one bit set in the map, we need it to keep track of which entry was used. If there is only one entry, would it not be simpler/faster to skip the bitmap and guest-to-host array lookup? A flag could indicate that it is a 1:1 map and that this is the physical address. The difference would be negligible, and you'd have added overhead (both runtime and complexity) of making this a special case. -Scott