Re: error: building kvm for powerpc using cross compiler on x86
On Thu, Jan 6, 2011 at 10:39 AM, Dushyant Bansal cs5070...@cse.iitd.ac.in wrote: I want to build kvm kernel module for powerpc using cross compiler. I downloaded kernel source code from http://www.linux-kvm.org/page/PowerPC_440_Host_Kernel. Inside that kvm_source folder when I do make ARCH=powerpc menuconfig, inside virtualization menu, option kvm is not present. I have attached the screenshot. It's impossible to say from your description, but I would guess that you haven't configured for a 440 processor before visiting the Virtualization menu. -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: re-writing on powerpc
On 12/14/2010 12:48 AM, Avi Kivity wrote: On 12/13/2010 07:17 PM, Hollis Blanchard wrote: Rewriting is dangerous if the guest is unaware of it. As soon as it is made aware of it, it might as well actually do it in the best way that suits it. Can you list some examples of dangerous scenarios? Perhaps I should rephrase... any real-world dangerous scenarios? :) I was hoping you could share some traps you've hit with Linux or Windows on x86. - guest checksums own kernel pages For runtime intrusion detection? Such guests can simply not ask the hypervisor to enable the rewriting feature. - clever compiler reuses code for constant pool Not sure what you mean here. Anyways I think clever compilers are irrelevant, since a compiler will not ordinarily emit a supervisor-mode instruction. The hypervisor has no need to patch normal user-mode instructions. - guest patches itself (a la linux alternatives), surprised when it sees a different instruction PowerPC Linux does patch itself, which is a write-only operation. - guest jits own kernel code (like Singularity), gets confused when it reads back something it didn't write This is getting really hypothetical, but why would a JIT need to read the generated code? Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] kvm/e500v2: MMU optimization
On 09/09/2010 04:03 AM, Liu Yu-B13201 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Hollis Blanchard Sent: Thursday, September 09, 2010 12:07 AM To: Liu Yu-B13201 Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; ag...@suse.de Subject: Re: [PATCH 0/2] kvm/e500v2: MMU optimization On 09/08/2010 02:40 AM, Liu Yu wrote: The patchset aims at mapping guest TLB1 to host TLB0. And it includes: [PATCH 1/2] kvm/e500v2: Remove shadow tlb [PATCH 2/2] kvm/e500v2: mapping guest TLB1 to host TLB0 The reason we need patch 1 is because patch 1 make things simple and flexible. Only applying patch 1 aslo make kvm work. I've always thought the best long-term optimization on these cores is to share in the host PID allocation (i.e. __init_new_context()). This way, the TID in guest mappings would not overlap the TID in host mappings, and guest mappings could be demand-faulted rather than swapped wholesale. To do that, you would need to track the host PID in KVM data structures, I guess in the tlbe_ref structure. Hi Hollis, Guest uses AS=1 and host uses AS=0, so even guest uses the same TID with host, they're in different space. Then why guest needs to care about host TID? You're absolutely right, but this makes a couple key assumptions: 1. The guest doesn't try to use AS=1. This is already false in Linux, because the udbg code uses an AS=1 mapping for the UART, but this can be configured out (with a small loss in functionality). In non-Linux guests the AS=0 restriction could be onerous. 2. A Book E MMU. If we participate in the host MMU context allocation, the guest - host address space code could be generalized to include e.g. an e600 guest on an e500 host, or vice versa. So you're right that optimization is not the right word; this would be more of a functionality and design improvement. (In fact, I suppose it could reduce performance on Book E, since AS=1 space actually *is* unused by the host. I think it would be worth finding out.) Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] kvm/e500v2: Remove shadow tlb
On 09/09/2010 04:16 AM, Liu Yu-B13201 wrote: Yes, it's hard to resume TLB0. We only resume TLB1 in previous code. But TLB1 is even more smaller (13 free entries) than 440, So that it still has little possibility to get hit. thus the resumption is useless. The only reason hits are unlikely in TLB1 is because you still don't have large page support in the host. Once you have that, you can use TLB1 for large guest mappings, and it will become extremely likely that you get hits in TLB1. This is true even if the guest wants 256MB but the host supports only e.g. 16MB large pages, and must split the guest mapping into multiple large host pages. When will you have hugetlbfs for e500? That's going to make such a dramatic difference, I'm not sure it's worth investing time in optimizing the MMU code until then. Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] kvm/e500v2: Remove shadow tlb
On 09/09/2010 04:26 PM, Alexander Graf wrote: On 09.09.2010, at 20:13, Hollis Blanchard wrote: On 09/09/2010 04:16 AM, Liu Yu-B13201 wrote: Yes, it's hard to resume TLB0. We only resume TLB1 in previous code. But TLB1 is even more smaller (13 free entries) than 440, So that it still has little possibility to get hit. thus the resumption is useless. The only reason hits are unlikely in TLB1 is because you still don't have large page support in the host. Once you have that, you can use TLB1 for large guest mappings, and it will become extremely likely that you get hits in TLB1. This is true even if the guest wants 256MB but the host supports only e.g. 16MB large pages, and must split the guest mapping into multiple large host pages. When will you have hugetlbfs for e500? That's going to make such a dramatic difference, I'm not sure it's worth investing time in optimizing the MMU code until then. I'm not sure I agree. Sure, huge pages give another big win, but the state as is should at least be fast enough for prototyping. Sure, and it sounds like you can prototype with it already. My point is that, in your 80-20 rule of optimization, the 20% is going to change radically once large page support is in place. Remember that the guest kernel is mapped with just a couple large pages. During guest Linux boot on 440, I think about half the boot time is spent TLB thrashing in the initcalls. Using TLB0 can ameliorate that for now, but why bother, since it doesn't help you towards the real solution? I'm not saying this shouldn't be committed, if that's how you interpreted my comments, but in my opinion there are more useful things to do than continuing to optimize a path that is going to disappear in the future. Once you *do* have hugetlbfs in the host, you're not going to want to use TLB0 for guest TLB1 mappings any more anyways. Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] kvm/e500v2: MMU optimization
On 09/09/2010 04:03 AM, Liu Yu-B13201 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Hollis Blanchard Sent: Thursday, September 09, 2010 12:07 AM To: Liu Yu-B13201 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; ag...@suse.de Subject: Re: [PATCH 0/2] kvm/e500v2: MMU optimization On 09/08/2010 02:40 AM, Liu Yu wrote: The patchset aims at mapping guest TLB1 to host TLB0. And it includes: [PATCH 1/2] kvm/e500v2: Remove shadow tlb [PATCH 2/2] kvm/e500v2: mapping guest TLB1 to host TLB0 The reason we need patch 1 is because patch 1 make things simple and flexible. Only applying patch 1 aslo make kvm work. I've always thought the best long-term optimization on these cores is to share in the host PID allocation (i.e. __init_new_context()). This way, the TID in guest mappings would not overlap the TID in host mappings, and guest mappings could be demand-faulted rather than swapped wholesale. To do that, you would need to track the host PID in KVM data structures, I guess in the tlbe_ref structure. Hi Hollis, Guest uses AS=1 and host uses AS=0, so even guest uses the same TID with host, they're in different space. Then why guest needs to care about host TID? You're absolutely right, but this makes a couple key assumptions: 1. The guest doesn't try to use AS=1. This is already false in Linux, because the udbg code uses an AS=1 mapping for the UART, but this can be configured out (with a small loss in functionality). In non-Linux guests the AS=0 restriction could be onerous. 2. A Book E MMU. If we participate in the host MMU context allocation, the guest - host address space code could be generalized to include e.g. an e600 guest on an e500 host, or vice versa. So you're right that optimization is not the right word; this would be more of a functionality and design improvement. (In fact, I suppose it could reduce performance on Book E, since AS=1 space actually *is* unused by the host. I think it would be worth finding out.) Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] kvm/e500v2: Remove shadow tlb
On 09/09/2010 04:16 AM, Liu Yu-B13201 wrote: Yes, it's hard to resume TLB0. We only resume TLB1 in previous code. But TLB1 is even more smaller (13 free entries) than 440, So that it still has little possibility to get hit. thus the resumption is useless. The only reason hits are unlikely in TLB1 is because you still don't have large page support in the host. Once you have that, you can use TLB1 for large guest mappings, and it will become extremely likely that you get hits in TLB1. This is true even if the guest wants 256MB but the host supports only e.g. 16MB large pages, and must split the guest mapping into multiple large host pages. When will you have hugetlbfs for e500? That's going to make such a dramatic difference, I'm not sure it's worth investing time in optimizing the MMU code until then. Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] kvm/e500v2: Remove shadow tlb
On 09/09/2010 04:26 PM, Alexander Graf wrote: On 09.09.2010, at 20:13, Hollis Blanchard wrote: On 09/09/2010 04:16 AM, Liu Yu-B13201 wrote: Yes, it's hard to resume TLB0. We only resume TLB1 in previous code. But TLB1 is even more smaller (13 free entries) than 440, So that it still has little possibility to get hit. thus the resumption is useless. The only reason hits are unlikely in TLB1 is because you still don't have large page support in the host. Once you have that, you can use TLB1 for large guest mappings, and it will become extremely likely that you get hits in TLB1. This is true even if the guest wants 256MB but the host supports only e.g. 16MB large pages, and must split the guest mapping into multiple large host pages. When will you have hugetlbfs for e500? That's going to make such a dramatic difference, I'm not sure it's worth investing time in optimizing the MMU code until then. I'm not sure I agree. Sure, huge pages give another big win, but the state as is should at least be fast enough for prototyping. Sure, and it sounds like you can prototype with it already. My point is that, in your 80-20 rule of optimization, the 20% is going to change radically once large page support is in place. Remember that the guest kernel is mapped with just a couple large pages. During guest Linux boot on 440, I think about half the boot time is spent TLB thrashing in the initcalls. Using TLB0 can ameliorate that for now, but why bother, since it doesn't help you towards the real solution? I'm not saying this shouldn't be committed, if that's how you interpreted my comments, but in my opinion there are more useful things to do than continuing to optimize a path that is going to disappear in the future. Once you *do* have hugetlbfs in the host, you're not going to want to use TLB0 for guest TLB1 mappings any more anyways. Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] kvm/e500v2: Remove shadow tlb
On 09/08/2010 02:40 AM, Liu Yu wrote: It is unnecessary to keep shadow tlb. first, shadow tlb keep fixed value in shadow, which make things unflexible. second, remove shadow tlb can save a lot memory. This patch remove shadow tlb and caculate the shadow tlb entry value before we write it to hardware. Also we use new struct tlbe_ref to trace the relation between guest tlb entry and page. Did you look at the performance impact? Back in the day, we did essentially the same thing on 440. However, rather than discard the whole TLB when context switching away from the host (to be demand-faulted when the guest is resumed), we found a noticeable performance improvement by preserving a shadow TLB across context switches. We only use it in the vcpu_put/vcpu_load path. Of course, our TLB was much smaller (64 entries), so the use model may not be the same at all (e.g. it takes longer to restore a full guest TLB working set, but maybe it's not really possible to use all 1024 TLB0 entries in one timeslice anyways). -- Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] kvm/e500v2: MMU optimization
On 09/08/2010 02:40 AM, Liu Yu wrote: The patchset aims at mapping guest TLB1 to host TLB0. And it includes: [PATCH 1/2] kvm/e500v2: Remove shadow tlb [PATCH 2/2] kvm/e500v2: mapping guest TLB1 to host TLB0 The reason we need patch 1 is because patch 1 make things simple and flexible. Only applying patch 1 aslo make kvm work. I've always thought the best long-term optimization on these cores is to share in the host PID allocation (i.e. __init_new_context()). This way, the TID in guest mappings would not overlap the TID in host mappings, and guest mappings could be demand-faulted rather than swapped wholesale. To do that, you would need to track the host PID in KVM data structures, I guess in the tlbe_ref structure. -- Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] kvm/e500v2: Remove shadow tlb
On 09/08/2010 02:40 AM, Liu Yu wrote: It is unnecessary to keep shadow tlb. first, shadow tlb keep fixed value in shadow, which make things unflexible. second, remove shadow tlb can save a lot memory. This patch remove shadow tlb and caculate the shadow tlb entry value before we write it to hardware. Also we use new struct tlbe_ref to trace the relation between guest tlb entry and page. Did you look at the performance impact? Back in the day, we did essentially the same thing on 440. However, rather than discard the whole TLB when context switching away from the host (to be demand-faulted when the guest is resumed), we found a noticeable performance improvement by preserving a shadow TLB across context switches. We only use it in the vcpu_put/vcpu_load path. Of course, our TLB was much smaller (64 entries), so the use model may not be the same at all (e.g. it takes longer to restore a full guest TLB working set, but maybe it's not really possible to use all 1024 TLB0 entries in one timeslice anyways). -- Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] kvm/e500v2: MMU optimization
On 09/08/2010 02:40 AM, Liu Yu wrote: The patchset aims at mapping guest TLB1 to host TLB0. And it includes: [PATCH 1/2] kvm/e500v2: Remove shadow tlb [PATCH 2/2] kvm/e500v2: mapping guest TLB1 to host TLB0 The reason we need patch 1 is because patch 1 make things simple and flexible. Only applying patch 1 aslo make kvm work. I've always thought the best long-term optimization on these cores is to share in the host PID allocation (i.e. __init_new_context()). This way, the TID in guest mappings would not overlap the TID in host mappings, and guest mappings could be demand-faulted rather than swapped wholesale. To do that, you would need to track the host PID in KVM data structures, I guess in the tlbe_ref structure. -- Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/4] fix PowerPC 440 Bamboo platform emulation
On 08/04/2010 05:21 PM, Hollis Blanchard wrote: These patches get the PowerPC Bamboo platform working again. I've re-written two of the patches based on feedback from qemu-devel. Hi Aurelien, any comment on these? Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] ppc4xx: load Bamboo kernel, initrd, and fdt at fixed addresses
On Thu, Aug 5, 2010 at 7:39 PM, Liu Yu-B13201 b13...@freescale.com wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Hollis Blanchard Sent: Thursday, August 05, 2010 8:22 AM To: qemu-de...@nongnu.org Cc: kvm-ppc@vger.kernel.org Subject: [PATCH 4/4] ppc4xx: load Bamboo kernel, initrd, and fdt at fixed addresses We can't use the return value of load_uimage() for the kernel because it can't account for BSS size, and the PowerPC kernel does not relocate blobs before zeroing BSS. Instead, we now load at the fixed addresses chosen by u-boot (the normal firmware for the board). What will us do if the uImage become bigger and fixed size is not enough? That was my question to Edgar, which was not answered. In u-boot, one would change some environment variables. With this code in qemu, the only recourse would be to edit ppc440_bamboo.c and rebuild. Perhaps in the future we can do something more sophisticated by specifying load addresses in a qemu device tree, but I don't understand that stuff enough to know if that is an intended use. -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 17/27] KVM: PPC: KVM PV guest stubs
On 07/29/2010 05:47 AM, Alexander Graf wrote: We will soon start and replace instructions from the text section with other, paravirtualized versions. To ease the readability of those patches I split out the generic looping and magic page mapping code out. This patch still only contains stubs. But at least it loops through the text section :). Signed-off-by: Alexander Grafag...@suse.de --- v1 - v2: - kvm guest patch framework: introduce patch_ins v2 - v3: - add self-test in guest code - remove superfluous new lines in generic guest code --- arch/powerpc/kernel/kvm.c | 95 + 1 files changed, 95 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c index a5ece71..e93366f 100644 --- a/arch/powerpc/kernel/kvm.c +++ b/arch/powerpc/kernel/kvm.c @@ -33,6 +33,62 @@ #define KVM_MAGIC_PAGE(-4096L) #define magic_var(x) KVM_MAGIC_PAGE + offsetof(struct kvm_vcpu_arch_shared, x) +#define KVM_MASK_RT0x03e0 + +static bool kvm_patching_worked = true; + +static inline void kvm_patch_ins(u32 *inst, u32 new_inst) +{ + *inst = new_inst; + flush_icache_range((ulong)inst, (ulong)inst + 4); +} + +static void kvm_map_magic_page(void *data) +{ + kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE, + KVM_MAGIC_PAGE, /* Physical Address */ + KVM_MAGIC_PAGE); /* Effective Address */ +} + +static void kvm_check_ins(u32 *inst) +{ + u32 _inst = *inst; + u32 inst_no_rt = _inst ~KVM_MASK_RT; + u32 inst_rt = _inst KVM_MASK_RT; + + switch (inst_no_rt) { + } + + switch (_inst) { + } +} + +static void kvm_use_magic_page(void) +{ + u32 *p; + u32 *start, *end; + u32 tmp; + + /* Tell the host to map the magic page to -4096 on all CPUs */ + on_each_cpu(kvm_map_magic_page, NULL, 1); + + /* Quick self-test to see if the mapping works */ + if (__get_user(tmp, (u32*)KVM_MAGIC_PAGE)) { + kvm_patching_worked = false; + return; + } + + /* Now loop through all code and find instructions */ + start = (void*)_stext; + end = (void*)_etext; + + for (p = start; p end; p++) + kvm_check_ins(p); + + printk(KERN_INFO KVM: Live patching for a fast VM %s\n, +kvm_patching_worked ? worked : failed); +} Rather than have the guest loop through every instruction in its text, why can't you use the existing cputable self-patching mechanism? The kernel already uses that in a number of places to patch itself at runtime in fast paths... see Documentation/powerpc/cpu_features.txt for some background. Since we already know (at build time) the location of code that needs patching, we don't need to scan at all. (I also shudder to think of the number of page faults this scan will incur.) Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/27] KVM: PPC: Introduce shared page
: + kvmppc_e500_tlb_uninit(vcpu_e500); uninit_vcpu: kvm_vcpu_uninit(vcpu); free_vcpu: @@ -131,6 +137,7 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu) { struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); + free_page((unsigned long)vcpu-arch.shared); kvmppc_e500_tlb_uninit(vcpu_e500); kvm_vcpu_uninit(vcpu); kmem_cache_free(kvm_vcpu_cache, vcpu_e500); Why not put all this in a common function like kvm_arch_vcpu_init()? There are layers of shared code inside arch/powerpc/kvm: e.g. powerpc.c - booke.c - 44x.c... Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 17/27] KVM: PPC: KVM PV guest stubs
On 07/29/2010 05:47 AM, Alexander Graf wrote: We will soon start and replace instructions from the text section with other, paravirtualized versions. To ease the readability of those patches I split out the generic looping and magic page mapping code out. This patch still only contains stubs. But at least it loops through the text section :). Signed-off-by: Alexander Grafag...@suse.de --- v1 - v2: - kvm guest patch framework: introduce patch_ins v2 - v3: - add self-test in guest code - remove superfluous new lines in generic guest code --- arch/powerpc/kernel/kvm.c | 95 + 1 files changed, 95 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c index a5ece71..e93366f 100644 --- a/arch/powerpc/kernel/kvm.c +++ b/arch/powerpc/kernel/kvm.c @@ -33,6 +33,62 @@ #define KVM_MAGIC_PAGE(-4096L) #define magic_var(x) KVM_MAGIC_PAGE + offsetof(struct kvm_vcpu_arch_shared, x) +#define KVM_MASK_RT0x03e0 + +static bool kvm_patching_worked = true; + +static inline void kvm_patch_ins(u32 *inst, u32 new_inst) +{ + *inst = new_inst; + flush_icache_range((ulong)inst, (ulong)inst + 4); +} + +static void kvm_map_magic_page(void *data) +{ + kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE, + KVM_MAGIC_PAGE, /* Physical Address */ + KVM_MAGIC_PAGE); /* Effective Address */ +} + +static void kvm_check_ins(u32 *inst) +{ + u32 _inst = *inst; + u32 inst_no_rt = _inst ~KVM_MASK_RT; + u32 inst_rt = _inst KVM_MASK_RT; + + switch (inst_no_rt) { + } + + switch (_inst) { + } +} + +static void kvm_use_magic_page(void) +{ + u32 *p; + u32 *start, *end; + u32 tmp; + + /* Tell the host to map the magic page to -4096 on all CPUs */ + on_each_cpu(kvm_map_magic_page, NULL, 1); + + /* Quick self-test to see if the mapping works */ + if (__get_user(tmp, (u32*)KVM_MAGIC_PAGE)) { + kvm_patching_worked = false; + return; + } + + /* Now loop through all code and find instructions */ + start = (void*)_stext; + end = (void*)_etext; + + for (p = start; p end; p++) + kvm_check_ins(p); + + printk(KERN_INFO KVM: Live patching for a fast VM %s\n, +kvm_patching_worked ? worked : failed); +} Rather than have the guest loop through every instruction in its text, why can't you use the existing cputable self-patching mechanism? The kernel already uses that in a number of places to patch itself at runtime in fast paths... see Documentation/powerpc/cpu_features.txt for some background. Since we already know (at build time) the location of code that needs patching, we don't need to scan at all. (I also shudder to think of the number of page faults this scan will incur.) Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/27] KVM: PPC: Introduce shared page
: + kvmppc_e500_tlb_uninit(vcpu_e500); uninit_vcpu: kvm_vcpu_uninit(vcpu); free_vcpu: @@ -131,6 +137,7 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu) { struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); + free_page((unsigned long)vcpu-arch.shared); kvmppc_e500_tlb_uninit(vcpu_e500); kvm_vcpu_uninit(vcpu); kmem_cache_free(kvm_vcpu_cache, vcpu_e500); Why not put all this in a common function like kvm_arch_vcpu_init()? There are layers of shared code inside arch/powerpc/kvm: e.g. powerpc.c - booke.c - 44x.c... Hollis Blanchard Mentor Graphics, Embedded Systems Division -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/4] fix PowerPC 440 Bamboo platform emulation
These patches get the PowerPC Bamboo platform working again. I've re-written two of the patches based on feedback from qemu-devel. Note that this platform still only works in conjunction with KVM, since the PowerPC 440 MMU is still not accurately emulated by TCG. -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/4] ppc4xx: correct SDRAM controller warning message condition
The message Truncating memory to %d MiB to fit SDRAM controller limits should be displayed only when a user chooses an amount of RAM which can't be represented by the PPC 4xx SDRAM controller (e.g. 129MB, which would only be valid if the controller supports a bank size of 1MB). Signed-off-by: Hollis Blanchard hol...@penguinppc.org --- hw/ppc4xx_devs.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/hw/ppc4xx_devs.c b/hw/ppc4xx_devs.c index b15db81..be130c4 100644 --- a/hw/ppc4xx_devs.c +++ b/hw/ppc4xx_devs.c @@ -684,7 +684,7 @@ ram_addr_t ppc4xx_sdram_adjust(ram_addr_t ram_size, int nr_banks, } ram_size -= size_left; -if (ram_size) +if (size_left) printf(Truncating memory to %d MiB to fit SDRAM controller limits.\n, (int)(ram_size 20)); -- 1.7.2 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/4] ppc4xx: load Bamboo kernel, initrd, and fdt at fixed addresses
We can't use the return value of load_uimage() for the kernel because it can't account for BSS size, and the PowerPC kernel does not relocate blobs before zeroing BSS. Instead, we now load at the fixed addresses chosen by u-boot (the normal firmware for the board). Signed-off-by: Hollis Blanchard hol...@penguinppc.org --- hw/ppc440_bamboo.c | 39 ++- 1 files changed, 18 insertions(+), 21 deletions(-) This fixes a critical bug in PowerPC 440 Bamboo board emulation. diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c index d471d5d..34ddf45 100644 --- a/hw/ppc440_bamboo.c +++ b/hw/ppc440_bamboo.c @@ -27,6 +27,11 @@ #define BINARY_DEVICE_TREE_FILE bamboo.dtb +/* from u-boot */ +#define KERNEL_ADDR 0x100 +#define FDT_ADDR 0x180 +#define RAMDISK_ADDR 0x190 + static int bamboo_load_device_tree(target_phys_addr_t addr, uint32_t ramsize, target_phys_addr_t initrd_base, @@ -98,10 +103,8 @@ static void bamboo_init(ram_addr_t ram_size, uint64_t elf_lowaddr; target_phys_addr_t entry = 0; target_phys_addr_t loadaddr = 0; -target_long kernel_size = 0; -target_ulong initrd_base = 0; target_long initrd_size = 0; -target_ulong dt_base = 0; +int success; int i; /* Setup CPU. */ @@ -118,15 +121,15 @@ static void bamboo_init(ram_addr_t ram_size, /* Load kernel. */ if (kernel_filename) { -kernel_size = load_uimage(kernel_filename, entry, loadaddr, NULL); -if (kernel_size 0) { -kernel_size = load_elf(kernel_filename, NULL, NULL, elf_entry, - elf_lowaddr, NULL, 1, ELF_MACHINE, 0); +success = load_uimage(kernel_filename, entry, loadaddr, NULL); +if (success 0) { +success = load_elf(kernel_filename, NULL, NULL, elf_entry, + elf_lowaddr, NULL, 1, ELF_MACHINE, 0); entry = elf_entry; loadaddr = elf_lowaddr; } /* XXX try again as binary */ -if (kernel_size 0) { +if (success 0) { fprintf(stderr, qemu: could not load kernel '%s'\n, kernel_filename); exit(1); @@ -135,26 +138,20 @@ static void bamboo_init(ram_addr_t ram_size, /* Load initrd. */ if (initrd_filename) { -initrd_base = kernel_size + loadaddr; -initrd_size = load_image_targphys(initrd_filename, initrd_base, - ram_size - initrd_base); +initrd_size = load_image_targphys(initrd_filename, RAMDISK_ADDR, + ram_size - RAMDISK_ADDR); if (initrd_size 0) { -fprintf(stderr, qemu: could not load initial ram disk '%s'\n, -initrd_filename); +fprintf(stderr, qemu: could not load ram disk '%s' at %x\n, +initrd_filename, RAMDISK_ADDR); exit(1); } } /* If we're loading a kernel directly, we must load the device tree too. */ if (kernel_filename) { -if (initrd_base) -dt_base = initrd_base + initrd_size; -else -dt_base = kernel_size + loadaddr; - -if (bamboo_load_device_tree(dt_base, ram_size, -initrd_base, initrd_size, kernel_cmdline) 0) { +if (bamboo_load_device_tree(FDT_ADDR, ram_size, RAMDISK_ADDR, +initrd_size, kernel_cmdline) 0) { fprintf(stderr, couldn't load device tree\n); exit(1); } @@ -163,7 +160,7 @@ static void bamboo_init(ram_addr_t ram_size, /* Set initial guest state. */ env-gpr[1] = (1620) - 8; -env-gpr[3] = dt_base; +env-gpr[3] = FDT_ADDR; env-nip = entry; /* XXX we currently depend on KVM to create some initial TLB entries. */ } -- 1.7.2 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/4] Fix make install with a cross toolchain
We must be able to use a non-native strip executable, but not all versions of 'install' support the --strip-program option (e.g. OpenBSD). Accordingly, we can't use 'install -s', and we must run strip separately. Signed-off-by: Hollis Blanchard hol...@penguinppc.org Cc: blauwir...@gmail.com --- Makefile.target |5 - configure |4 +++- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/Makefile.target b/Makefile.target index 8a9c427..00bf6f9 100644 --- a/Makefile.target +++ b/Makefile.target @@ -326,7 +326,10 @@ clean: install: all ifneq ($(PROGS),) - $(INSTALL) -m 755 $(STRIP_OPT) $(PROGS) $(DESTDIR)$(bindir) + $(INSTALL) -m 755 $(PROGS) $(DESTDIR)$(bindir) +ifneq ($(STRIP),) + $(STRIP) $(patsubst %,$(DESTDIR)$(bindir)/%,$(PROGS)) +endif endif # Include automatically generated dependency files diff --git a/configure b/configure index a20371c..146dac0 100755 --- a/configure +++ b/configure @@ -80,6 +80,7 @@ make=make install=install objcopy=objcopy ld=ld +strip=strip helper_cflags= libs_softmmu= libs_tools= @@ -125,6 +126,7 @@ cc=${cross_prefix}${cc} ar=${cross_prefix}${ar} objcopy=${cross_prefix}${objcopy} ld=${cross_prefix}${ld} +strip=${cross_prefix}${strip} # default flags for all hosts QEMU_CFLAGS=-fno-strict-aliasing $QEMU_CFLAGS @@ -2227,7 +2229,7 @@ if test $debug = yes ; then echo CONFIG_DEBUG_EXEC=y $config_host_mak fi if test $strip_opt = yes ; then - echo STRIP_OPT=-s $config_host_mak + echo STRIP=${strip} $config_host_mak fi if test $bigendian = yes ; then echo HOST_WORDS_BIGENDIAN=y $config_host_mak -- 1.7.2 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/4] ppc4xx: don't unregister RAM at reset
The PowerPC 4xx SDRAM controller emulation unregisters RAM in its reset callback. However, qemu_system_reset() is now called at initialization time, so all RAM is unregistered before starting the guest (!). Signed-off-by: Hollis Blanchard hol...@penguinppc.org --- hw/ppc4xx_devs.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) This fixes a critical bug in PowerPC 440 Bamboo board emulation. diff --git a/hw/ppc4xx_devs.c b/hw/ppc4xx_devs.c index be130c4..7f698b8 100644 --- a/hw/ppc4xx_devs.c +++ b/hw/ppc4xx_devs.c @@ -619,7 +619,6 @@ static void sdram_reset (void *opaque) /* We pre-initialize RAM banks */ sdram-status = 0x; sdram-cfg = 0x0080; -sdram_unmap_bcr(sdram); } void ppc4xx_sdram_init (CPUState *env, qemu_irq irq, int nbanks, -- 1.7.2 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH] loader: pad kernel size when loaded from a uImage
On Sun, Aug 1, 2010 at 5:36 AM, Edgar E. Iglesias edgar.igles...@gmail.com wrote: On Sat, Jul 31, 2010 at 12:56:42AM +0200, Edgar E. Iglesias wrote: On Thu, Jul 29, 2010 at 06:48:24PM -0700, Hollis Blanchard wrote: The kernel's BSS size is lost by mkimage, which only considers file size. As a result, loading other blobs (e.g. device tree, initrd) immediately after the kernel location can result in them being zeroed by the kernel's BSS initialization code. Signed-off-by: Hollis Blanchard hol...@penguinppc.org --- hw/loader.c | 7 +++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/hw/loader.c b/hw/loader.c index 79a6f95..35bc25a 100644 --- a/hw/loader.c +++ b/hw/loader.c @@ -507,6 +507,13 @@ int load_uimage(const char *filename, target_phys_addr_t *ep, ret = hdr-ih_size; + /* The kernel's BSS size is lost by mkimage, which only considers file + * size. We don't know how big it is, but we do know we can't place + * anything immediately after the kernel. The padding seems like it should + * be proportional to overall file size, but we also make sure it's at + * least 4-byte aligned. */ + ret += (hdr-ih_size / 16) ~0x3; Maybe it's only me, but it feels a bit akward to push down this kind of knowledge down the abstraction layers. Does it work for you to have your caller of load_uimage apply whatever resizing magic needed for your kernel and arch? Ayway, IMO the conventions of where to pass blobs from the bootloader to the loaded image are an agreement between the bootloader and the loaded code. The formats or mechanisms to load the image should need to be involved that much. For example in this particular case, other archs (e.g, MicroBlaze) might not need any magic. The MicroBlaze linux kernel moves cmdline and device tree blobs into safe areas prior to .bss initialization. Are you claiming that's the common case? FWIW, PowerPC and ARM don't seem to. I wouldn't expect such logic except in reaction to a specific bug (i.e. a qemu/firmware loader bug). The load_uimage() interface claims to report the size of the kernel it loaded. If you argue that it shouldn't try to do that (and indeed you could argue it's not *possible* to do that accurately), that logic should be completely removed. The current behavior is worse than not knowing at all: callers *think* they know, but it's guaranteed to be wrong. Of course, if you do want to remove the size, then callers are left with even less information than they had before. In that case, you tell me: where should I hardcode initrd loading? Anyways, you don't even use load_uimage() in microblaze, and if you did, you wouldn't be obligated to use the size return value, so fixing this issue for everybody else doesn't limit you at all. -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH] loader: pad kernel size when loaded from a uImage
On Mon, Aug 2, 2010 at 11:57 AM, Edgar E. Iglesias edgar.igles...@gmail.com wrote: On Mon, Aug 02, 2010 at 10:59:11AM -0700, Hollis Blanchard wrote: On Sun, Aug 1, 2010 at 5:36 AM, Edgar E. Iglesias edgar.igles...@gmail.com wrote: On Sat, Jul 31, 2010 at 12:56:42AM +0200, Edgar E. Iglesias wrote: On Thu, Jul 29, 2010 at 06:48:24PM -0700, Hollis Blanchard wrote: The kernel's BSS size is lost by mkimage, which only considers file size. As a result, loading other blobs (e.g. device tree, initrd) immediately after the kernel location can result in them being zeroed by the kernel's BSS initialization code. Signed-off-by: Hollis Blanchard hol...@penguinppc.org --- hw/loader.c | 7 +++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/hw/loader.c b/hw/loader.c index 79a6f95..35bc25a 100644 --- a/hw/loader.c +++ b/hw/loader.c @@ -507,6 +507,13 @@ int load_uimage(const char *filename, target_phys_addr_t *ep, ret = hdr-ih_size; + /* The kernel's BSS size is lost by mkimage, which only considers file + * size. We don't know how big it is, but we do know we can't place + * anything immediately after the kernel. The padding seems like it should + * be proportional to overall file size, but we also make sure it's at + * least 4-byte aligned. */ + ret += (hdr-ih_size / 16) ~0x3; Maybe it's only me, but it feels a bit akward to push down this kind of knowledge down the abstraction layers. Does it work for you to have your caller of load_uimage apply whatever resizing magic needed for your kernel and arch? Ayway, IMO the conventions of where to pass blobs from the bootloader to the loaded image are an agreement between the bootloader and the loaded code. The formats or mechanisms to load the image should need to be involved that much. For example in this particular case, other archs (e.g, MicroBlaze) might not need any magic. The MicroBlaze linux kernel moves cmdline and device tree blobs into safe areas prior to .bss initialization. Are you claiming that's the common case? FWIW, PowerPC and ARM don't seem to. I wouldn't expect such logic except in reaction to a specific bug (i.e. a qemu/firmware loader bug). I'm not trying to claim it's the common case, but it exists. It exists and will remain unaffected by this patch, while the common case will be improved. The load_uimage() interface claims to report the size of the kernel it loaded. If you argue that it shouldn't try to do that (and indeed you The way I understand it, it reports the size of what got loaded. The difference between what got loaded and the size of the loaded file in memory is a subtlety that is not at all clear from the code, and that is precisely why I propose centralizing the logic to handle it. It would be very difficult for load_uimage to figure out what memory areas are beeing touched prior to .bss init (or the point where the passed blob is used). could argue it's not *possible* to do that accurately), that logic Right, its very hard for it to guess what memory areas are safe. should be completely removed. The current behavior is worse than not knowing at all: callers *think* they know, but it's guaranteed to be wrong. Of course, if you do want to remove the size, then callers are left with even less information than they had before. In that case, you I think returning the size of the loaded image has a value, for example for archs that move away the blobs before touching any memory outside their image. Bootloaders for those archs can put some blobs right after the loaded image. You mean the one architecture, which by the way doesn't even use this API? That doesn't seem like a strong argument to me. Anyways, it's just as much work to relocate an initrd from a padded address as it is from a closer address, so there is no downside. The fact remains that the current API is broken by design, or to be charitable violates the principle of least surprise. With the exception of microblaze, everybody who calls load_uimage() must understand the nuances of the return value and adjust it (or ignore it) accordingly. Why wouldn't we consolidate that logic? tell me: where should I hardcode initrd loading? Not sure, but I'd guess somewhere close to where you are calling load_uimage from (it wasn't clear to me where that was). Sorry, let me rephrase. At what address should I hardcode my initrd? What about my device tree? As a followup, why not lower, or higher? Also, how can I know in the code if I chose wrong, what will the user-visible failure be, and how difficult will that be to debug? In summary, this patch protects users and developers. If I move it to be PowerPC-specific, it will protect only PowerPC users and developers, which is clearly a much smaller number. Debating whether theoretically *all* users and developers would benefit from
Re: [PATCH] PPC4xx: don't unregister RAM at reset
On Mon, Aug 2, 2010 at 1:41 AM, Alexander Graf ag...@suse.de wrote: On 30.07.2010, at 03:48, Hollis Blanchard wrote: The PowerPC 4xx SDRAM controller emulation unregisters RAM in its reset callback. However, qemu_system_reset() is now called at initialization time, so RAM is unregistered before starting the guest. So the registration should be moved to reset now, no? How is the reset different from boot? How did a reset work before? As far as I can tell, no other platform unregisters and re-registers memory at reset, so that is a difference between reset and boot. Maybe I don't understand your other question. Before qemu_system_reset() was called at initialization time, memory was not unregistered, and therefore the platform had memory and could boot. -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH] loader: pad kernel size when loaded from a uImage
On Mon, Aug 2, 2010 at 12:56 PM, Edgar E. Iglesias edgar.igles...@gmail.com wrote: On Mon, Aug 02, 2010 at 12:33:54PM -0700, Hollis Blanchard wrote: You mean the one architecture, which by the way doesn't even use this API? That doesn't seem like a strong argument to me. Anyways, it's Are we looking at the same code? I don't know. Grep for load_uimage in qemu. 4 archs use it, PPC, ARM, m68k and MB. I see the following: 1 75 hw/an5206.c an5206_init kernel_size = load_uimage(kernel_filename, entry, NULL, NULL); 2233 hw/arm_boot.c arm_load_kernel kernel_size = load_uimage(info-kernel_filename, entry, NULL, 3 50 hw/dummy_m68k.c dummy_m68k_init kernel_size = load_uimage(kernel_filename, entry, NULL, NULL); 4 14 hw/loader.h uint64_t int load_uimage(const char *filename, target_phys_addr_t *ep, 5277 hw/mcf5208.c mcf5208evb_init kernel_size = load_uimage(kernel_filename, entry, NULL, NULL); 6121 hw/ppc440_bamboo.c bamboo_init kernel_size = load_uimage(kernel...ename, entry, loadaddr, NULL); 7235 hw/ppce500_mpc8544ds.c mpc8544ds_init kernel_size = load_uimage(kernel...ename, entry, loadaddr, NULL); That makes 2x ColdFire, ARM, M68K, 2x PowerPC. hw/petalogix_s3adsp1800_mmu.c is the only MicroBlaze I can see, and it only loads ELF and binary kernels, not uImages. Of those archs, only 2 actually use the return value of load_uimage to decide where to place blobs. PPC and MB. MB doesn't want any magic applied to the return value. That leaves us with _ONE_ single arch that needs magic that IMO is broken. You are trying to guess the size of the loaded image's .bss area by adding a 16th of the uimage size? Accounting for BSS hardly counts as magic, I think. :) just as much work to relocate an initrd from a padded address as it is from a closer address, so there is no downside. The fact remains that the current API is broken by design, or to be charitable violates the principle of least surprise. With the exception of microblaze, everybody who calls load_uimage() must understand the nuances of the return value and adjust it (or ignore it) accordingly. Why wouldn't we consolidate that logic? I don't see how guessing that the .bss area is a 16th of the loaded image makes this call any less confusing. I agree it's arbitrary, but it's only arbitrary in one place. It's also well-commented (IMHO), which is more than can be said for the current code. tell me: where should I hardcode initrd loading? Not sure, but I'd guess somewhere close to where you are calling load_uimage from (it wasn't clear to me where that was). Sorry, let me rephrase. At what address should I hardcode my initrd? What about my device tree? As a followup, why not lower, or higher? You should be putting them at the same addresses as uboot puts them. Fine. It's arbitrary in u-boot too, but at least it will be consistent. -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] KVM: PPC: initialize IVORs in addition to IVPR
Developers can now tell at a glace the exact type of the premature interrupt, instead of just knowing that there was some interrupt. Signed-off-by: Hollis Blanchard hol...@penguinppc.org diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -466,15 +466,19 @@ int kvmppc_handle_exit(struct kvm_run *r /* Initial guest state: 16MB mapping 0 - 0, PC = 0, MSR = 0, R1 = 16MB */ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu) { + int i; + vcpu-arch.pc = 0; vcpu-arch.msr = 0; kvmppc_set_gpr(vcpu, 1, (1620) - 8); /* -8 for the callee-save LR slot */ vcpu-arch.shadow_pid = 1; - /* Eye-catching number so we know if the guest takes an interrupt -* before it's programmed its own IVPR. */ + /* Eye-catching numbers so we know if the guest takes an interrupt +* before it's programmed its own IVPR/IVORs. */ vcpu-arch.ivpr = 0x; + for (i = 0; i BOOKE_IRQPRIO_MAX; i++) + vcpu-arch.ivor[i] = 0x7700 | i * 4; kvmppc_init_timing_stats(vcpu); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] KVM: PPC: fix compilation of dump tlbs debug function
Signed-off-by: Hollis Blanchard hol...@penguinppc.org diff --git a/arch/powerpc/kvm/44x_tlb.c b/arch/powerpc/kvm/44x_tlb.c --- a/arch/powerpc/kvm/44x_tlb.c +++ b/arch/powerpc/kvm/44x_tlb.c @@ -47,6 +47,7 @@ #ifdef DEBUG void kvmppc_dump_tlbs(struct kvm_vcpu *vcpu) { + struct kvmppc_vcpu_44x *vcpu_44x = to_44x(vcpu); struct kvmppc_44x_tlbe *tlbe; int i; -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] PPC4xx: don't unregister RAM at reset
The PowerPC 4xx SDRAM controller emulation unregisters RAM in its reset callback. However, qemu_system_reset() is now called at initialization time, so RAM is unregistered before starting the guest. Signed-off-by: Hollis Blanchard hol...@penguinppc.org --- hw/ppc4xx_devs.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/hw/ppc4xx_devs.c b/hw/ppc4xx_devs.c index be130c4..7f698b8 100644 --- a/hw/ppc4xx_devs.c +++ b/hw/ppc4xx_devs.c @@ -619,7 +619,6 @@ static void sdram_reset (void *opaque) /* We pre-initialize RAM banks */ sdram-status = 0x; sdram-cfg = 0x0080; -sdram_unmap_bcr(sdram); } void ppc4xx_sdram_init (CPUState *env, qemu_irq irq, int nbanks, -- 1.7.1.1 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] loader: pad kernel size when loaded from a uImage
The kernel's BSS size is lost by mkimage, which only considers file size. As a result, loading other blobs (e.g. device tree, initrd) immediately after the kernel location can result in them being zeroed by the kernel's BSS initialization code. Signed-off-by: Hollis Blanchard hol...@penguinppc.org --- hw/loader.c |7 +++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/hw/loader.c b/hw/loader.c index 79a6f95..35bc25a 100644 --- a/hw/loader.c +++ b/hw/loader.c @@ -507,6 +507,13 @@ int load_uimage(const char *filename, target_phys_addr_t *ep, ret = hdr-ih_size; + /* The kernel's BSS size is lost by mkimage, which only considers file +* size. We don't know how big it is, but we do know we can't place +* anything immediately after the kernel. The padding seems like it should +* be proportional to overall file size, but we also make sure it's at +* least 4-byte aligned. */ + ret += (hdr-ih_size / 16) ~0x3; + out: if (data) qemu_free(data); -- 1.7.1.1 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface
[Resending...] Please reconcile this with http://www.linux-kvm.org/page/PowerPC_Hypercall_ABI, which has been discussed in the (admittedly closed) Power.org embedded hypervisor working group. Bear in mind that other hypervisors are already implementing the documented ABI, so if you have concerns, you should probably raise them with that audience... -Hollis On Thu, Jul 1, 2010 at 3:43 AM, Alexander Graf ag...@suse.de wrote: We just introduced a new PV interface that screams for documentation. So here it is - a shiny new and awesome text file describing the internal works of the PPC KVM paravirtual interface. Signed-off-by: Alexander Graf ag...@suse.de --- v1 - v2: - clarify guest implementation - clarify that privileged instructions still work - explain safe MSR bits - Fix dsisr patch description - change hypervisor calls to use new register values --- Documentation/kvm/ppc-pv.txt | 185 ++ 1 files changed, 185 insertions(+), 0 deletions(-) create mode 100644 Documentation/kvm/ppc-pv.txt diff --git a/Documentation/kvm/ppc-pv.txt b/Documentation/kvm/ppc-pv.txt new file mode 100644 index 000..82de6c6 --- /dev/null +++ b/Documentation/kvm/ppc-pv.txt @@ -0,0 +1,185 @@ +The PPC KVM paravirtual interface += + +The basic execution principle by which KVM on PowerPC works is to run all kernel +space code in PR=1 which is user space. This way we trap all privileged +instructions and can emulate them accordingly. + +Unfortunately that is also the downfall. There are quite some privileged +instructions that needlessly return us to the hypervisor even though they +could be handled differently. + +This is what the PPC PV interface helps with. It takes privileged instructions +and transforms them into unprivileged ones with some help from the hypervisor. +This cuts down virtualization costs by about 50% on some of my benchmarks. + +The code for that interface can be found in arch/powerpc/kernel/kvm* + +Querying for existence +== + +To find out if we're running on KVM or not, we overlay the PVR register. Usually +the PVR register contains an id that identifies your CPU type. If, however, you +pass KVM_PVR_PARA in the register that you want the PVR result in, the register +still contains KVM_PVR_PARA after the mfpvr call. + + LOAD_REG_IMM(r5, KVM_PVR_PARA) + mfpvr r5 + [r5 still contains KVM_PVR_PARA] + +Once determined to run under a PV capable KVM, you can now use hypercalls as +described below. + +PPC hypercalls +== + +The only viable ways to reliably get from guest context to host context are: + + 1) Call an invalid instruction + 2) Call the sc instruction with a parameter to sc + 3) Call the sc instruction with parameters in GPRs + +Method 1 is always a bad idea. Invalid instructions can be replaced later on +by valid instructions, rendering the interface broken. + +Method 2 also has downfalls. If the parameter to sc is != 0 the spec is +rather unclear if the sc is targeted directly for the hypervisor or the +supervisor. It would also require that we read the syscall issuing instruction +every time a syscall is issued, slowing down guest syscalls. + +Method 3 is what KVM uses. We pass magic constants (KVM_SC_MAGIC_R0 and +KVM_SC_MAGIC_R3) in r0 and r3 respectively. If a syscall instruction with these +magic values arrives from the guest's kernel mode, we take the syscall as a +hypercall. + +The parameters are as follows: + + r0 KVM_SC_MAGIC_R0 + r3 KVM_SC_MAGIC_R3 Return code + r4 Hypercall number + r5 First parameter + r6 Second parameter + r7 Third parameter + r8 Fourth parameter + +Hypercall definitions are shared in generic code, so the same hypercall numbers +apply for x86 and powerpc alike. + +The magic page +== + +To enable communication between the hypervisor and guest there is a new shared +page that contains parts of supervisor visible register state. The guest can +map this shared page using the KVM hypercall KVM_HC_PPC_MAP_MAGIC_PAGE. + +With this hypercall issued the guest always gets the magic page mapped at the +desired location in effective and physical address space. For now, we always +map the page to -4096. This way we can access it using absolute load and store +functions. The following instruction reads the first field of the magic page: + + ld rX, -4096(0) + +The interface is designed to be extensible should there be need later to add +additional registers to the magic page. If you add fields to the magic page, +also define a new hypercall feature to indicate that the host can give you more +registers. Only if the host supports the additional
Re: Keep index within boundaries in kvmppc_44x_emul_tlbwe()
On Sun, May 9, 2010 at 8:26 AM, Roel Kluin roel.kl...@gmail.com wrote: An index of KVM44x_GUEST_TLB_SIZE is already one too large. Signed-off-by: Roel Kluin roel.kl...@gmail.com Acked-by: Hollis Blanchard hol...@penguinppc.org Thanks Roel. -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: KVM on 440GP
On Fri, Jan 22, 2010 at 12:04 PM, Corey Minyard miny...@acm.org wrote: Here's how far I can get now: r...@xilinx-ml507:~# ./qemu-system-ppcemb --enable-kvm -nographic -m 64 -M bambo o -kernel uImage.bamboo -L . -append -serial tcp::,server QEMU waiting for connection on: tcp:0.0.0.0:,server Truncating memory to 64 MiB to fit SDRAM controller limits. QEMU 0.12.50 monitor - type 'help' for more information (qemu) info cpus * CPU #0: nip=0x (qemu) info registers NIP LR CTR XER MSR HID0 0300 HF idx 0 TB 0bb4 DECR GPR00 00f8 GPR04 GPR08 GPR12 GPR16 GPR20 GPR24 GPR28 CR [ - - - - - - - - ] RES FPR00 FPR04 FPR08 FPR12 FPR16 FPR20 FPR24 FPR28 FPSCR SRR0 SRR1 SDR1 101d23e0 (qemu) x/10x 0 : 0x 0x 0x 0x 0010: 0x 0x 0x 0x 0020: 0x 0x (qemu) So the ROM doesn't seem to be set up properly, though bamboo.dtb is in the current directory. I'm wondering if it is something bad about the memory setup. I was just given an Ebony (440GP) board, and I can verify your results. Did you manage to make any more progress? Unfortunately I will be traveling for a couple weeks... Also, I don't suppose you have access to any other 440 variants? I believe the 440GP is the only one to use the 440x4 core, while the others use 440x5. -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] kvmppc/booke: Set ESR and DEAR when inject interrupt to guest
On Tue, Feb 2, 2010 at 3:44 AM, Liu Yu yu@freescale.com wrote: Old method prematurely sets ESR and DEAR. Move this part after we decide to inject interrupt, which is more like hardware behave. Signed-off-by: Liu Yu yu@freescale.com Acked-by: Hollis Blanchard hol...@penguinppc.org -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] kvmppc/booke: Set ESR and DEAR when inject interrupt to guest
On Tue, Feb 2, 2010 at 3:44 AM, Liu Yu yu@freescale.com wrote: Old method prematurely sets ESR and DEAR. Move this part after we decide to inject interrupt, which is more like hardware behave. Signed-off-by: Liu Yu yu@freescale.com Acked-by: Hollis Blanchard hol...@penguinppc.org -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvmppc/booke: Set ESR and DEAR when inject interrupt to guest
On Mon, Jan 25, 2010 at 3:32 AM, Liu Yu-B13201 b13...@freescale.com wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Friday, January 22, 2010 7:33 PM To: Liu Yu-B13201 Cc: hol...@penguinppc.org; kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH] kvmppc/booke: Set ESR and DEAR when inject interrupt to guest On 22.01.2010, at 12:27, Liu Yu-B13201 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Friday, January 22, 2010 7:13 PM To: Liu Yu-B13201 Cc: hol...@penguinppc.org; kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH] kvmppc/booke: Set ESR and DEAR when inject interrupt to guest On 22.01.2010, at 11:54, Liu Yu wrote: Old method prematurely sets ESR and DEAR. Move this part after we decide to inject interrupt, and make it more like hardware behave. Signed-off-by: Liu Yu yu@freescale.com --- arch/powerpc/kvm/booke.c | 24 ++-- arch/powerpc/kvm/emulate.c | 2 -- 2 files changed, 14 insertions(+), 12 deletions(-) @@ -286,15 +295,12 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, break; case BOOKE_INTERRUPT_DATA_STORAGE: - vcpu-arch.dear = vcpu-arch.fault_dear; - vcpu-arch.esr = vcpu-arch.fault_esr; kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_DATA_STORAGE); kvmppc_booke_queue_data_storage(vcpu, vcpu-arch.fault_esr, vcpu-arch.fault_dear); kvmppc_account_exit(vcpu, DSI_EXITS); r = RESUME_GUEST; break; case BOOKE_INTERRUPT_INST_STORAGE: - vcpu-arch.esr = vcpu-arch.fault_esr; kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_INST_STORAGE); kvmppc_booke_queue_inst_storage(vcpu, vcpu-arch.fault_esr); Not sure if this is redundant, as we already have fault_esr. Or should we ignore what hareware is and create a new esr to guest? On Book3S I take the SRR1 we get from the host as inspiration of what to pass to the guest as SRR1. I think we should definitely be able to inject a fault that we didn't get in that exact form from the exit path. I'm also not sure if something could clobber fault_esr if another interrupt takes precedence. Say a #MC. No as far as I know. And if yes, the clobber could as well happen before we copy it. Hollis, what do you think we should do here? I'm torn, and in some ways it's not that important right now. However, I think it makes sense to add something like vcpu-queued_esr as a separate field from vcpu-fault_esr. The use case I'm thinking of is a debugger wanting to invoke guest kernel to provide a translation for an address not currently present in the TLB (i.e. not currently visible to the debugger). -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: KVM on 440GP
On Fri, Jan 22, 2010 at 7:27 AM, Corey Minyard miny...@acm.org wrote: Corey Minyard wrote: I'm playing around with KVM on an ebony board (440GP), just trying to get it to work, really. I followed the instructions at http://www.linux-kvm.org/page/PowerPC and I used the 2.6.33 branch of the kvm kernel repository. When I try to run kvm, qemu appears to abort and actually logs me off. Doing a little debugging, I found that qemu_memalign() is calling abort because posix_memalign() is failing. I haven't done any more debugging than that. Well, I discovered that the default memory is 128M, and that's too much memory for a VM running on a machine with 128M. I fixed that problem, and now it's doing something, though no console so not sure what. I guess my questions below and the patch still apply. -corey Since I already had to fix a kernel issue to get it the kernel code to initialize since the platform was reported as ppc440gp, not ppc440, I'm wondering how hard it's going to be to get this working. Does anyone have this working at all? Should I back up to a previous version? Any help would be appreciated. Thanks, -corey Here's the change I made to get kvm in the kernel to initialize: Index: kvm/arch/powerpc/kvm/44x.c === --- kvm.orig/arch/powerpc/kvm/44x.c +++ kvm/arch/powerpc/kvm/44x.c @@ -42,7 +42,7 @@ int kvmppc_core_check_processor_compat(v { int r; - if (strcmp(cur_cpu_spec-platform, ppc440) == 0) + if (strncmp(cur_cpu_spec-platform, ppc440, 6) == 0) r = 0; else r = -ENOTSUPP; Thanks! The patch looks good to me. It's unfortunate that 440GP is reported is ppc440gp, while every other 440 variant is reported is ppc440, but that's just how it goes I guess. It shouldn't be too difficult to get things working, since the cores are more or less the same. There has been a little accidental build breakage introduced in the 440 code recently (work to support the Book S KVM port), but it's all been simple stuff. As for console, you probably want to use qemu's -nographic or at least -serial stdio options. -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: KVM on 440GP
On Fri, Jan 22, 2010 at 9:27 AM, Alexander Graf ag...@suse.de wrote: On 22.01.2010, at 18:23, Corey Minyard wrote: Hollis Blanchard wrote: As for console, you probably want to use qemu's -nographic or at least -serial stdio options. Thanks for the info. However, -serial stdio doesn't seem to work. I get: r...@ebony:~# ./qemu-system-ppcemb --enable-kvm -nographic -m 128 -M bamboo -kernel uImage.bamboo -L . -append -m 64 -serial stdio chardev: opening backend stdio failed qemu: could not open serial device 'stdio': Success Haven't seen that one. :( Does the same thing happen if you remove --enable-kvm? If so, it sounds like an issue for qemu-de...@nongnu.org. (No code will actually run like that, since qemu is missing 440 MMU emulation, but it's an easy way to identify the culprit.) BookE KVM uses virtio console, no? I think that was explained on the wiki too. Sure doesn't. Book E SoCs typically contain NS16550-compatible UARTs, so qemu's normal serial emulation works just fine. -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] KVM: PPC: Emulate trap SRR1 flags properly
On Thu, Jan 7, 2010 at 5:58 PM, Alexander Graf ag...@suse.de wrote: Book3S needs some flags in SRR1 to get to know details about an interrupt. One such example is the trap instruction. It tells the guest kernel that a program interrupt is due to a trap using a bit in SRR1. This patch implements above behavior, making WARN_ON behave like WARN_ON. ... for Book S. It already works properly for Book E, thankyouverymuch. ;) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 338baf9..e283e44 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -82,8 +82,9 @@ static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu, set_bit(priority, vcpu-arch.pending_exceptions); } -void kvmppc_core_queue_program(struct kvm_vcpu *vcpu) +void kvmppc_core_queue_program(struct kvm_vcpu *vcpu, ulong flags) { + /* BookE does flags in ESR, so ignore those we get here */ kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_PROGRAM); } Actually, I think Book E prematurely sets ESR, since it's done before the program interrupt is actually delivered. Architecturally, I'm not sure if it's a problem, but philosophically I've always wanted it to work the way you've just implemented for Book S. Anyways, since we can't test changes at the moment (Yu, can you?), I'd settle for a comment to the effect that Book E code *should* do this, but doesn't (rather than the comment above that says it's ok). -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] KVM: PPC: Emulate trap SRR1 flags properly
On Fri, Jan 8, 2010 at 11:32 AM, Alexander Graf ag...@suse.de wrote: On 08.01.2010, at 20:29, Hollis Blanchard wrote: On Thu, Jan 7, 2010 at 5:58 PM, Alexander Graf ag...@suse.de wrote: Book3S needs some flags in SRR1 to get to know details about an interrupt. One such example is the trap instruction. It tells the guest kernel that a program interrupt is due to a trap using a bit in SRR1. This patch implements above behavior, making WARN_ON behave like WARN_ON. ... for Book S. It already works properly for Book E, thankyouverymuch. ;) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 338baf9..e283e44 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -82,8 +82,9 @@ static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu, set_bit(priority, vcpu-arch.pending_exceptions); } -void kvmppc_core_queue_program(struct kvm_vcpu *vcpu) +void kvmppc_core_queue_program(struct kvm_vcpu *vcpu, ulong flags) { + /* BookE does flags in ESR, so ignore those we get here */ kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_PROGRAM); } Actually, I think Book E prematurely sets ESR, since it's done before the program interrupt is actually delivered. Architecturally, I'm not sure if it's a problem, but philosophically I've always wanted it to work the way you've just implemented for Book S. Anyways, since we can't test changes at the moment (Yu, can you?), I'd settle for a comment to the effect that Book E code *should* do this, but doesn't (rather than the comment above that says it's ok). Hm, can't you just write up a patch that removes the comment I put in, does the ESR setting in the function and do an #ifdef in emulate.c? That way we can incrementally improve things. This series is really about Book3S. Improving BookE should go in a different patch. Yes, but I'd rather minimize the behavioral changes as long as we can't test it. -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] KVM: PPC: Emulate trap SRR1 flags properly
On Fri, Jan 8, 2010 at 11:32 AM, Alexander Graf ag...@suse.de wrote: On 08.01.2010, at 20:29, Hollis Blanchard wrote: On Thu, Jan 7, 2010 at 5:58 PM, Alexander Graf ag...@suse.de wrote: Book3S needs some flags in SRR1 to get to know details about an interrupt. One such example is the trap instruction. It tells the guest kernel that a program interrupt is due to a trap using a bit in SRR1. This patch implements above behavior, making WARN_ON behave like WARN_ON. ... for Book S. It already works properly for Book E, thankyouverymuch. ;) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 338baf9..e283e44 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -82,8 +82,9 @@ static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu, set_bit(priority, vcpu-arch.pending_exceptions); } -void kvmppc_core_queue_program(struct kvm_vcpu *vcpu) +void kvmppc_core_queue_program(struct kvm_vcpu *vcpu, ulong flags) { + /* BookE does flags in ESR, so ignore those we get here */ kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_PROGRAM); } Actually, I think Book E prematurely sets ESR, since it's done before the program interrupt is actually delivered. Architecturally, I'm not sure if it's a problem, but philosophically I've always wanted it to work the way you've just implemented for Book S. Anyways, since we can't test changes at the moment (Yu, can you?), I'd settle for a comment to the effect that Book E code *should* do this, but doesn't (rather than the comment above that says it's ok). Hm, can't you just write up a patch that removes the comment I put in, does the ESR setting in the function and do an #ifdef in emulate.c? That way we can incrementally improve things. This series is really about Book3S. Improving BookE should go in a different patch. Yes, but I'd rather minimize the behavioral changes as long as we can't test it. -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] Improve DEC handling
On Mon, Dec 21, 2009 at 6:22 AM, Alexander Graf ag...@suse.de wrote: We treated the DEC interrupt like an edge based one. This is not true for Book3s. The DEC keeps firing until mtdec is issued again and thus clears the interrupt line. That's not quite right. The decrementer keeps firing until the top bit is cleared, i.e. with mtdec. However, not *every* mtdec clears it. (Also, I'm pretty sure this varies between Book 3S implementations, e.g. 970 behaves differently than POWERn. I don't remember specific values of n though, and I could be misremembering...) So is this the failure mode? - a decrementer interrupt is delivered - guest does *not* issue mtdec to clear it (ppc64's lazy interrupt disabling?) - guest expects a second decrementer interrupt, but KVM doesn't deliver one In that case, it seems like the real fix would be something like this: void kvmppc_emulate_dec(struct kvm_vcpu *vcpu) { unsigned long dec_nsec; pr_debug(mtDEC: %x\n, vcpu-arch.dec); #ifdef CONFIG_PPC64 /* POWER4+ triggers a dec interrupt if the value is 0 */ if (vcpu-arch.dec 0x8000) { hrtimer_try_to_cancel(vcpu-arch.dec_timer); kvmppc_core_queue_dec(vcpu); + /* keep queuing interrupts until guest clears high MSR bit */ + hrtimer_start(vcpu-arch.dec_timer, ktime_set(0, 100), + HRTIMER_MODE_REL); return; } #endif -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] Improve DEC handling
On Mon, Dec 21, 2009 at 10:13 AM, Hollis Blanchard hol...@penguinppc.org wrote: void kvmppc_emulate_dec(struct kvm_vcpu *vcpu) { unsigned long dec_nsec; pr_debug(mtDEC: %x\n, vcpu-arch.dec); #ifdef CONFIG_PPC64 /* POWER4+ triggers a dec interrupt if the value is 0 */ if (vcpu-arch.dec 0x8000) { hrtimer_try_to_cancel(vcpu-arch.dec_timer); kvmppc_core_queue_dec(vcpu); + /* keep queuing interrupts until guest clears high MSR bit */ + hrtimer_start(vcpu-arch.dec_timer, ktime_set(0, 100), + HRTIMER_MODE_REL); return; } #endif Of course, removing the hardcoded 100-ns timer would be better, and indeed we can do that. What we *really* want is to key off of MSR[EE] changes (there's no point in queuing anything until then). So why not move your AGGRESSIVE_DEC check into Book 3S's kvmppc_set_msr()? -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] Improve DEC handling
For the record, we've discussed more by IRC, and I think revised patches will be forthcoming. -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] Improve Decrementor Implementation v2
On Mon, Dec 21, 2009 at 11:21 AM, Alexander Graf ag...@suse.de wrote: We currently have an ugly hack called AGGRESSIVE_DEC that makes the Decrementor either work well for PPC32 or PPC64 targets. This patchset removes that hack, bringing the decrementor implementation closer to real hardware. V1 - V2: - make DEC clearing code on mtdec book3s specific - rename stop - dequeue Alexander Graf (3): Move vector to irqprio resolving to separate function Improve DEC handling Remove AGGRESSIVE_DEC arch/powerpc/include/asm/kvm_ppc.h | 1 + arch/powerpc/kvm/book3s.c | 45 --- arch/powerpc/kvm/booke.c | 5 arch/powerpc/kvm/emulate.c | 3 ++ 4 files changed, 35 insertions(+), 19 deletions(-) Acked-by: Hollis Blanchard hol...@penguinppc.org -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] Improve DEC handling
On Mon, Dec 21, 2009 at 6:22 AM, Alexander Graf ag...@suse.de wrote: We treated the DEC interrupt like an edge based one. This is not true for Book3s. The DEC keeps firing until mtdec is issued again and thus clears the interrupt line. That's not quite right. The decrementer keeps firing until the top bit is cleared, i.e. with mtdec. However, not *every* mtdec clears it. (Also, I'm pretty sure this varies between Book 3S implementations, e.g. 970 behaves differently than POWERn. I don't remember specific values of n though, and I could be misremembering...) So is this the failure mode? - a decrementer interrupt is delivered - guest does *not* issue mtdec to clear it (ppc64's lazy interrupt disabling?) - guest expects a second decrementer interrupt, but KVM doesn't deliver one In that case, it seems like the real fix would be something like this: void kvmppc_emulate_dec(struct kvm_vcpu *vcpu) { unsigned long dec_nsec; pr_debug(mtDEC: %x\n, vcpu-arch.dec); #ifdef CONFIG_PPC64 /* POWER4+ triggers a dec interrupt if the value is 0 */ if (vcpu-arch.dec 0x8000) { hrtimer_try_to_cancel(vcpu-arch.dec_timer); kvmppc_core_queue_dec(vcpu); + /* keep queuing interrupts until guest clears high MSR bit */ + hrtimer_start(vcpu-arch.dec_timer, ktime_set(0, 100), + HRTIMER_MODE_REL); return; } #endif -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] Improve DEC handling
On Mon, Dec 21, 2009 at 10:13 AM, Hollis Blanchard hol...@penguinppc.org wrote: void kvmppc_emulate_dec(struct kvm_vcpu *vcpu) { unsigned long dec_nsec; pr_debug(mtDEC: %x\n, vcpu-arch.dec); #ifdef CONFIG_PPC64 /* POWER4+ triggers a dec interrupt if the value is 0 */ if (vcpu-arch.dec 0x8000) { hrtimer_try_to_cancel(vcpu-arch.dec_timer); kvmppc_core_queue_dec(vcpu); + /* keep queuing interrupts until guest clears high MSR bit */ + hrtimer_start(vcpu-arch.dec_timer, ktime_set(0, 100), + HRTIMER_MODE_REL); return; } #endif Of course, removing the hardcoded 100-ns timer would be better, and indeed we can do that. What we *really* want is to key off of MSR[EE] changes (there's no point in queuing anything until then). So why not move your AGGRESSIVE_DEC check into Book 3S's kvmppc_set_msr()? -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Change PowerPC KVM maintainer
On Sun, Dec 20, 2009 at 1:24 PM, Alexander Graf ag...@suse.de wrote: Progress on KVM for Embedded PowerPC has stalled, but for Book3S there's quite a lot of work to do and going on. So in agreement with Hollis and Avi, we should switch maintainers for PowerPC. I'll still demand Acks from Hollis for code that changes BookE parts when I can't say for sure if the change is ok. Signed-off-by: Alexander Graf ag...@suse.de Acked-by: Hollis Blanchard hol...@penguinppc.org -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Change PowerPC KVM maintainer
On Sun, Dec 20, 2009 at 1:24 PM, Alexander Graf ag...@suse.de wrote: Progress on KVM for Embedded PowerPC has stalled, but for Book3S there's quite a lot of work to do and going on. So in agreement with Hollis and Avi, we should switch maintainers for PowerPC. I'll still demand Acks from Hollis for code that changes BookE parts when I can't say for sure if the change is ok. Signed-off-by: Alexander Graf ag...@suse.de Acked-by: Hollis Blanchard hol...@penguinppc.org -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM-PPC: Fix mtsrin in book3s_64 mmu
On Sat, Dec 19, 2009 at 9:07 AM, Alexander Graf ag...@suse.de wrote: We were shifting the Ks/Kp/N bits one bit too far on mtsrin. It took me some time to figure that out, so I also put in some debugging and a comment explaining the conversion. This fixes current OpenBIOS boot on PPC64 KVM. Signed-off-by: Alexander Graf ag...@suse.de Acked-by: Hollis Blanchard hol...@penguinppc.org -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM-PPC: Fix mtsrin in book3s_64 mmu
On Sat, Dec 19, 2009 at 9:07 AM, Alexander Graf ag...@suse.de wrote: We were shifting the Ks/Kp/N bits one bit too far on mtsrin. It took me some time to figure that out, so I also put in some debugging and a comment explaining the conversion. This fixes current OpenBIOS boot on PPC64 KVM. Signed-off-by: Alexander Graf ag...@suse.de Acked-by: Hollis Blanchard hol...@penguinppc.org -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5
On Sun, 2009-10-25 at 15:01 +0200, Avi Kivity wrote: On 10/23/2009 02:33 AM, Hollis Blanchard wrote: On Wed, 2009-10-21 at 17:03 +0200, Alexander Graf wrote: KVM for PowerPC only supports embedded cores at the moment. While it makes sense to virtualize on small machines, it's even more fun to do so on big boxes. So I figured we need KVM for PowerPC64 as well. This patchset implements KVM support for Book3s_64 hosts and guest support for Book3s_64 and G3/G4. Acked-by: Hollis Blanchardholl...@us.ibm.com Avi, please apply these patches I still need acks for the arch/powerpc/{kernel,mm} bits, simple as they are, from the powerpc maintainers. OK, BenH says they're on his todo list. In the meantime, please apply patch #2, because it fixes the broken qemu build. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5
On Mon, 2009-10-26 at 18:06 -0500, Olof Johansson wrote: Not sure which patch in the series this is needed for since I applied them all, but I got: CC arch/powerpc/kvm/timing.o arch/powerpc/kvm/timing.c:205: error: 'THIS_MODULE' undeclared here (not in a function) Signed-off-by: Olof Johansson o...@lixom.net diff --git a/arch/powerpc/kvm/timing.c b/arch/powerpc/kvm/timing.c index 2aa371e..7037855 100644 --- a/arch/powerpc/kvm/timing.c +++ b/arch/powerpc/kvm/timing.c @@ -23,6 +23,7 @@ #include linux/seq_file.h #include linux/debugfs.h #include linux/uaccess.h +#include linux/module.h #include asm/time.h #include asm-generic/div64.h For some reason, I'm not seeing this build break, but the patch is obviously correct. Acked-by: Hollis Blanchard holl...@us.ibm.com -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5
On Sun, 2009-10-25 at 15:01 +0200, Avi Kivity wrote: On 10/23/2009 02:33 AM, Hollis Blanchard wrote: On Wed, 2009-10-21 at 17:03 +0200, Alexander Graf wrote: KVM for PowerPC only supports embedded cores at the moment. While it makes sense to virtualize on small machines, it's even more fun to do so on big boxes. So I figured we need KVM for PowerPC64 as well. This patchset implements KVM support for Book3s_64 hosts and guest support for Book3s_64 and G3/G4. Acked-by: Hollis Blanchardholl...@us.ibm.com Avi, please apply these patches I still need acks for the arch/powerpc/{kernel,mm} bits, simple as they are, from the powerpc maintainers. OK, BenH says they're on his todo list. In the meantime, please apply patch #2, because it fixes the broken qemu build. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5
On Mon, 2009-10-26 at 18:06 -0500, Olof Johansson wrote: Not sure which patch in the series this is needed for since I applied them all, but I got: CC arch/powerpc/kvm/timing.o arch/powerpc/kvm/timing.c:205: error: 'THIS_MODULE' undeclared here (not in a function) Signed-off-by: Olof Johansson o...@lixom.net diff --git a/arch/powerpc/kvm/timing.c b/arch/powerpc/kvm/timing.c index 2aa371e..7037855 100644 --- a/arch/powerpc/kvm/timing.c +++ b/arch/powerpc/kvm/timing.c @@ -23,6 +23,7 @@ #include linux/seq_file.h #include linux/debugfs.h #include linux/uaccess.h +#include linux/module.h #include asm/time.h #include asm-generic/div64.h For some reason, I'm not seeing this build break, but the patch is obviously correct. Acked-by: Hollis Blanchard holl...@us.ibm.com -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5
On Wed, 2009-10-21 at 17:03 +0200, Alexander Graf wrote: KVM for PowerPC only supports embedded cores at the moment. While it makes sense to virtualize on small machines, it's even more fun to do so on big boxes. So I figured we need KVM for PowerPC64 as well. This patchset implements KVM support for Book3s_64 hosts and guest support for Book3s_64 and G3/G4. Acked-by: Hollis Blanchard holl...@us.ibm.com Avi, please apply these patches, and one more (unrelated) to fix the Book E build that I will send in just a moment. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5
On Wed, 2009-10-21 at 17:03 +0200, Alexander Graf wrote: KVM for PowerPC only supports embedded cores at the moment. While it makes sense to virtualize on small machines, it's even more fun to do so on big boxes. So I figured we need KVM for PowerPC64 as well. This patchset implements KVM support for Book3s_64 hosts and guest support for Book3s_64 and G3/G4. Acked-by: Hollis Blanchard holl...@us.ibm.com Avi, please apply these patches, and one more (unrelated) to fix the Book E build that I will send in just a moment. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] kvmppc: Fix BUILD_BUG_ON condition
The old BUILD_BUG_ON implementation didn't work with __builtin_constant_p(). Fixing that revealed this test had been inverted for a long time without anybody noticing... Signed-off-by: Hollis Blanchard holl...@us.ibm.com diff --git a/arch/powerpc/kvm/timing.h b/arch/powerpc/kvm/timing.h --- a/arch/powerpc/kvm/timing.h +++ b/arch/powerpc/kvm/timing.h @@ -48,7 +48,7 @@ static inline void kvmppc_set_exit_type( static inline void kvmppc_account_exit_stat(struct kvm_vcpu *vcpu, int type) { /* type has to be known at build time for optimization */ - BUILD_BUG_ON(__builtin_constant_p(type)); + BUILD_BUG_ON(!__builtin_constant_p(type)); switch (type) { case EXT_INTR_EXITS: vcpu-stat.ext_intr_exits++; -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux-next: tree build failure
On Thu, 2009-10-15 at 08:27 +0100, Jan Beulich wrote: Hollis Blanchard holl...@us.ibm.com 15.10.09 00:57 On Fri, 2009-10-09 at 12:14 -0700, Hollis Blanchard wrote: Rusty's version of BUILD_BUG_ON() does indeed fix the build break, and also exposes the bug in kvmppc_account_exit_stat(). So to recap: original: built but didn't work Jan's: doesn't build Rusty's: builds and works Where do you want to go from here? Jan, what are your thoughts? Your BUILD_BUG_ON patch has broken the build, and we still need to fix it. My perspective is that it just uncovered already existing brokenness. And honestly, I won't be able to get to look into this within the next days. (And btw., when I run into issues with other people's code changes, quite frequently I'm told to propose a patch, so I'm also having some philosophical problem understanding why I can't simply expect the same when people run into issues with changes I made, especially in cases like this where it wasn't me introducing the broken code.) So, if this can wait for a couple of days, I can try to find time to look into this. Otherwise, I'd rely on someone running into the actual issue to implement a solution. Sorry, I thought it was clear, but to be more explicit: I propose the following patch, which replaces the current BUILD_BUG_ON implementation with Rusty's version. diff --git a/include/linux/kernel.h b/include/linux/kernel.h --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -677,18 +677,19 @@ struct sysinfo { char _f[20-2*sizeof(long)-sizeof(int)]; /* Padding: libc5 uses this.. */ }; -/* Force a compilation error if condition is true */ -#define BUILD_BUG_ON(condition) ((void)BUILD_BUG_ON_ZERO(condition)) - -/* Force a compilation error if condition is constant and true */ -#define MAYBE_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)])) - -/* Force a compilation error if condition is true, but also produce a - result (of value 0 and type size_t), so the expression can be used - e.g. in a structure initializer (or where-ever else comma expressions - aren't permitted). */ -#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); })) -#define BUILD_BUG_ON_NULL(e) ((void *)sizeof(struct { int:-!!(e); })) +#ifndef __OPTIMIZE__ +#define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)])) +#else +/* If it's a constant, catch it at compile time, otherwise at link time. */ +extern int __build_bug_on_failed; +#define BUILD_BUG_ON_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1) +#define BUILD_BUG_ON(condition) \ + do { \ + ((void)sizeof(char[1 - 2*!!(condition)])); \ + if (condition) __build_bug_on_failed = 1; \ + } while(0) +#define MAYBE_BUILD_BUG_ON(condition) BUILD_BUG_ON(condition) +#endif /* Trap pasters of __FUNCTION__ at compile-time */ #define __FUNCTION__ (__func__) -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux-next: tree build failure
On Tue, 2009-10-20 at 11:42 +1030, Rusty Russell wrote: On Tue, 20 Oct 2009 04:49:29 am Hollis Blanchard wrote: On Thu, 2009-10-15 at 08:27 +0100, Jan Beulich wrote: My perspective is that it just uncovered already existing brokenness. Sorry, I thought it was clear, but to be more explicit: I propose the following patch, which replaces the current BUILD_BUG_ON implementation with Rusty's version. OK, I switched my brain back on. Yeah, I agree: we may still want BUILD_OR_RUNTIME_BUG_ON one day, but I like this. It's just missing the giant comment that it needs :) /** * BUILD_BUG_ON - break compile if a condition is true. * @cond: the condition which the compiler should know is false. * * If you have some code which relies on certain constants being equal, or * other compile-time-evaluated condition, you should use BUILD_BUG_ON to * detect if someone changes it. * * The implementation uses gcc's reluctance to create a negative array, but * gcc (as of 4.4) only emits that error for obvious cases (eg. not arguments * to inline functions). So as a fallback we use the optimizer; if it can't * prove the condition is false, it will cause a link error on the undefined * __build_bug_on_failed. This error is less neat, and can be harder to * track down. */ Do you want to put together a signed-off patch Rusty? It's your code, so I don't feel comfortable doing that. Once we have that, can we remove the mysterious MAYBE_BUILD_BUG_ON statements introduced in previous patches? (Does it BUG or doesn't it??) -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux-next: tree build failure
On Fri, 2009-10-09 at 12:14 -0700, Hollis Blanchard wrote: Rusty's version of BUILD_BUG_ON() does indeed fix the build break, and also exposes the bug in kvmppc_account_exit_stat(). So to recap: original: built but didn't work Jan's: doesn't build Rusty's: builds and works Where do you want to go from here? Jan, what are your thoughts? Your BUILD_BUG_ON patch has broken the build, and we still need to fix it. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux-next: tree build failure
Rusty's version of BUILD_BUG_ON() does indeed fix the build break, and also exposes the bug in kvmppc_account_exit_stat(). So to recap: original: built but didn't work Jan's: doesn't build Rusty's: builds and works Where do you want to go from here? -- Hollis Blanchard IBM Linux Technology Center On Mon, 2009-10-05 at 07:58 +0100, Jan Beulich wrote: Hollis Blanchard holl...@us.ibm.com 02.10.09 17:48 On Wed, 2009-09-30 at 07:35 +0100, Jan Beulich wrote: The one Rusty suggested the other day may help here. I don't like it as a drop-in replacement for BUILD_BUG_ON() though (due to it deferring the error generated to the linking stage), I'd rather view this as an improvement to MAYBE_BUILD_BUG_ON() (which should then be used here). Can you be more specific? I have no idea what Rusty suggested where. I can't even guess what I'm attaching Rusty's response I was referring to. MAYBE_BUILD_BUG_ON() is supposed to do (sounds like a terrible name). Agreed - but presumably better than just deleting the bogus instances altogether... Jan email message attachment Forwarded Message From: Rusty Russell ru...@rustcorp.com.au To: Jan Beulich jbeul...@novell.com Cc: linux-ker...@vger.kernel.org Subject: Re: [PATCH] fix BUILD_BUG_ON() and a couple of bogus uses of it Date: Wed, 23 Sep 2009 10:27:00 +0930 On Wed, 19 Aug 2009 01:29:25 am Jan Beulich wrote: gcc permitting variable length arrays makes the current construct used for BUILD_BUG_ON() useless, as that doesn't produce any diagnostic if the controlling expression isn't really constant. Instead, this patch makes it so that a bit field gets used here. Consequently, those uses where the condition isn't really constant now also need fixing. Note that in the gfp.h, kmemcheck.h, and virtio_config.h cases MAYBE_BUILD_BUG_ON() really just serves documentation purposes - even if the expression is compile time constant (__builtin_constant_p() yields true), the array is still deemed of variable length by gcc, and hence the whole expression doesn't have the intended effect. Signed-off-by: Jan Beulich jbeul...@novell.com We used to use an undefined symbol here; diagnostics are worse but it catches more stuff. Perhaps a hybrid is the way to go? #ifndef __OPTIMIZE__ #define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)])) #else /* If it's a constant, catch it at compile time, otherwise at link time. */ extern int __build_bug_on_failed; #define BUILD_BUG_ON(condition) \ do {\ ((void)sizeof(char[1 - 2*!!(condition)])); \ if (condition) __build_bug_on_failed = 1; \ } while(0) #endif Thanks, Rusty. -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 15/27] Add mfdec emulation
On Tue, 2009-09-29 at 10:18 +0200, Alexander Graf wrote: We support setting the DEC to a certain value right now. Doing that basically triggers the CPU local timer. But there's also an mfdec command that enabled the OS to read the decrementor. This is required at least by all desktop and server PowerPC Linux kernels. It can't really hurt to allow embedded ones to do it as well though. Signed-off-by: Alexander Graf ag...@suse.de --- arch/powerpc/kvm/emulate.c | 13 - 1 files changed, 12 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c index 7737146..50d411d 100644 --- a/arch/powerpc/kvm/emulate.c +++ b/arch/powerpc/kvm/emulate.c @@ -66,12 +66,14 @@ void kvmppc_emulate_dec(struct kvm_vcpu *vcpu) { + unsigned long nr_jiffies; + if (vcpu-arch.tcr TCR_DIE) { /* The decrementer ticks at the same rate as the timebase, so * that's how we convert the guest DEC value to the number of * host ticks. */ - unsigned long nr_jiffies; + vcpu-arch.dec_jiffies = mftb(); nr_jiffies = vcpu-arch.dec / tb_ticks_per_jiffy; mod_timer(vcpu-arch.dec_timer, get_jiffies_64() + nr_jiffies); @@ -211,6 +213,15 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu) /* Note: SPRG4-7 are user-readable, so we don't get * a trap. */ + case SPRN_DEC: + { + u64 jd = mftb() - vcpu-arch.dec_jiffies; + vcpu-arch.gpr[rt] = vcpu-arch.dec - jd; +#ifdef DEBUG_EMUL + printk(KERN_INFO mfDEC: %x - %llx = %lx\n, vcpu-arch.dec, jd, vcpu-arch.gpr[rt]); +#endif + break; + } default: emulated = kvmppc_core_emulate_mfspr(vcpu, sprn, rt); if (emulated == EMULATE_FAIL) { mftb() doesn't exist for ppc32, so we'll need to use the get_tb() wrapper instead. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/27] Pass PVR in sregs
On Tue, 2009-09-29 at 10:17 +0200, Alexander Graf wrote: Right now sregs is unused on PPC, so we can use it for initialization of the CPU. KVM on BookE always virtualizes the host CPU. On Book3s we go a step further and take the PVR from userspace that tells us what kind of CPU we are supposed to virtualize, because we support Book3s_32 and Book3s_64 guests. In order to get that information, we use the sregs ioctl, because we don't want to reset the guest CPU on every normal register set. Signed-off-by: Alexander Graf ag...@suse.de --- arch/powerpc/include/asm/kvm.h |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index bb2de6a..b82bd68 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -46,6 +46,8 @@ struct kvm_regs { }; struct kvm_sregs { + __u64 pvr; + char pad[1016]; }; struct kvm_fpu { Architecturally, PVR is 32 bits, even for PPC64. Is there a reason you want it to be 64 bits here? (I can understand just picking 64 for registers that could be either size, but that's not this case.) -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 25/27] Fix trace.h
On Tue, 2009-09-29 at 10:18 +0200, Alexander Graf wrote: It looks like the variable pc is defined. At least the current code always failed on me stating that pc is already defined somewhere else. Let's use _pc instead, because that doesn't collide. Is this the right approach? Does it break on 440 too? If not, why not? Signed-off-by: Alexander Graf ag...@suse.de --- arch/powerpc/kvm/trace.h |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kvm/trace.h b/arch/powerpc/kvm/trace.h index 67f219d..a8e8400 100644 --- a/arch/powerpc/kvm/trace.h +++ b/arch/powerpc/kvm/trace.h @@ -12,8 +12,8 @@ * Tracepoint for guest mode entry. */ TRACE_EVENT(kvm_ppc_instr, - TP_PROTO(unsigned int inst, unsigned long pc, unsigned int emulate), - TP_ARGS(inst, pc, emulate), + TP_PROTO(unsigned int inst, unsigned long _pc, unsigned int emulate), + TP_ARGS(inst, _pc, emulate), TP_STRUCT__entry( __field(unsigned int, inst) @@ -23,7 +23,7 @@ TRACE_EVENT(kvm_ppc_instr, TP_fast_assign( __entry-inst = inst; - __entry-pc = pc; + __entry-pc = _pc; __entry-emulate= emulate; ), After much digging, I managed to actually enable CONFIG_TRACEPOINTS. However, I still don't get any build errors from this code. Maybe you could paste the full gcc output? -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Use Little Endian for Dirty Bitmap
On Mon, 2009-07-27 at 12:38 +0200, Alexander Graf wrote: We currently use host endian long types to store information in the dirty bitmap. This works reasonably well on Little Endian targets, because the u32 after the first contains the next 32 bits. On Big Endian this breaks completely though, forcing us to be inventive here. So Ben suggested to always use Little Endian, which looks reasonable. We only have dirty bitmap implemented in Little Endian targets so far and since PowerPC would be the first Big Endian platform, we can just as well switch to Little Endian always with little effort without breaking existing targets. Signed-off-by: Alexander Graf ag...@suse.de --- virt/kvm/kvm_main.c |5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 17d8688..3482ad1 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -48,6 +48,7 @@ #include asm/io.h #include asm/uaccess.h #include asm/pgtable.h +#include asm-generic/bitops/le.h #ifdef KVM_COALESCED_MMIO_PAGE_OFFSET #include coalesced_mmio.h @@ -1656,8 +1657,8 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn) unsigned long rel_gfn = gfn - memslot-base_gfn; /* avoid RMW */ - if (!test_bit(rel_gfn, memslot-dirty_bitmap)) - set_bit(rel_gfn, memslot-dirty_bitmap); + if (!generic_test_le_bit(rel_gfn, memslot-dirty_bitmap)) + generic___set_le_bit(rel_gfn, memslot-dirty_bitmap); } } I don't think I've ever exercised the dirty bitmap code, and I don't really have an opinion. Avi? -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Use Little Endian for Dirty Bitmap
On Mon, 2009-07-27 at 12:38 +0200, Alexander Graf wrote: We currently use host endian long types to store information in the dirty bitmap. This works reasonably well on Little Endian targets, because the u32 after the first contains the next 32 bits. On Big Endian this breaks completely though, forcing us to be inventive here. So Ben suggested to always use Little Endian, which looks reasonable. We only have dirty bitmap implemented in Little Endian targets so far and since PowerPC would be the first Big Endian platform, we can just as well switch to Little Endian always with little effort without breaking existing targets. Signed-off-by: Alexander Graf ag...@suse.de --- virt/kvm/kvm_main.c |5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 17d8688..3482ad1 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -48,6 +48,7 @@ #include asm/io.h #include asm/uaccess.h #include asm/pgtable.h +#include asm-generic/bitops/le.h #ifdef KVM_COALESCED_MMIO_PAGE_OFFSET #include coalesced_mmio.h @@ -1656,8 +1657,8 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn) unsigned long rel_gfn = gfn - memslot-base_gfn; /* avoid RMW */ - if (!test_bit(rel_gfn, memslot-dirty_bitmap)) - set_bit(rel_gfn, memslot-dirty_bitmap); + if (!generic_test_le_bit(rel_gfn, memslot-dirty_bitmap)) + generic___set_le_bit(rel_gfn, memslot-dirty_bitmap); } } I don't think I've ever exercised the dirty bitmap code, and I don't really have an opinion. Avi? -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/23] Pass PVR in sregs
Hi Avi, would you apply this patch? Looks like the corresponding qemu patch went in a while ago, so the qemu build has been broken for some time. Signed-off-by: Hollis Blanchard holl...@us.ibm.com -- Hollis Blanchard IBM Linux Technology Center On Thu, 2009-07-16 at 15:29 +0200, Alexander Graf wrote: Right now sregs is unused on PPC, so we can use it for initialization of the CPU. KVM on BookE always virtualizes the host CPU. On Book3s we go a step further and take the PVR from userspace that tells us what kind of CPU we are supposed to virtualize, because we support Book3s_32 and Book3s_64 guests. In order to get that information, we use the sregs ioctl, because we don't want to reset the guest CPU on every normal register set. Signed-off-by: Alexander Graf ag...@suse.de --- arch/powerpc/include/asm/kvm.h |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index bb2de6a..b82bd68 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -46,6 +46,8 @@ struct kvm_regs { }; struct kvm_sregs { + __u64 pvr; + char pad[1016]; }; struct kvm_fpu { -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] Add KVM support for Book3s_64 (PPC64) hosts v3
On Wed, 2009-09-02 at 15:40 +1000, Benjamin Herrenschmidt wrote: On Fri, 2009-07-24 at 18:30 +0200, Alexander Graf wrote: KVM for PowerPC only supports embedded cores at the moment. While it makes sense to virtualize on small machines, it's even more fun to do so on big boxes. So I figured we need KVM for PowerPC64 as well. This patchset implements KVM support for Book3s_64 hosts and guest support for Book3s_64 and G3/G4. To really make use of this, you will also need a modified version of qemu that can deal with KVM on desktop cores. I will send out patches for those later, but want to get feedback on the kernel side first. In the meanwhile, use the qemu version from http://www.powerkvm.org/powerkvm.git which already includes all required patches to run G3/G4 and G5 guests. The git pull request is good when the series is good to pull or for testing, but for review patches are nice :-) Yeah, I actually went looking for the v3 patchset last week, and gave up trying to extract it from git... -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/5] kvmppc: Translate eaddr for fsl_booke mmu
On Tue, 2009-08-04 at 17:36 +0800, Liu Yu wrote: Signed-off-by: Liu Yu yu@freescale.com --- target-ppc/helper.c | 17 +++-- 1 files changed, 15 insertions(+), 2 deletions(-) diff --git a/target-ppc/helper.c b/target-ppc/helper.c index 6eca2e5..07e56a4 100644 --- a/target-ppc/helper.c +++ b/target-ppc/helper.c @@ -22,6 +22,7 @@ #include string.h #include inttypes.h #include signal.h +#include linux/kvm.h #include cpu.h #include exec-all.h @@ -1325,8 +1326,20 @@ static always_inline int check_physical (CPUState *env, mmu_ctx_t *ctx, cpu_abort(env, MPC8xx MMU model is not implemented\n); break; case POWERPC_MMU_BOOKE_FSL: -/* XXX: TODO */ -cpu_abort(env, BookE FSL MMU model not implemented\n); +if (kvm_enabled()) { +struct kvm_translation tr; + +/* For now we only debug guest kernel */ +tr.linear_address = eaddr; +ret = kvm_vcpu_ioctl(env, KVM_TRANSLATE, tr); +if (ret 0) +return ret; + +ctx-raddr = tr.physical_address; +} else { +/* XXX: TODO */ +cpu_abort(env, BookE FSL MMU model not implemented\n); +} break; default: cpu_abort(env, Unknown or invalid MMU model\n); One objection: the comment is a little obscure. I think what you're really saying is in Linux guests, kernel addresses should always be covered by TLB1, which means for those addresses we can expect this ioctl to succeed. However, since you need to handle failures anyways, I think you should remove the comment entirely. Second, (and this isn't an objection but rather a question) do you have any better ideas for struct kvm_translation? It only really makes sense for x86. We don't need to stick with it. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: kvm BookE and SPRGs
On Fri, 2009-07-10 at 19:17 +1000, Benjamin Herrenschmidt wrote: On Fri, 2009-07-10 at 17:15 +0800, Liu Yu-B13201 wrote: Sounds reasonable. There are some old patchset which implemented the binary patch as Ben described. http://marc.info/?l=kvm-ppcm=122154653905212w=2 http://marc.info/?l=kvm-ppcm=122154657905306w=2 Interesting. Any reason why that wasn't merged ? Basically because we ran out of manpower to maintain it. We didn't want to push PV changes into upstream Linux, useful only to us, and then disappear. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: kvm BookE and SPRGs
On Fri, 2009-07-10 at 18:10 +1000, Benjamin Herrenschmidt wrote: On Fri, 2009-07-10 at 16:31 +1000, Benjamin Herrenschmidt wrote: I was roaming through kernel usage of SPRGs and noticed a small detail in kvmppc for BookE ... any reason why in OP_31_XOP_MTSPR, you open coded the emulation of SPRG0..3, but 4...7 are handled in kvmppc_core_emulate_mtspr() ? It occurs to me that in fact for both MTSPR and MFSPR, the code should be moved into kvmppc_core_emulate_mtspr() and kvmppc_core_emulate_mfspr() for consistency. Also, from looking at the FSL BookE code, it seems that there is such a thing as SPRG9 (and so I suppose there must be an SPRG8 somewhere too), shouldn't we handle it too ? BTW. That leads me to another question (CC'ing Avi there too), which is what is the policy vs. para-virtualization ? IE. Are we ok with adding paravirt tricks to speed things up ? Yes, that's fine. We would rather not *require* paravirtualization though. A prime example I have in mind that could possibly help a lot here is to have a shared page mapped at -4K (at the top of the address space) when the guest is in supervisor mode only that hosts part of the current VCPU supervisor register state. ... The cost of course is an additional TLB entry for mapping that -4K page (but only when running guest kernel code). It was a net win when Christian implemented it last year. While the first access may miss in the TLB, these register accesses tend to come in bunches (i.e. the guest interrupt vectors). -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 01/23] Pass PVR in sregs
On Wed, 2009-07-08 at 10:28 +0800, Liu Yu-B13201 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Wednesday, July 08, 2009 7:23 AM To: Hollis Blanchard Cc: Avi Kivity; kvm-ppc@vger.kernel.org; a...@arndb.de; kw...@redhat.com Subject: Re: [PATCH 01/23] Pass PVR in sregs On 08.07.2009, at 00:50, Hollis Blanchard wrote: The PVR register is basically the equivalent of cpuid. It might make more sense to exit to qemu to handle those accesses. Today, PVR reads are emulated in-kernel. I actually use it in 970.c to find out which guest MMU to choose from. So even if we were to do it in userspace, we'd still need the information what CPU to create in the guest on the kernel side. Which in turn means we don't win anything from leaving the PVR emulation to userspace. Hmm, I don't remember where arch-vcpu.pvr is being set at all for 440 and e500... It used to be in some creation code - either general kvm or vcpu. But some recent patch removed vcpu-arch.pvr and made emulate.c do an mfspr(SPRN_PVR). Yes. Since the demand to emulate PVR for 440 and e500 is still vague. Directly return the real value can simplify the code, and latter patches can easily change it. The same thing goes for PIR. By the way, I don't like that PVR patch you sent to Avi (and he committed). It's my own fault for not reading more closely, but in the future could you send me patches that touch code which isn't e500-specific? -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/23] Pass PVR in sregs
On Tue, 2009-07-07 at 18:40 +0300, Avi Kivity wrote: On 07/07/2009 05:17 PM, Alexander Graf wrote: Right now sregs is unused on PPC, so we can use it for initialization of the CPU. KVM on BookE always virtualizes the host CPU. On PPC64 we go a step further and take the PVR from userspace that tells us what kind of CPU we are supposed to virtualize, because we support PPC32 and PPC64 guests. In order to get that information, we use the sregs ioctl, because we don't want to reset the guest CPU on every normal register set. Signed-off-by: Alexander Grafag...@suse.de --- arch/powerpc/include/asm/kvm.h |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index bb2de6a..96b02cd 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -46,6 +46,7 @@ struct kvm_regs { }; struct kvm_sregs { + __u64 pvr; }; You can only do that if existing userspace never calls KVM_SET_SREGS. Hollis? It doesn't. Also, make sure to reserve a bunch of space in there so you if you forget something you can add it later. Agreed. The PVR register is basically the equivalent of cpuid. It might make more sense to exit to qemu to handle those accesses. Today, PVR reads are emulated in-kernel. Hmm, I don't remember where arch-vcpu.pvr is being set at all for 440 and e500... -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] remove ppc functions from callbacks
On Tue, 2009-06-16 at 09:34 -0400, Glauber Costa wrote: handle_powerpc_dcr_read() and handle_powerpc_dcr_write() are two powerpc specific functions that are called via libkvm callbacks. However, grepping the source code finds simply no use of them. This is probably due to the fact that powerpc now relies on a totally qemu upstream implementation of kvm, and does not need it anymore. Anyway, I'm providing this patch separatedly, so that if it breaks for whenever reason, we can identify a bisection point easily Signed-off-by: Glauber Costa glom...@redhat.com CC: Hollis Blanchard holl...@us.ibm.com Yup, this path is handled via kvm_arch_handle_exit() now. Thanks. Acked-by: Hollis Blanchard holl...@us.ibm.com -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] Introduce kvm_vcpu_is_bsp() function.
On Mon, 2009-06-08 at 14:20 +0300, Avi Kivity wrote: Gleb Natapov wrote: @@ -130,6 +130,7 @@ struct kvm { int nmemslots; struct kvm_memory_slot memslots[KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS]; + struct kvm_vcpu *bsp_vcpu; struct kvm_vcpu *vcpus[KVM_MAX_VCPUS]; struct list_head vm_list; struct kvm_io_bus mmio_bus; The concept of bsp (boot processor) is limited IIUC to x86 and ia64 That is true. please wrap with HAVE_KVM_IRQCHIP (which is a close approximation). I don't know about that... I've definitely thought about implementing an in-kernel PIC for PowerPC. (That will make more sense as an optimization once the processors with hypervisor support start hitting the market.) -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: ppc: beyond ARRAY_SIZE of vcpu_44x-guest_tlb
On Wed, 2009-05-20 at 00:50 +0200, Roel Kluin wrote: Do not go beyond ARRAY_SIZE of vcpu_44x-guest_tlb Signed-off-by: Roel Kluin roel.kl...@gmail.com --- diff --git a/arch/powerpc/kvm/44x_tlb.c b/arch/powerpc/kvm/44x_tlb.c index 4a16f47..6093289 100644 --- a/arch/powerpc/kvm/44x_tlb.c +++ b/arch/powerpc/kvm/44x_tlb.c @@ -439,7 +439,7 @@ int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 rs, u8 ws) unsigned int gtlb_index; gtlb_index = vcpu-arch.gpr[ra]; - if (gtlb_index KVM44x_GUEST_TLB_SIZE) { + if (gtlb_index = KVM44x_GUEST_TLB_SIZE) { printk(%s: index %d\n, __func__, gtlb_index); kvmppc_dump_vcpu(vcpu); return EMULATE_FAIL; Acked-by: Hollis Blanchard holl...@us.ibm.com Thanks. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: PowerPC page faults
On Monday 11 May 2009 17:17:53 Anthony Liguori wrote: Hollis Blanchard wrote: On Mon, 2009-05-11 at 12:54 -0500, Anthony Liguori wrote: For future ppcemb's, do you know if there is an equivalent of a PF exit type? Does the hardware squirrel away the faulting address somewhere and set PC to the start of the instruction? If so, no guest memory load should be required. Ahhh... you're saying that the address itself (or offset within a page) is the hypercall token, totally separate from IO emulation, and so we could ignore the access size. No, I'm not being nearly that clever. I started thinking you weren't, but then I realized you must be working several levels above me and just forgot to explain... :) I was suggesting that hardware virtualization support in future PPC systems might contain a mechanism to intercept a guest-mode TLB miss. If it did, it would be useful if that guest-mode TLB miss exit contained extra information somewhere that included the PC of the faulting instruction, the address response for the fault, and enough information to handle the fault without instruction decoding. I assume all MMIO comes from the same set of instructions in PPC? Something like ld/st instructions? Presumably all you need to know from instruction decoding is the destination register and whether it was a read or write? In addition to register source/target, we also need to know the access size of the memory reference. That information isn't stuffed into registers for us by hardware, and it's not in published specifications for future hardware. Now, if you wanted to define a hypercall as a byte access within a particular 4K page, where the register source/target is ignored, that could be interesting, but I don't know if that's relevant to this hypercall vs MMIO discussion. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 0/3] generic hypercall support
On Mon, 2009-05-11 at 09:14 -0400, Gregory Haskins wrote: for request-response, this is generally for *every* packet since you cannot exploit buffering/deferring. Can you back up your claim that PPC has no difference in performance with an MMIO exit and a hypercall (yes, I understand PPC has no VT like instructions, but clearly there are ways to cause a trap, so presumably we can measure the difference between a PF exit and something more explicit). First, the PPC that KVM supports performs very poorly relatively speaking because it receives no hardware assistance So wouldn't that be making the case that it could use as much help as possible? I think he's referencing Ahmdal's Law here. While I'd agree, this is relevant only for the current KVM PowerPC implementations. I think it would be short-sighted to design an IO architecture around that. this is not the right place to focus wrt optimizations. Odd choice of words. I am advocating the opposite (broad solution to many arches and many platforms (i.e. hypervisors)) and therefore I am not focused on it (or really any one arch) at all per se. I am _worried_ however, that we could be overlooking PPC (as an example) if we ignore the disparity between MMIO and HC since other higher performance options are not available like PIO. The goal on this particular thread is to come up with an IO interface that works reasonably well across as many hypervisors as possible. MMIO/PIO do not appear to fit that bill (at least not without tunneling them over HCs) I haven't been following this conversation at all. With that in mind... AFAICS, a hypercall is clearly the higher-performing option, since you don't need the additional memory load (which could even cause a page fault in some circumstances) and instruction decode. That said, I'm willing to agree that this overhead is probably negligible compared to the IOp itself... Ahmdal's Law again. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 0/3] generic hypercall support
On Sun, 2009-05-10 at 13:38 -0500, Anthony Liguori wrote: Gregory Haskins wrote: Can you back up your claim that PPC has no difference in performance with an MMIO exit and a hypercall (yes, I understand PPC has no VT like instructions, but clearly there are ways to cause a trap, so presumably we can measure the difference between a PF exit and something more explicit). First, the PPC that KVM supports performs very poorly relatively speaking because it receives no hardware assistance this is not the right place to focus wrt optimizations. And because there's no hardware assistance, there simply isn't a hypercall instruction. Are PFs the fastest type of exits? Probably not but I honestly have no idea. I'm sure Hollis does though. Memory load from the guest context (for instruction decoding) is a *very* poorly performing path on most PowerPC, even considering server PowerPC with hardware virtualization support. No, I don't have any data for you, but switching the hardware MMU contexts requires some heavyweight synchronization instructions. Page faults are going to have tremendously different performance characteristics on PPC too because it's a software managed TLB. There's no page table lookup like there is on x86. To clarify, software-managed TLBs are only found in embedded PowerPC. Server and classic PowerPC use hash tables, which are a third MMU type. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: PowerPC page faults
On Mon, 2009-05-11 at 12:54 -0500, Anthony Liguori wrote: For future ppcemb's, do you know if there is an equivalent of a PF exit type? Does the hardware squirrel away the faulting address somewhere and set PC to the start of the instruction? If so, no guest memory load should be required. Ahhh... you're saying that the address itself (or offset within a page) is the hypercall token, totally separate from IO emulation, and so we could ignore the access size. I guess it looks like this: page fault vector: if (faulting_address PAGE_MASK) == vcpu-hcall_page handle_hcall(faulting_address ~PAGE_MASK) else if (faulting_address is IO) emulate_io(faulting_address) else handle_pagefault(faulting_address) Testing for hypercalls in the page fault handler path would add some overhead, and on processors with software-managed TLBs, the page fault path is *very* hot. Implementing the above pseudocode wouldn't be ideal, especially because Power processors with hardware virtualization support have a separate vector for hypercalls. However, I suspect it wouldn't be a show-stopper from a performance point of view. Note that other Power virtualization solutions (hypervisors from IBM, Sony, and Toshiba) use the dedicated hypercall instruction and interrupt vector, which after all is how the hardware was designed. To my knowledge, they also don't do IO emulation, so they avoid both conditionals in the above psuedocode. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 1/3] add generic hypercall support
On Tue, 2009-05-05 at 09:24 -0400, Gregory Haskins wrote: *) PIO is more direct than MMIO, but it poses other problems such as: a) can have a small limited address space (x86 is 2^16) b) is a narrow-band interface (one 8, 16, 32, 64 bit word at a time) c) not available on all archs (PCI mentions ppc as problematic) and is therefore recommended to avoid. Side note: I don't know what PCI has to do with this, and problematic isn't the word I would use. ;) As far as I know, x86 is the only still-alive architecture that implements instructions for a separate IO space (not even ia64 does). -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] Revert Sync idcache after emualted DMA operations foria64
On Mon, 2009-05-04 at 09:39 +0800, Zhang, Xiantao wrote: Could you explain why this patch breaks the powerpc build? qemu_sync_icache has the definition for non-ai64 case, so shoudn't break any arch-specific build. cutils.o: In function `qemu_iovec_from_buffer': /home/hollisb/source/qemu-kvm.git/cutils.c:175: undefined reference to `qemu_cache_conf' /home/hollisb/source/qemu-kvm.git/cutils.c:171: undefined reference to `qemu_cache_conf' cutils.o: In function `qemu_iovec_from_buffer': /home/hollisb/source/qemu-kvm.git/cache-utils.h:18: undefined reference to `qemu_cache_conf' However, to restate my point: the build error is not the biggest problem with this patch. The bigger problems are all the other issues I've repeatedly described. The build break is the icing on the cake. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: qemu-kvm.git now live
On Sat, 2009-05-02 at 10:52 +0300, Avi Kivity wrote: Hollis Blanchard wrote: In that case it's sufficient to have the build system use the upstream kvm integration (CONFIG_KVM) rather than the qemu-kvm integration (USE_KVM). OK, I give up... how is this supposed to work? Nobody ever sets CONFIG_KVM or KVM_UPSTREAM, but there are a couple tests for it. Glauber once sent a patch related to that, but I don't see how it helps. KVM_UPSTREAM is just a marker to let us know which parts of upstream qemu/kvm integration conflict with qemu-kvm.git. OK, so where do I define KVM_UPSTREAM? Also, where do I define CONFIG_KVM? I would expect the configure script to do that, but apparently it does not. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 04/04] qemu-kvm: other archs should maintain memory mappingalso.
On Mon, 2009-05-04 at 11:25 +0200, Jes Sorensen wrote: Avi Kivity wrote: Jes Sorensen wrote: +int destroy_region_works = 0; Global name, prefix with kvm_. Does it actually need to be global? Gone, now local to qemu-kvm-x86.c. I moved the initializer into kvm_arch_create_context() instead. The header depends on target_phys_addr_t, so it must include whatever defines it. Added an #include cpu-all.h which defines it. Missing other archs... Instead of duplicating this for every arch, you can have a #define that tells you if you want non-trivial arch definitions, and supply the trivial definitions in qemu-kvm.h. Done, I also added a PPC header file - which may or may not be wanted at this point. You can just cut it out if you don't think it should be added. I don't understand the code being moved, but I guess I don't want it, so your patch is fine with me. (Wtf are those magic addresses? And not a single comment?? Aren't we better than this?) -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 04/04] qemu-kvm: other archs should maintain memory mappingalso.
On Mon, 2009-05-04 at 11:25 +0200, Jes Sorensen wrote: Avi Kivity wrote: Jes Sorensen wrote: +int destroy_region_works = 0; Global name, prefix with kvm_. Does it actually need to be global? Gone, now local to qemu-kvm-x86.c. I moved the initializer into kvm_arch_create_context() instead. The header depends on target_phys_addr_t, so it must include whatever defines it. Added an #include cpu-all.h which defines it. Missing other archs... Instead of duplicating this for every arch, you can have a #define that tells you if you want non-trivial arch definitions, and supply the trivial definitions in qemu-kvm.h. Done, I also added a PPC header file - which may or may not be wanted at this point. You can just cut it out if you don't think it should be added. I don't understand the code being moved, but I guess I don't want it, so your patch is fine with me. (Wtf are those magic addresses? And not a single comment?? Aren't we better than this?) -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Revert Sync idcache after emualted DMA operations for ia64
This reverts commit 9dc99a28236161a5a1b4c58f1e9c4ec6179cb976. Aside from the other issues discussed on kvm-devel, this commit breaks the PowerPC build. Signed-off-by: Hollis Blanchard holl...@us.ibm.com --- Bad mailing list address on my previous mail. cache-utils.h | 14 -- cutils.c |3 --- dma-helpers.c | 12 exec.c|3 --- 4 files changed, 0 insertions(+), 32 deletions(-) diff --git a/cache-utils.h b/cache-utils.h index 2a07fbd..b45fde4 100644 --- a/cache-utils.h +++ b/cache-utils.h @@ -33,22 +33,8 @@ static inline void flush_icache_range(unsigned long start, unsigned long stop) asm volatile (sync : : : memory); asm volatile (isync : : : memory); } -#define qemu_sync_idcache flush_icache_range -#else -#ifdef __ia64__ -static inline void qemu_sync_idcache(unsigned long start, unsigned long stop) -{ -while (start stop) { - asm volatile (fc %0 :: r(start)); - start += 32; -} -asm volatile (;;sync.i;;srlz.i;;); -} #else -static inline void qemu_sync_idcache(unsigned long start, unsigned long stop) {} -#endif - #define qemu_cache_utils_init(envp) do { (void) (envp); } while (0) #endif diff --git a/cutils.c b/cutils.c index 4bf4fcd..5b36cc6 100644 --- a/cutils.c +++ b/cutils.c @@ -23,7 +23,6 @@ */ #include qemu-common.h #include host-utils.h -#include cache-utils.h #include assert.h void pstrcpy(char *buf, int buf_size, const char *str) @@ -216,8 +215,6 @@ void qemu_iovec_from_buffer(QEMUIOVector *qiov, const void *buf, size_t count) if (copy qiov-iov[i].iov_len) copy = qiov-iov[i].iov_len; memcpy(qiov-iov[i].iov_base, p, copy); -qemu_sync_idcache((unsigned long)qiov-iov[i].iov_base, - (unsigned long)(qiov-iov[i].iov_base + copy)); p += copy; count -= copy; } diff --git a/dma-helpers.c b/dma-helpers.c index f71ca3b..f9eb224 100644 --- a/dma-helpers.c +++ b/dma-helpers.c @@ -9,7 +9,6 @@ #include dma.h #include block_int.h -#include cache-utils.h static AIOPool dma_aio_pool; @@ -138,8 +137,6 @@ static BlockDriverAIOCB *dma_bdrv_io( BlockDriverCompletionFunc *cb, void *opaque, int is_write) { -int i; -QEMUIOVector *qiov; DMAAIOCB *dbs = qemu_aio_get_pool(dma_aio_pool, bs, cb, opaque); dbs-acb = NULL; @@ -152,15 +149,6 @@ static BlockDriverAIOCB *dma_bdrv_io( dbs-bh = NULL; qemu_iovec_init(dbs-iov, sg-nsg); dma_bdrv_cb(dbs, 0); - -if (!is_write) { -qiov = dbs-iov; -for (i = 0; i qiov-niov; ++i) { - qemu_sync_idcache((unsigned long)qiov-iov[i].iov_base, - (unsigned long)(qiov-iov[i].iov_base + qiov-iov[i].iov_len)); - } -} - if (!dbs-acb) { qemu_aio_release(dbs); return NULL; diff --git a/exec.c b/exec.c index 16d3cf8..262c72f 100644 --- a/exec.c +++ b/exec.c @@ -3400,9 +3400,6 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len, addr1 += l; access_len -= l; } -if (kvm_enabled()) - flush_icache_range((unsigned long)buffer, - (unsigned long)buffer + access_len); } return; } -- 1.6.0.6 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: qemu-kvm.git now live
On Wed, 2009-04-29 at 11:29 +0300, Avi Kivity wrote: * configure completely ignores --kerneldir and only uses kvm/kernel headers instead. That's intentional. Huh? If --kerneldir does nothing, why does it exist? * The headers in kvm/kernel/arch/foo seem to be the important ones, but they have odd ifdefs at the top and I'm not sure how they should be generated. They were generated by the old 'make sync' to remove CONFIG_ dependencies. I guess a better way to generate them is a 'make headers-install' from the kernel tree and grab the results. 'make headers_install' only produces include/asm/kvm.h, which apparently is not sufficient. I can use output from an old 'make sync' for now, but obviously that doesn't help with future changes to these headers. Defining a process for updating those headers would be useful. I'll send patches separately. -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: qemu-kvm.git now live
On Wed, 2009-04-29 at 11:31 +0300, Avi Kivity wrote: Hollis Blanchard wrote: Since PPC is now supported in upstream QEMU, does it really matter if it works in qemu-kvm.git? I was going to take that position too, except Avi asked me specifically if some of the code inside TARGET_I386 ifdefs in qemu-kvm.c works for other architectures. In that case it's sufficient to have the build system use the upstream kvm integration (CONFIG_KVM) rather than the qemu-kvm integration (USE_KVM). OK, I give up... how is this supposed to work? Nobody ever sets CONFIG_KVM or KVM_UPSTREAM, but there are a couple tests for it. Glauber once sent a patch related to that, but I don't see how it helps. For reference, the actual error is about a hundred instances of e.g. /home/hollisb/source/qemu-kvm.git/vl.c:3393: undefined reference to `kvm_allowed' -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/5] [libkvm] Rename config-powerpc to config-ppc
Apparently $(ARCH) now holds the qemu meaning, rather than the KVM meaning. Signed-off-by: Hollis Blanchard holl...@us.ibm.com --- kvm/libkvm/config-powerpc.mak |4 kvm/libkvm/config-ppc.mak |4 2 files changed, 4 insertions(+), 4 deletions(-) delete mode 100644 kvm/libkvm/config-powerpc.mak create mode 100644 kvm/libkvm/config-ppc.mak diff --git a/kvm/libkvm/config-powerpc.mak b/kvm/libkvm/config-powerpc.mak deleted file mode 100644 index 091da37..000 --- a/kvm/libkvm/config-powerpc.mak +++ /dev/null @@ -1,4 +0,0 @@ - -LIBDIR := /lib - -libkvm-$(ARCH)-objs := libkvm-powerpc.o diff --git a/kvm/libkvm/config-ppc.mak b/kvm/libkvm/config-ppc.mak new file mode 100644 index 000..091da37 --- /dev/null +++ b/kvm/libkvm/config-ppc.mak @@ -0,0 +1,4 @@ + +LIBDIR := /lib + +libkvm-$(ARCH)-objs := libkvm-powerpc.o -- 1.6.0.6 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
PPC support for qemu-kvm
These patches fix a number of issues with PowerPC builds of qemu-kvm.git. However, even after applying these patches it still doesn't build, due to confusion with KVM_UPSTREAM and CONFIG_KVM. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/5] [qemu-kvm] Fix warning when__ia64__ is not defined.
Signed-off-by: Hollis Blanchard holl...@us.ibm.com --- kvm/libkvm/kvm-common.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/kvm/libkvm/kvm-common.h b/kvm/libkvm/kvm-common.h index 96361e8..591fb53 100644 --- a/kvm/libkvm/kvm-common.h +++ b/kvm/libkvm/kvm-common.h @@ -22,7 +22,7 @@ #define KVM_MAX_NUM_MEM_REGIONS 1u #define MAX_VCPUS 64 #define LIBKVM_S390_ORIGIN (0UL) -#elif __ia64__ +#elif defined(__ia64__) #define KVM_MAX_NUM_MEM_REGIONS 32u #define MAX_VCPUS 256 #else -- 1.6.0.6 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/5] [qemu-kvm] Fix missing prototype warning.
As far as I can see, kvm_destroy_memory_region_works() has nothing to do with KVM_CAP_DEVICE_ASSIGNMENT, so move the prototype outside that ifdef block. Signed-off-by: Hollis Blanchard holl...@us.ibm.com --- kvm/libkvm/libkvm.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/kvm/libkvm/libkvm.h b/kvm/libkvm/libkvm.h index ce6f054..c23d37b 100644 --- a/kvm/libkvm/libkvm.h +++ b/kvm/libkvm/libkvm.h @@ -739,6 +739,7 @@ int kvm_assign_irq(kvm_context_t kvm, int kvm_deassign_irq(kvm_context_t kvm, struct kvm_assigned_irq *assigned_irq); #endif +#endif /*! * \brief Determines whether destroying memory regions is allowed @@ -748,7 +749,6 @@ int kvm_deassign_irq(kvm_context_t kvm, * \param kvm Pointer to the current kvm_context */ int kvm_destroy_memory_region_works(kvm_context_t kvm); -#endif #ifdef KVM_CAP_DEVICE_DEASSIGNMENT /*! -- 1.6.0.6 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/5] patch add_powerpc_kvm_headers.diff
--- kvm/kernel/arch/powerpc/include/asm/kvm.h | 102 +++ kvm/kernel/arch/powerpc/include/asm/kvm_44x.h | 108 +++ kvm/kernel/arch/powerpc/include/asm/kvm_asm.h | 100 ++ kvm/kernel/arch/powerpc/include/asm/kvm_e500.h | 107 +++ kvm/kernel/arch/powerpc/include/asm/kvm_host.h | 232 kvm/kernel/arch/powerpc/include/asm/kvm_para.h | 77 kvm/kernel/arch/powerpc/include/asm/kvm_ppc.h | 137 ++ 7 files changed, 863 insertions(+), 0 deletions(-) create mode 100644 kvm/kernel/arch/powerpc/include/asm/kvm.h create mode 100644 kvm/kernel/arch/powerpc/include/asm/kvm_44x.h create mode 100644 kvm/kernel/arch/powerpc/include/asm/kvm_asm.h create mode 100644 kvm/kernel/arch/powerpc/include/asm/kvm_e500.h create mode 100644 kvm/kernel/arch/powerpc/include/asm/kvm_host.h create mode 100644 kvm/kernel/arch/powerpc/include/asm/kvm_para.h create mode 100644 kvm/kernel/arch/powerpc/include/asm/kvm_ppc.h diff --git a/kvm/kernel/arch/powerpc/include/asm/kvm.h b/kvm/kernel/arch/powerpc/include/asm/kvm.h new file mode 100644 index 000..c4f1ed1 --- /dev/null +++ b/kvm/kernel/arch/powerpc/include/asm/kvm.h @@ -0,0 +1,102 @@ +#ifndef KVM_UNIFDEF_H +#define KVM_UNIFDEF_H + +#ifdef __i386__ +#ifndef CONFIG_X86_32 +#define CONFIG_X86_32 1 +#endif +#endif + +#ifdef __x86_64__ +#ifndef CONFIG_X86_64 +#define CONFIG_X86_64 1 +#endif +#endif + +#if defined(__i386__) || defined (__x86_64__) +#ifndef CONFIG_X86 +#define CONFIG_X86 1 +#endif +#endif + +#ifdef __ia64__ +#ifndef CONFIG_IA64 +#define CONFIG_IA64 1 +#endif +#endif + +#ifdef __PPC__ +#ifndef CONFIG_PPC +#define CONFIG_PPC 1 +#endif +#endif + +#ifdef __s390__ +#ifndef CONFIG_S390 +#define CONFIG_S390 1 +#endif +#endif + +#endif +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + * + * Copyright IBM Corp. 2007 + * + * Authors: Hollis Blanchard holl...@us.ibm.com + */ + +#ifndef __LINUX_KVM_POWERPC_H +#define __LINUX_KVM_POWERPC_H + +#include linux/types.h + +struct kvm_regs { + __u64 pc; + __u64 cr; + __u64 ctr; + __u64 lr; + __u64 xer; + __u64 msr; + __u64 srr0; + __u64 srr1; + __u64 pid; + + __u64 sprg0; + __u64 sprg1; + __u64 sprg2; + __u64 sprg3; + __u64 sprg4; + __u64 sprg5; + __u64 sprg6; + __u64 sprg7; + + __u64 gpr[32]; +}; + +struct kvm_sregs { +}; + +struct kvm_fpu { + __u64 fpr[32]; +}; + +struct kvm_debug_exit_arch { +}; + +/* for KVM_SET_GUEST_DEBUG */ +struct kvm_guest_debug_arch { +}; + +#endif /* __LINUX_KVM_POWERPC_H */ diff --git a/kvm/kernel/arch/powerpc/include/asm/kvm_44x.h b/kvm/kernel/arch/powerpc/include/asm/kvm_44x.h new file mode 100644 index 000..956f252 --- /dev/null +++ b/kvm/kernel/arch/powerpc/include/asm/kvm_44x.h @@ -0,0 +1,108 @@ +#ifndef KVM_UNIFDEF_H +#define KVM_UNIFDEF_H + +#ifdef __i386__ +#ifndef CONFIG_X86_32 +#define CONFIG_X86_32 1 +#endif +#endif + +#ifdef __x86_64__ +#ifndef CONFIG_X86_64 +#define CONFIG_X86_64 1 +#endif +#endif + +#if defined(__i386__) || defined (__x86_64__) +#ifndef CONFIG_X86 +#define CONFIG_X86 1 +#endif +#endif + +#ifdef __ia64__ +#ifndef CONFIG_IA64 +#define CONFIG_IA64 1 +#endif +#endif + +#ifdef __PPC__ +#ifndef CONFIG_PPC +#define CONFIG_PPC 1 +#endif +#endif + +#ifdef __s390__ +#ifndef CONFIG_S390 +#define CONFIG_S390 1 +#endif +#endif + +#endif +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + * + * Copyright IBM Corp. 2008 + * + * Authors: Hollis Blanchard holl...@us.ibm.com + */ + +#ifndef __ASM_44X_H__ +#define __ASM_44X_H__ + +#include linux/kvm_host.h + +#define PPC44x_TLB_SIZE 64 + +/* If the guest is expecting it, this can be as large as we like; we'd just + * need
Re: qemu-kvm.git now live
On Wed, 2009-04-29 at 11:31 +0300, Avi Kivity wrote: Hollis Blanchard wrote: Since PPC is now supported in upstream QEMU, does it really matter if it works in qemu-kvm.git? I was going to take that position too, except Avi asked me specifically if some of the code inside TARGET_I386 ifdefs in qemu-kvm.c works for other architectures. In that case it's sufficient to have the build system use the upstream kvm integration (CONFIG_KVM) rather than the qemu-kvm integration (USE_KVM). OK, I give up... how is this supposed to work? Nobody ever sets CONFIG_KVM or KVM_UPSTREAM, but there are a couple tests for it. Glauber once sent a patch related to that, but I don't see how it helps. For reference, the actual error is about a hundred instances of e.g. /home/hollisb/source/qemu-kvm.git/vl.c:3393: undefined reference to `kvm_allowed' -- Hollis Blanchard IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/5] [libkvm] Rename config-powerpc to config-ppc
Apparently $(ARCH) now holds the qemu meaning, rather than the KVM meaning. Signed-off-by: Hollis Blanchard holl...@us.ibm.com --- kvm/libkvm/config-powerpc.mak |4 kvm/libkvm/config-ppc.mak |4 2 files changed, 4 insertions(+), 4 deletions(-) delete mode 100644 kvm/libkvm/config-powerpc.mak create mode 100644 kvm/libkvm/config-ppc.mak diff --git a/kvm/libkvm/config-powerpc.mak b/kvm/libkvm/config-powerpc.mak deleted file mode 100644 index 091da37..000 --- a/kvm/libkvm/config-powerpc.mak +++ /dev/null @@ -1,4 +0,0 @@ - -LIBDIR := /lib - -libkvm-$(ARCH)-objs := libkvm-powerpc.o diff --git a/kvm/libkvm/config-ppc.mak b/kvm/libkvm/config-ppc.mak new file mode 100644 index 000..091da37 --- /dev/null +++ b/kvm/libkvm/config-ppc.mak @@ -0,0 +1,4 @@ + +LIBDIR := /lib + +libkvm-$(ARCH)-objs := libkvm-powerpc.o -- 1.6.0.6 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/5] [qemu-kvm] Fix warning when__ia64__ is not defined.
Signed-off-by: Hollis Blanchard holl...@us.ibm.com --- kvm/libkvm/kvm-common.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/kvm/libkvm/kvm-common.h b/kvm/libkvm/kvm-common.h index 96361e8..591fb53 100644 --- a/kvm/libkvm/kvm-common.h +++ b/kvm/libkvm/kvm-common.h @@ -22,7 +22,7 @@ #define KVM_MAX_NUM_MEM_REGIONS 1u #define MAX_VCPUS 64 #define LIBKVM_S390_ORIGIN (0UL) -#elif __ia64__ +#elif defined(__ia64__) #define KVM_MAX_NUM_MEM_REGIONS 32u #define MAX_VCPUS 256 #else -- 1.6.0.6 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/5] [qemu-kvm] Fix missing prototype warning.
As far as I can see, kvm_destroy_memory_region_works() has nothing to do with KVM_CAP_DEVICE_ASSIGNMENT, so move the prototype outside that ifdef block. Signed-off-by: Hollis Blanchard holl...@us.ibm.com --- kvm/libkvm/libkvm.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/kvm/libkvm/libkvm.h b/kvm/libkvm/libkvm.h index ce6f054..c23d37b 100644 --- a/kvm/libkvm/libkvm.h +++ b/kvm/libkvm/libkvm.h @@ -739,6 +739,7 @@ int kvm_assign_irq(kvm_context_t kvm, int kvm_deassign_irq(kvm_context_t kvm, struct kvm_assigned_irq *assigned_irq); #endif +#endif /*! * \brief Determines whether destroying memory regions is allowed @@ -748,7 +749,6 @@ int kvm_deassign_irq(kvm_context_t kvm, * \param kvm Pointer to the current kvm_context */ int kvm_destroy_memory_region_works(kvm_context_t kvm); -#endif #ifdef KVM_CAP_DEVICE_DEASSIGNMENT /*! -- 1.6.0.6 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/5] [qemu-kvm] Use CAP_IRQ_ROUTING in kvm_get_irq_route_gsi()
This fixes a build break when KVM_IOAPIC_NUM_PINS is not defined. Signed-off-by: Hollis Blanchard holl...@us.ibm.com --- kvm/libkvm/libkvm.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/kvm/libkvm/libkvm.c b/kvm/libkvm/libkvm.c index 0610e3f..ba0a5d1 100644 --- a/kvm/libkvm/libkvm.c +++ b/kvm/libkvm/libkvm.c @@ -1406,6 +1406,7 @@ int kvm_commit_irq_routes(kvm_context_t kvm) int kvm_get_irq_route_gsi(kvm_context_t kvm) { +#ifdef KVM_CAP_IRQ_ROUTING if (kvm-max_used_gsi = KVM_IOAPIC_NUM_PINS) { if (kvm-max_used_gsi = kvm_get_gsi_count(kvm)) return kvm-max_used_gsi + 1; @@ -1413,6 +1414,9 @@ int kvm_get_irq_route_gsi(kvm_context_t kvm) return -ENOSPC; } else return KVM_IOAPIC_NUM_PINS; +#else + return -ENOSYS; +#endif } #ifdef KVM_CAP_DEVICE_MSIX -- 1.6.0.6 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html