RE: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Thursday, July 18, 2013 10:48 PM
> To: Bhushan Bharat-R65777
> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Bhushan Bharat-R65777
> Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
>
> On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote:
> > If there is a struct page for the requested mapping then it's
> > normal RAM and the mapping is set to M bit (coherent, cacheable)
> > otherwise this is treated as I/O and we set I + G (cache inhibited, guarded)
> >
> > This helps setting proper TLB mapping for direct assigned device
> >
> > Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
> > ---
> > v2: some cleanup and added comment -
> >  arch/powerpc/kvm/e500_mmu_host.c | 23 ++-
> >  1 files changed, 18 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
> > index 1c6a9d7..02eb973 100644
> > --- a/arch/powerpc/kvm/e500_mmu_host.c
> > +++ b/arch/powerpc/kvm/e500_mmu_host.c
> > @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode)
> >  	return mas3;
> >  }
> >
> > -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
> > +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
> >  {
> > +	u32 mas2_attr;
> > +
> > +	mas2_attr = mas2 & MAS2_ATTRIB_MASK;
> > +
> > +	/*
> > +	 * RAM is always mappable on e500 systems, so this is identical
> > +	 * to kvm_is_mmio_pfn(), just without its overhead.
> > +	 */
> > +	if (!pfn_valid(pfn)) {
>
> Please use page_is_ram(), which is what gets used when setting the WIMG for the host userspace mapping. We want to make sure the two are consistent.
>
> > +		/* Pages not managed by Kernel are treated as I/O, set I + G */
> > +		mas2_attr |= MAS2_I | MAS2_G;
> >  #ifdef CONFIG_SMP
> > -	return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
> > -#else
> > -	return mas2 & MAS2_ATTRIB_MASK;
> > +	} else {
> > +		/* Kernel managed pages are actually RAM so set M */
> > +		mas2_attr |= MAS2_M;
> >  #endif
>
> Likewise, we want to make sure this matches the host entry. Unfortunately, this is a bit of a mess already. 64-bit booke appears to always set MAS2_M for TLB0 mappings.

Scott, can you please point to the code where MAS2_M is always set for TLB0?

-Bharat

> The initial KERNELBASE mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT. 32-bit always uses _PAGE_COHERENT, except that initial KERNELBASE mapping. _PAGE_COHERENT appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case).
>
> As for what we actually want to happen, there are cases when we want M to be set for non-SMP. One such case is AMP, where CPUs may be sharing memory even if the Linux instance only runs on one CPU (this is not hypothetical, BTW). It's also possible that we encounter a hardware bug that requires MAS2_M, similar to what some of our non-booke chips require.
>
> -Scott
> --
> To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
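A rough sketch of the attribute selection discussed in this message may help; it is not code from the patch or the kernel. The `SK_` names, the bit values, the guest mask and the `host_ram_coherent` flag are all assumptions made for illustration. The RAM test is passed in as a boolean so the sketch builds outside the kernel; in the patch context it would come from something like page_is_ram(), as the review asks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative MAS2 attribute bits; the values are assumed for this sketch. */
#define SK_MAS2_W 0x10u /* write-through */
#define SK_MAS2_I 0x08u /* cache inhibited */
#define SK_MAS2_M 0x04u /* memory coherence required */
#define SK_MAS2_G 0x02u /* guarded */
#define SK_MAS2_E 0x01u /* endianness */

/* Hypothetical mask of the bits taken from the guest entry (G and E here). */
#define SK_GUEST_ATTRIB_MASK (SK_MAS2_G | SK_MAS2_E)

static_assert((SK_GUEST_ATTRIB_MASK & (SK_MAS2_W | SK_MAS2_I | SK_MAS2_M)) == 0,
              "guest must not control W, I or M in this sketch");

/*
 * is_ram:            result of a RAM test (for example page_is_ram()).
 * host_ram_coherent: hypothetical flag saying whether the host maps RAM
 *                    with M, standing in for "match the host entry".
 */
static inline uint32_t sk_shadow_mas2_attrib(uint32_t guest_mas2, bool is_ram,
                                             bool host_ram_coherent)
{
	uint32_t attr = guest_mas2 & SK_GUEST_ATTRIB_MASK;

	if (!is_ram)
		attr |= SK_MAS2_I | SK_MAS2_G; /* not RAM: treat as I/O */
	else if (host_ram_coherent)
		attr |= SK_MAS2_M;             /* RAM: coherent, like the host */

	return attr;
}
```

Passing the coherence decision in as a parameter, instead of keying it on CONFIG_SMP, is one way to express the AMP case raised above, where M is wanted even on a non-SMP build.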
RE: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Tuesday, July 23, 2013 10:15 PM
> To: Bhushan Bharat-R65777
> Cc: Wood Scott-B07421; Alexander Graf; kvm-...@vger.kernel.org; kvm@vger.kernel.org
> Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
>
> On 07/22/2013 10:39:16 PM, Bhushan Bharat-R65777 wrote:
> > > -----Original Message-----
> > > From: Wood Scott-B07421
> > > Sent: Tuesday, July 23, 2013 12:18 AM
> > > To: Bhushan Bharat-R65777
> > > Cc: Wood Scott-B07421; Alexander Graf; kvm-...@vger.kernel.org; kvm@vger.kernel.org
> > > Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
> > >
> > > On 07/21/2013 11:39:45 PM, Bhushan Bharat-R65777 wrote:
> > > > > -----Original Message-----
> > > > > From: Wood Scott-B07421
> > > > > Sent: Thursday, July 18, 2013 11:09 PM
> > > > > To: Alexander Graf
> > > > > Cc: Bhushan Bharat-R65777; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Bhushan Bharat-R65777
> > > > > Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
> > > > >
> > > > > On 07/18/2013 12:32:18 PM, Alexander Graf wrote:
> > > > > > On 18.07.2013, at 19:17, Scott Wood wrote:
> > > > > > > On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote:
> > > > > > >
> > > > > > > Likewise, we want to make sure this matches the host entry. Unfortunately, this is a bit of a mess already. 64-bit booke appears to always set MAS2_M for TLB0 mappings. The initial KERNELBASE mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT. 32-bit always uses _PAGE_COHERENT, except that initial KERNELBASE mapping. _PAGE_COHERENT appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case).
> > > > > > >
> > > > > > > As for what we actually want to happen, there are cases when we want M to be set for non-SMP. One such case is AMP, where CPUs may be sharing memory even if the Linux instance only runs on one CPU (this is not hypothetical, BTW). It's also possible that we encounter a hardware bug that requires MAS2_M, similar to what some of our non-booke chips require.
> > > > > >
> > > > > > How about we always set M then for RAM?
> > > > >
> > > > > M is like I in that bad things happen if you mix them.
> > > >
> > > > I am trying to list the invalid mixing of WIMG:
> > > > 1) I & M
> > > > 2) W & I
> > > > 3) W & M (Scott mentioned that he observed issues when mixing these two)
> > > > 4) is there any other?
> > >
> > > That's not what I was talking about (and I don't think I mentioned W at all, though it is also potentially problematic).
> >
> > Here is cut paste of your one response:
> > "The architecture makes it illegal to mix cacheable and cache-inhibited mappings to the same physical page. Mixing W or M bits is generally bad as well. I've seen it cause machine checks, error interrupts, etc. -- not just corrupting the page in question."
> >
> > So I added not mixing W & M. But at that time I missed to understood why mixing M & I for same physical address can be issue :).
>
> W or M, not W and M. I meant that each one, separately, is in a similar situation as the I bit.
>
> None of this is about invalid combinations of attributes on a single TLB entry (though there are architectural restrictions there as well).

Ok, I misread again :(.

The second part of comment was (looks like you missed so copy pasted below)
"When we say all RAM (page_is_ram() is true) will be having M bit, then same RAM physical address will not have M mixed with any other, right? Similarly, For IO (which is not RAM), we will set I+G, so I will not be mixed with M. Is not that?"

-Bharat

> -Scott
RE: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Tuesday, July 23, 2013 11:50 PM
> To: Bhushan Bharat-R65777
> Cc: Wood Scott-B07421; Alexander Graf; kvm-...@vger.kernel.org; kvm@vger.kernel.org
> Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
>
> On 07/23/2013 11:50:35 AM, Bhushan Bharat-R65777 wrote:
> > > -----Original Message-----
> > > From: Wood Scott-B07421
> > > Sent: Tuesday, July 23, 2013 10:15 PM
> > > To: Bhushan Bharat-R65777
> > > Cc: Wood Scott-B07421; Alexander Graf; kvm-...@vger.kernel.org; kvm@vger.kernel.org
> > > Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
> > >
> > > On 07/22/2013 10:39:16 PM, Bhushan Bharat-R65777 wrote:
> > > > > -----Original Message-----
> > > > > From: Wood Scott-B07421
> > > > > Sent: Tuesday, July 23, 2013 12:18 AM
> > > > > To: Bhushan Bharat-R65777
> > > > > Cc: Wood Scott-B07421; Alexander Graf; kvm-...@vger.kernel.org; kvm@vger.kernel.org
> > > > > Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
> > > > >
> > > > > On 07/21/2013 11:39:45 PM, Bhushan Bharat-R65777 wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Wood Scott-B07421
> > > > > > > Sent: Thursday, July 18, 2013 11:09 PM
> > > > > > > To: Alexander Graf
> > > > > > > Cc: Bhushan Bharat-R65777; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Bhushan Bharat-R65777
> > > > > > > Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
> > > > > > >
> > > > > > > On 07/18/2013 12:32:18 PM, Alexander Graf wrote:
> > > > > > > > On 18.07.2013, at 19:17, Scott Wood wrote:
> > > > > > > > > On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote:
> > > > > > > > >
> > > > > > > > > Likewise, we want to make sure this matches the host entry. Unfortunately, this is a bit of a mess already. 64-bit booke appears to always set MAS2_M for TLB0 mappings. The initial KERNELBASE mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT. 32-bit always uses _PAGE_COHERENT, except that initial KERNELBASE mapping. _PAGE_COHERENT appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case).
> > > > > > > > >
> > > > > > > > > As for what we actually want to happen, there are cases when we want M to be set for non-SMP. One such case is AMP, where CPUs may be sharing memory even if the Linux instance only runs on one CPU (this is not hypothetical, BTW). It's also possible that we encounter a hardware bug that requires MAS2_M, similar to what some of our non-booke chips require.
> > > > > > > >
> > > > > > > > How about we always set M then for RAM?
> > > > > > >
> > > > > > > M is like I in that bad things happen if you mix them.
> > > > > >
> > > > > > I am trying to list the invalid mixing of WIMG:
> > > > > > 1) I & M
> > > > > > 2) W & I
> > > > > > 3) W & M (Scott mentioned that he observed issues when mixing these two)
> > > > > > 4) is there any other?
> > > > >
> > > > > That's not what I was talking about (and I don't think I mentioned W at all, though it is also potentially problematic).
> > > >
> > > > Here is cut paste of your one response:
> > > > "The architecture makes it illegal to mix cacheable and cache-inhibited mappings to the same physical page. Mixing W or M bits is generally bad as well. I've seen it cause machine checks, error interrupts, etc. -- not just corrupting the page in question."
> > > >
> > > > So I added not mixing W & M. But at that time I missed to understood why mixing M & I for same physical address can be issue :).
> > >
> > > W or M, not W and M. I meant that each one, separately, is in a similar situation as the I bit.
> > >
> > > None of this is about invalid combinations of attributes on a single TLB entry (though there are architectural restrictions there as well).
> >
> > Ok, I misread again :(.
> >
> > The second part of comment was (looks like you missed so copy pasted below)
> > "When we say all RAM (page_is_ram() is true) will be having M bit, then same RAM physical address will not have M mixed with any other, right? Similarly, For IO (which is not RAM), we will set I+G, so I will not be mixed with M. Is not that?"
>
> I didn't miss it; it just seemed moot given the earlier confusion. But yes, for now we will set all RAM to M, and all I/O to I+G. Eventually that will change if/when we do vfio for QMan portals or other devices that require cacheable I/O.

Agree :)

-Bharat

> -Scott
RE: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Tuesday, July 23, 2013 12:18 AM
> To: Bhushan Bharat-R65777
> Cc: Wood Scott-B07421; Alexander Graf; kvm-...@vger.kernel.org; kvm@vger.kernel.org
> Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
>
> On 07/21/2013 11:39:45 PM, Bhushan Bharat-R65777 wrote:
> > > -----Original Message-----
> > > From: Wood Scott-B07421
> > > Sent: Thursday, July 18, 2013 11:09 PM
> > > To: Alexander Graf
> > > Cc: Bhushan Bharat-R65777; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Bhushan Bharat-R65777
> > > Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
> > >
> > > On 07/18/2013 12:32:18 PM, Alexander Graf wrote:
> > > > On 18.07.2013, at 19:17, Scott Wood wrote:
> > > > > On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote:
> > > > >
> > > > > Likewise, we want to make sure this matches the host entry. Unfortunately, this is a bit of a mess already. 64-bit booke appears to always set MAS2_M for TLB0 mappings. The initial KERNELBASE mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT. 32-bit always uses _PAGE_COHERENT, except that initial KERNELBASE mapping. _PAGE_COHERENT appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case).
> > > > >
> > > > > As for what we actually want to happen, there are cases when we want M to be set for non-SMP. One such case is AMP, where CPUs may be sharing memory even if the Linux instance only runs on one CPU (this is not hypothetical, BTW). It's also possible that we encounter a hardware bug that requires MAS2_M, similar to what some of our non-booke chips require.
> > > >
> > > > How about we always set M then for RAM?
> > >
> > > M is like I in that bad things happen if you mix them.
> >
> > I am trying to list the invalid mixing of WIMG:
> > 1) I & M
> > 2) W & I
> > 3) W & M (Scott mentioned that he observed issues when mixing these two)
> > 4) is there any other?
>
> That's not what I was talking about (and I don't think I mentioned W at all, though it is also potentially problematic).

Here is cut paste of your one response:
"The architecture makes it illegal to mix cacheable and cache-inhibited mappings to the same physical page. Mixing W or M bits is generally bad as well. I've seen it cause machine checks, error interrupts, etc. -- not just corrupting the page in question."

So I added not mixing W & M. But at that time I missed to understood why mixing M & I for same physical address can be issue :).

> I'm talking about mixing I with not-I (on two different virtual addresses pointing to the same physical), M with not-M, etc.

When we say all RAM (page_is_ram() is true) will be having M bit, then RAM physical address will not have M mixed with any other, right?

Similarly, For IO (which is not RAM), we will set I+G, so I will not be mixed with M. Is not that?

-Bharat

> > > So we really want to match exactly what the rest of the kernel is doing.
> >
> > How the rest of kernel is doing is a bit complex. IIUC, if we forget about the boot state then this is how kernel set WIMG bits:
> > 1) For Memory always set M if CONFIG_SMP set.
> >    - So KVM can do same. M will not be mixed with W and I. G and E are guest control.
>
> I don't think this is accurate for 64-bit. And what about the AMP case?
>
> -Scott
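The "I with not-I, M with not-M" point can be pictured as a check across two mappings of the same physical page: the W, I and M bits must each agree, whatever the other bits say. The following is only a sketch under that reading of the thread; the `SK_` names and bit values are assumptions, and nothing like `sk_alias_mismatch()` is claimed to exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative MAS2 attribute bits; the values are assumed for this sketch. */
#define SK_MAS2_W 0x10u /* write-through */
#define SK_MAS2_I 0x08u /* cache inhibited */
#define SK_MAS2_M 0x04u /* memory coherence required */
#define SK_MAS2_G 0x02u /* guarded */
#define SK_MAS2_E 0x01u /* endianness */

/* Bits that should agree on every mapping of one physical page. */
#define SK_MUST_MATCH (SK_MAS2_W | SK_MAS2_I | SK_MAS2_M)

static_assert((SK_MUST_MATCH & (SK_MAS2_G | SK_MAS2_E)) == 0,
              "G and E are left out of the comparison in this sketch");

/*
 * Returns the set of W/I/M bits on which two mappings of the same
 * physical page disagree; zero means the aliases are consistent.
 */
static inline uint32_t sk_alias_mismatch(uint32_t host_mas2, uint32_t shadow_mas2)
{
	return (host_mas2 ^ shadow_mas2) & SK_MUST_MATCH;
}
```

Each bit is compared on its own, which is the "W or M, not W and M" distinction: a host mapping with M and a shadow mapping without M is already a mismatch, regardless of W.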
RE: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Thursday, July 18, 2013 11:09 PM
> To: Alexander Graf
> Cc: Bhushan Bharat-R65777; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Bhushan Bharat-R65777
> Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
>
> On 07/18/2013 12:32:18 PM, Alexander Graf wrote:
> > On 18.07.2013, at 19:17, Scott Wood wrote:
> > > On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote:
> > >
> > > Likewise, we want to make sure this matches the host entry. Unfortunately, this is a bit of a mess already. 64-bit booke appears to always set MAS2_M for TLB0 mappings. The initial KERNELBASE mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT. 32-bit always uses _PAGE_COHERENT, except that initial KERNELBASE mapping. _PAGE_COHERENT appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case).
> > >
> > > As for what we actually want to happen, there are cases when we want M to be set for non-SMP. One such case is AMP, where CPUs may be sharing memory even if the Linux instance only runs on one CPU (this is not hypothetical, BTW). It's also possible that we encounter a hardware bug that requires MAS2_M, similar to what some of our non-booke chips require.
> >
> > How about we always set M then for RAM?
>
> M is like I in that bad things happen if you mix them.

I am trying to list the invalid mixing of WIMG:
1) I & M
2) W & I
3) W & M (Scott mentioned that he observed issues when mixing these two)
4) is there any other?

So it mean it is safe to let guest control G and E.

> So we really want to match exactly what the rest of the kernel is doing.

How the rest of kernel is doing is a bit complex. IIUC, if we forget about the boot state then this is how kernel set WIMG bits:
1) For Memory always set M if CONFIG_SMP set.
   - So KVM can do same. M will not be mixed with W and I. G and E are guest control.
2) For I/O, drivers can pass flags to set M or I + G.
   - For KVM; if not memory then it is I/O. For now we can always set I + G.
   - Later we can design some mechanism in VFIO interface to let KVM somehow know whether to set M or I+G.

-Bharat

> Plus, the performance penalty on some single-core chips can be pretty bad.
>
> -Scott
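The two-step plan in this message (I + G for all I/O now, some VFIO-provided hint later) could be sketched as a lookup keyed on a memory kind. Everything below is hypothetical: the `sk_mem_kind` enum, the `SK_MEM_IO_CACHEABLE` case for a future cacheable-I/O hint, the `SK_` bit values, and the assumption that the guest keeps only G and E. No such VFIO interface is described in the thread beyond the wish for one.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative MAS2 attribute bits; the values are assumed for this sketch. */
#define SK_MAS2_W 0x10u /* write-through */
#define SK_MAS2_I 0x08u /* cache inhibited */
#define SK_MAS2_M 0x04u /* memory coherence required */
#define SK_MAS2_G 0x02u /* guarded */
#define SK_MAS2_E 0x01u /* endianness */

static_assert((SK_MAS2_I & SK_MAS2_M) == 0, "I and M are distinct bits");

/* Hypothetical classification of the page behind a guest mapping. */
enum sk_mem_kind {
	SK_MEM_RAM,          /* ordinary RAM */
	SK_MEM_IO,           /* device memory, the only I/O case today */
	SK_MEM_IO_CACHEABLE  /* imagined future hint for cacheable I/O */
};

/* Attributes the host would force for each kind. */
static inline uint32_t sk_forced_attrs(enum sk_mem_kind kind)
{
	switch (kind) {
	case SK_MEM_RAM:
	case SK_MEM_IO_CACHEABLE:
		return SK_MAS2_M;
	case SK_MEM_IO:
	default:
		return SK_MAS2_I | SK_MAS2_G;
	}
}

/* The guest keeps control of G and E only; W, I and M come from the host. */
static inline uint32_t sk_shadow_attrs(uint32_t guest_mas2, enum sk_mem_kind kind)
{
	return (guest_mas2 & (SK_MAS2_G | SK_MAS2_E)) | sk_forced_attrs(kind);
}
```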
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
> -----Original Message-----
> From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
> Sent: Thursday, July 18, 2013 11:56 AM
> To: Bhushan Bharat-R65777
> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-B07421; Bhushan Bharat-R65777
> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
>
> On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
> > If there is a struct page for the requested mapping then it's
> > normal DDR and the mapping sets M bit (coherent, cacheable)
> > else this is treated as I/O and we set I + G (cache inhibited, guarded)
> >
> > This helps setting proper TLB mapping for direct assigned device
> >
> > Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
> > ---
> >  arch/powerpc/kvm/e500_mmu_host.c | 17 -
> >  1 files changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
> > index 1c6a9d7..089c227 100644
> > --- a/arch/powerpc/kvm/e500_mmu_host.c
> > +++ b/arch/powerpc/kvm/e500_mmu_host.c
> > @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode)
> >  	return mas3;
> >  }
> >
> > -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
> > +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
> >  {
> > +	u32 mas2_attr;
> > +
> > +	mas2_attr = mas2 & MAS2_ATTRIB_MASK;
> > +
> > +	if (!pfn_valid(pfn)) {
>
> Why not directly use kvm_is_mmio_pfn()?

What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR.

-Bharat

> Tiejun
>
> > +		mas2_attr |= MAS2_I | MAS2_G;
> > +	} else {
> >  #ifdef CONFIG_SMP
> > -	return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
> > -#else
> > -	return mas2 & MAS2_ATTRIB_MASK;
> > +		mas2_attr |= MAS2_M;
> >  #endif
> > +	}
> > +	return mas2_attr;
> >  }
> >
> >  /*
> > @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
> >  	/* Force IPROT=0 for all guest mappings. */
> >  	stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
> >  	stlbe->mas2 = (gvaddr & MAS2_EPN) |
> > -		      e500_shadow_mas2_attrib(gtlbe->mas2, pr);
> > +		      e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
> >  	stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
> >  			e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
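The last hunk composes the shadow entry from the guest virtual address, the host pfn and the attribute bits. A stand-alone sketch of that composition follows; the struct, the `SK_` constants (4 KiB pages, a low-12-bit page offset) and the function name are assumptions for illustration, not the kernel's definitions, and MAS1 is left out.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT 12          /* assumes 4 KiB pages */
#define SK_MAS2_EPN (~0xFFFull)   /* effective page number bits (assumed) */

static_assert((SK_MAS2_EPN & ((1ull << SK_PAGE_SHIFT) - 1)) == 0,
              "EPN mask must exclude the page offset");

/* Reduced, hypothetical stand-in for a shadow TLB entry. */
struct sk_tlbe {
	uint64_t mas2;    /* EPN plus WIMGE attributes */
	uint64_t mas7_3;  /* real page number plus permission bits */
};

static inline struct sk_tlbe sk_setup_stlbe(uint64_t gvaddr, uint64_t pfn,
                                            uint32_t mas2_attr, uint32_t mas3_attr)
{
	struct sk_tlbe e;

	/* Keep the page-aligned part of the address, then add attributes. */
	e.mas2 = (gvaddr & SK_MAS2_EPN) | mas2_attr;
	/* Turn the pfn into a physical address, then add permissions. */
	e.mas7_3 = (pfn << SK_PAGE_SHIFT) | mas3_attr;

	return e;
}
```

The point of the patch is visible in the second argument: the attribute helper needs the pfn, not the privilege level, to decide between RAM and I/O.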
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
> -----Original Message-----
> From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
> Sent: Thursday, July 18, 2013 1:01 PM
> To: Bhushan Bharat-R65777
> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-B07421
> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
>
> On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
> > > On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
> > > > +	if (!pfn_valid(pfn)) {
> > >
> > > Why not directly use kvm_is_mmio_pfn()?
> >
> > What I understand from this function (someone can correct me) is that it returns "false" when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR.
>
> I think you are setting I|G by addressing all mmio pages, right? If so,
>
>     KVM: direct mmio pfn check
>
>     Userspace may specify memory slots that are backed by mmio pages rather than
>     normal RAM.  In some cases it is not enough to identify these mmio pages
>     by pfn_valid().  This patch adds checking the PageReserved as well.

Do you know what are those "some cases" and how checking PageReserved helps in those cases?

-Bharat

>
> Tiejun
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
> -----Original Message-----
> From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
> Sent: Thursday, July 18, 2013 1:52 PM
> To: Bhushan Bharat-R65777
> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-B07421
> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
>
> On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
> > > I think you are setting I|G by addressing all mmio pages, right? If so,
> > >
> > >     KVM: direct mmio pfn check
> > >
> > >     Userspace may specify memory slots that are backed by mmio pages rather than
> > >     normal RAM.  In some cases it is not enough to identify these mmio pages
> > >     by pfn_valid().  This patch adds checking the PageReserved as well.
> >
> > Do you know what are those "some cases" and how checking PageReserved helps in those cases?
>
> No, myself didn't see these actual cases in qemu, too. But this should be chronically persistent as I understand ;-)

Then I will wait till someone educate me :)

-Bharat

>
> Tiejun
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
> -----Original Message-----
> From: Bhushan Bharat-R65777
> Sent: Thursday, July 18, 2013 1:53 PM
> To: '“tiejun.chen”'
> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-B07421
> Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
>
> > On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
> > > Do you know what are those "some cases" and how checking PageReserved helps in those cases?
> >
> > No, myself didn't see these actual cases in qemu, too. But this should be chronically persistent as I understand ;-)
>
> Then I will wait till someone educate me :)

The reason is, kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary.

-Bharat
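The difference between the two checks under discussion can be sketched as follows. This is a simplified model built only from the commit message quoted in the thread, not the actual kvm_is_mmio_pfn() implementation; struct mock_page, mock_mem_map and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page and its PageReserved() flag. */
struct mock_page {
	bool reserved;
};

#define MOCK_NR_PAGES 4u

/* Four kernel-managed pages; pfn 2 is marked reserved. */
static const struct mock_page mock_mem_map[MOCK_NR_PAGES] = {
	{ false }, { false }, { true }, { false },
};

/* Stand-in for pfn_valid(): true when a mock_page exists for the pfn. */
static bool mock_pfn_valid(uint64_t pfn)
{
	return pfn < MOCK_NR_PAGES;
}

/* Cheap check used by the patch: no struct page means I/O. */
static bool is_io_cheap(uint64_t pfn)
{
	return !mock_pfn_valid(pfn);
}

/* Heavier check from the quoted commit: reserved pages also count as I/O. */
static bool is_io_with_reserved(uint64_t pfn)
{
	if (!mock_pfn_valid(pfn))
		return true;

	assert(pfn < MOCK_NR_PAGES);	/* guards the array access below */
	return mock_mem_map[pfn].reserved;
}
```

In this model the two checks disagree only for pfn 2, the reserved page, which is exactly the case the thread is trying to pin down for e500.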
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
> -----Original Message-----
> From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
> Sent: Thursday, July 18, 2013 3:19 PM
> To: Bhushan Bharat-R65777
> Cc: “tiejun.chen”; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
>
> On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:
> > The reason is, kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary.
>
> It certainly does more than we need and potentially slows down the fast path (RAM mapping). The only thing it does on top of "if (pfn_valid())" is to check for pages that are declared reserved on the host. This happens in 2 cases:
>
>   1) Non cache coherent DMA
>   2) Memory hot remove
>
> The non coherent DMA case would be interesting, as with the mechanism as it is in place in Linux today, we could potentially break normal guest operation if we don't take it into account. However, it's Kconfig guarded by:
>
>         depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
>         default n if PPC_47x
>         default y
>
> so we never hit it with any core we care about ;).
>
> Memory hot remove does not exist on e500 FWIW, so we don't have to worry about that one either.
>
> Which means I think it's fine to slim
RE: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
> -----Original Message-----
> From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
> Sent: Thursday, July 18, 2013 8:18 PM
> To: Bhushan Bharat-R65777
> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
> Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
>
> This needs a description. Why shouldn't we ignore E?

What I understood is that there is no reason to stop guest setting E, so allow him.

-Bharat

>
> Alex
>
> On 18.07.2013, at 15:19, Bharat Bhushan wrote:
>
> > Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
> > ---
> > v2:
> >  - No change
> >  arch/powerpc/kvm/e500.h |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
> > index c2e5e98..277cb18 100644
> > --- a/arch/powerpc/kvm/e500.h
> > +++ b/arch/powerpc/kvm/e500.h
> > @@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu)
> >  #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
> >  #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
> >  #define MAS2_ATTRIB_MASK \
> > -	  (MAS2_X0 | MAS2_X1)
> > +	  (MAS2_X0 | MAS2_X1 | MAS2_E)
> >  #define MAS3_ATTRIB_MASK \
> >  	  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
> >  	   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
> > --
> > 1.7.0.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
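The effect of the one-line mask change can be illustrated with a small standalone sketch. It assumes MAS2[E] is the endianness attribute (0 = big-endian page, 1 = little-endian page) and uses assumed bit values; guest_attrib() and the OLD_/NEW_ mask names are hypothetical and exist only to compare the two masks side by side.

```c
#include <assert.h>
#include <stdint.h>

/* MAS2 attribute bits; the numeric values are assumed for illustration. */
#define MAS2_X0 0x40u
#define MAS2_X1 0x20u
#define MAS2_E  0x01u	/* endianness: set means little-endian page */

#define OLD_MAS2_ATTRIB_MASK (MAS2_X0 | MAS2_X1)
#define NEW_MAS2_ATTRIB_MASK (MAS2_X0 | MAS2_X1 | MAS2_E)

/* Return the guest-chosen MAS2 attribute bits that survive the mask. */
static uint32_t guest_attrib(uint32_t guest_mas2, uint32_t mask)
{
	uint32_t kept = guest_mas2 & mask;

	assert((kept & ~mask) == 0);	/* nothing outside the mask leaks through */
	return kept;
}
```

With the old mask a guest request for a little-endian page is silently dropped; with the new mask the E bit is carried into the shadow entry.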
RE: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
> -----Original Message-----
> From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
> Sent: Thursday, July 18, 2013 8:23 PM
> To: Bhushan Bharat-R65777
> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
> Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
>
> On 18.07.2013, at 15:19, Bharat Bhushan wrote:
>
> > If there is a struct page for the requested mapping then it's
> > normal RAM and the mapping is set to M bit (coherent, cacheable)
> > otherwise this is treated as I/O and we set I + G (cache inhibited, guarded)
> >
> > This helps setting proper TLB mapping for direct assigned device
> >
> > Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
> > ---
> > v2: some cleanup and added comment
> >  -
> >  arch/powerpc/kvm/e500_mmu_host.c |   23 ++++++++++++++++++-----
> >  1 files changed, 18 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
> > index 1c6a9d7..02eb973 100644
> > --- a/arch/powerpc/kvm/e500_mmu_host.c
> > +++ b/arch/powerpc/kvm/e500_mmu_host.c
> > @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode)
> >  	return mas3;
> >  }
> >
> > -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
> > +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
> >  {
> > +	u32 mas2_attr;
> > +
> > +	mas2_attr = mas2 & MAS2_ATTRIB_MASK;
> > +
> > +	/*
> > +	 * RAM is always mappable on e500 systems, so this is identical
> > +	 * to kvm_is_mmio_pfn(), just without its overhead.
> > +	 */
> > +	if (!pfn_valid(pfn)) {
> > +		/* Pages not managed by Kernel are treated as I/O, set I + G */
>
> Please also document the intermediate thought that I/O should be mapped non-cached.

I did not get what you mean to document?

> > +		mas2_attr |= MAS2_I | MAS2_G;
> >  #ifdef CONFIG_SMP
>
> Please separate the SMP case out of the branch.

Really :) this was looking simple to me.

> > -	return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
> > -#else
> > -	return mas2 & MAS2_ATTRIB_MASK;
> > +	} else {
> > +		/* Kernel managed pages are actually RAM so set M */
>
> This comment doesn't tell me why M can be set ;).

RAM in SMP, so setting coherent, is not that obvious?

-Bharat

>
> Alex
>
> > +		mas2_attr |= MAS2_M;
> >  #endif
> > +	}
> > +	return mas2_attr;
> >  }
> >
> >  /*
> > @@ -313,7 +326,7 @@ static void kvmppc_e500_setup_stlbe(
> >  	/* Force IPROT=0 for all guest mappings. */
> >  	stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
> >  	stlbe->mas2 = (gvaddr & MAS2_EPN) |
> > -		e500_shadow_mas2_attrib(gtlbe->mas2, pr);
> > +		e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
> >  	stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
> >  		e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
> > --
> > 1.7.0.4
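One possible reading of "separate the SMP case out of the branch" is to decide the RAM attribute once at compile time, so that the if/else is no longer interleaved with #ifdef. The following is a minimal sketch under that assumption and is not claimed to be what was eventually merged; SHADOW_MAS2_RAM_ATTR, mock_pfn_valid() and MOCK_RAM_PFNS are hypothetical names, and the MAS2 bit values are assumed for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MAS2 attribute bits; the numeric values are assumed for illustration. */
#define MAS2_X0 0x40u
#define MAS2_X1 0x20u
#define MAS2_I  0x08u
#define MAS2_M  0x04u
#define MAS2_G  0x02u
#define MAS2_E  0x01u
#define MAS2_ATTRIB_MASK (MAS2_X0 | MAS2_X1 | MAS2_E)

/* The SMP decision lives here, outside of any control flow. */
#ifdef CONFIG_SMP
#define SHADOW_MAS2_RAM_ATTR MAS2_M	/* RAM must be coherent across cores */
#else
#define SHADOW_MAS2_RAM_ATTR 0u		/* single core: no coherence needed */
#endif

#define MOCK_RAM_PFNS 0x1000u

/* Hypothetical stand-in for pfn_valid(). */
static bool mock_pfn_valid(uint64_t pfn)
{
	return pfn < MOCK_RAM_PFNS;
}

static uint32_t shadow_mas2_attrib(uint32_t guest_mas2, uint64_t pfn)
{
	uint32_t attr = guest_mas2 & MAS2_ATTRIB_MASK;

	if (!mock_pfn_valid(pfn))
		attr |= MAS2_I | MAS2_G;	/* I/O: uncached and guarded */
	else
		attr |= SHADOW_MAS2_RAM_ATTR;	/* RAM: coherent only on SMP */

	assert((attr & ~0x7fu) == 0);	/* only known MAS2 attribute bits set */
	return attr;
}
```

Both branches are always compiled this way, and the SMP-specific reasoning can be documented once next to the macro instead of inside the function body.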
RE: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
> -----Original Message-----
> From: Alexander Graf [mailto:ag...@suse.de]
> Sent: Thursday, July 18, 2013 8:50 PM
> To: Bhushan Bharat-R65777
> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
> Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
>
> On 18.07.2013, at 17:12, Bhushan Bharat-R65777 wrote:
> > > This needs a description. Why shouldn't we ignore E?
> >
> > What I understood is that there is no reason to stop guest setting E, so allow him.
>
> Please add that to the patch description. Also explain what the bit means.

Ok :)

-Bharat

>
> Alex
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. -Bharat Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. 
*/ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); N�r��yb�X��ǧv�^�){.n�+jir)w*jg����ݢj/���z�ޖ��2�ޙ�)ߡ�a�����G���h��j:+v���w��٥
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). 
For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? -Bharat Tiejun -Bharat Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html N�r��yb�X��ǧv�^�){.n�+jir)w*jg����ݢj/���z�ޖ��2�ޙ�)ߡ�a�����G���h��j:+v���w��٥
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 
mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) -Bharat Tiejun -Bharat Tiejun -Bharat Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '“tiejun.chen”' Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c 
+++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 & MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by the kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not; if it is a kernel-visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what those cases are and how checking PageReserved helps in them? No, I haven't seen these cases in QEMU myself either. But this should be chronically persistent as I understand ;-) Then I will wait until someone educates me :) The reason is that kvm_is_mmio_pfn() looks pretty heavy, and I do not want to call it for every tlbwe operation unless it is necessary. -Bharat + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 & MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings.
*/ stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe->mas2 = (gvaddr & MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe->mas2, pr); + e500_shadow_mas2_attrib(gtlbe->mas2, pfn); stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe->mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
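For readers following the thread, the attribute selection proposed in this patch can be modelled outside the kernel. This is only a user-space sketch, not the kernel code: the MAS2_* values below are assumed for illustration, `is_ram` stands in for `pfn_valid(pfn)`, and `smp` stands in for `CONFIG_SMP`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values (assumed here; check asm/mmu-book3e.h for the real ones). */
#define MAS2_X0 0x00000040u
#define MAS2_X1 0x00000020u
#define MAS2_I  0x00000008u /* cache inhibited */
#define MAS2_M  0x00000004u /* memory coherence required */
#define MAS2_G  0x00000002u /* guarded */
#define MAS2_ATTRIB_MASK (MAS2_X0 | MAS2_X1)

/*
 * Model of the proposed e500_shadow_mas2_attrib().
 * is_ram plays the role of pfn_valid(pfn); smp plays the role of CONFIG_SMP.
 */
static uint32_t shadow_mas2_attrib(uint32_t guest_mas2, bool is_ram, bool smp)
{
	/* Keep only the guest-controllable attribute bits. */
	uint32_t attr = guest_mas2 & MAS2_ATTRIB_MASK;

	if (!is_ram)
		attr |= MAS2_I | MAS2_G; /* no struct page: treat as I/O */
	else if (smp)
		attr |= MAS2_M;          /* RAM on SMP: coherent */

	/* The host never hands out both "inhibited" and "coherent RAM" at once. */
	assert(!((attr & MAS2_I) && (attr & MAS2_M)));
	return attr;
}
```

Calling `shadow_mas2_attrib(guest_mas2, false, true)` models a direct-assigned device page, and `shadow_mas2_attrib(guest_mas2, true, true)` models ordinary guest RAM on an SMP host.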
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, July 18, 2013 3:19 PM To: Bhushan Bharat-R65777 Cc: “tiejun.chen”; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '“tiejun.chen”' Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as 
I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { +u32 mas2_attr; + +mas2_attr = mas2 & MAS2_ATTRIB_MASK; + +if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by the kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not; if it is a kernel-visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what those cases are and how checking PageReserved helps in them? No, I haven't seen these cases in QEMU myself either. But this should be chronically persistent as I understand ;-) Then I will wait until someone educates me :) The reason is that kvm_is_mmio_pfn() looks pretty heavy, and I do not want to call it for every tlbwe operation unless it is necessary. It certainly does more than we need and potentially slows down the fast path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check for pages that are declared reserved on the host.
This happens in 2 cases:

1) Non cache coherent DMA
2) Memory hot remove

The non coherent DMA case would be interesting, as with the mechanism as it is in place in Linux today, we could potentially break normal guest operation if we don't take it into account. However, it's Kconfig guarded by:

	depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
	default n if PPC_47x
	default y

so we never hit it with any core we care about ;).

Memory hot remove does not exist on e500 FWIW, so we don't have to worry about that one either.

Which means I think it's fine
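To make the difference between the two checks concrete, here is a small user-space model. It is a simplification under stated assumptions: `struct pfn_model` and the `model_*` functions are hypothetical names, and the "full" check only captures the behaviour described above (pfn_valid() plus a reserved-page test), not the real kvm_is_mmio_pfn() implementation.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the host's view of one page frame. */
struct pfn_model {
	bool has_struct_page; /* what pfn_valid() would report */
	bool reserved;        /* what PageReserved() would report */
};

/* Simplified model of the heavier check: pfn_valid() plus the reserved test. */
static bool model_is_mmio_full(struct pfn_model p)
{
	if (p.has_struct_page)
		return p.reserved;
	return true;
}

/* The cheaper check used in the patch: !pfn_valid() only. */
static bool model_is_mmio_fast(struct pfn_model p)
{
	return !p.has_struct_page;
}

/* True when the two checks disagree. */
static bool model_checks_differ(struct pfn_model p)
{
	return model_is_mmio_full(p) != model_is_mmio_fast(p);
}
```

In this model the two checks disagree only for a page that has a struct page and is also marked reserved, which is the case the thread argues does not arise on e500.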
RE: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Thursday, July 18, 2013 8:50 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2 On 18.07.2013, at 17:12, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, July 18, 2013 8:18 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2 This needs a description. Why shouldn't we ignore E? What I understood is that there is no reason to stop the guest from setting E, so allow it. Please add that to the patch description. Also explain what the bit means. Ok :) -Bharat Alex
RE: [PATCH 3/5] booke: define reset and shutdown hcalls
On 17.07.2013, at 13:00, Gleb Natapov wrote: On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote: On 07/16/2013 01:35:55 AM, Gleb Natapov wrote: On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote: On 07/15/2013 06:30:20 AM, Gleb Natapov wrote: There is no much sense to share hypercalls between architectures. There is zero probability x86 will implement those for instance This is similar to the question of whether to keep device API enumerations per-architecture... It costs very little to keep it in a common place, and it's hard to go back in the other direction if we later realize there are things that should be shared. This is different from device API since with device API all arches have to create/destroy devices, so it make sense to put device lifecycle management into the common code, and device API has single entry point to the code - device fd ioctl - where it makes sense to handle common tasks, if any, and despatch others to specific device implementation. This is totally unlike hypercalls which are, by definition, very architecture specific (the way they are triggered, the way parameter are passed from guest to host, what hypercalls arch needs...). The ABI is architecture specific. The API doesn't need to be, any more than it does with syscalls (I consider the architecture-specific definition of syscall numbers and similar constants in Linux to be unfortunate, especially for tools such as strace or QEMU's linux-user emulation). Unlike syscalls different arches have very different ideas what hypercalls they need to implement, so while with unified syscall space I can see how it may benefit (very) small number of tools, I do not see what advantage it will give us. The disadvantage is one more global name space to manage. Keeping it in a common place also makes it more visible to people looking to add new hcalls, which could cut down on reinventing the wheel. 
I do not want other arches to start using hypercalls in the way powerpc started to use them: separate device io space, so it is better to hide this as far away from common code as possible :) But on a more serious note hypercalls should be a last resort and added only when no other possibility exists, so people should not look what hcalls others implemented, so they can add them to their favorite arch, but they should have a problem at hand that they cannot solve without hcall, but at this point they will have pretty good idea what this hcall should do. Why are hcalls such a bad thing? Because they often used to do non architectural things making OSes behave different from how they runs on real HW and real HW is what OSes are designed and tested for. Example: there once was a KVM (XEN have/had similar one) hypercall to accelerate MMU operation. One thing it allowed is to to flush tlb without doing IPI if vcpu is not running. Later optimization was added to Linux MMU code that _relies_ on those IPIs for synchronisation. Good that at that point those hypercalls were already deprecated on KVM (IIRC XEN was broke for some time in that regard). Which brings me to another point: they often get obsoleted by code improvement and HW advancement (happened to aforementioned MMU hypercalls), but they hard to deprecate if hypervisor supports live migration, without live migration it is less of a problem. Next point is that people often try to use them instead of emulate PV or real device just because they think it is easier, but it is often not so. Example: pvpanic device was initially proposed as hypercall, so lets say we would implement it as such. It would have been KVM specific, implementation would touch core guest KVM code and would have been Linux guest specific. Instead it was implemented as platform device with very small platform driver confined in drivers/ directory, immediately usable by XEN and QEMU tcg in addition This is actually a very good point. 
How do we support reboot and shutdown for TCG guests? We surely don't want to expose TCG as KVM hypervisor. Hmm...so are you proposing that we abandon the current approach, and switch to a device-based mechanism for reboot/shutdown? Reading Gleb's email it sounds like the more future proof approach, yes. I'm not quite sure yet where we should plug this though. What do you mean...where the paravirt device would go in the physical address map?? Right. Either we - let the guest decide (PCI) - let QEMU decide, but potentially break the SoC layout (SysBus) - let QEMU decide, but only for the virt machine so that we don't break anyone (PlatBus) Can you please elaborate above two points ? -Bharat Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 3/5] booke: define reset and shutdown hcalls
On 17.07.2013, at 13:00, Gleb Natapov wrote: On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote: On 07/16/2013 01:35:55 AM, Gleb Natapov wrote: On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote: On 07/15/2013 06:30:20 AM, Gleb Natapov wrote: There is no much sense to share hypercalls between architectures. There is zero probability x86 will implement those for instance This is similar to the question of whether to keep device API enumerations per-architecture... It costs very little to keep it in a common place, and it's hard to go back in the other direction if we later realize there are things that should be shared. This is different from device API since with device API all arches have to create/destroy devices, so it make sense to put device lifecycle management into the common code, and device API has single entry point to the code - device fd ioctl - where it makes sense to handle common tasks, if any, and despatch others to specific device implementation. This is totally unlike hypercalls which are, by definition, very architecture specific (the way they are triggered, the way parameter are passed from guest to host, what hypercalls arch needs...). The ABI is architecture specific. The API doesn't need to be, any more than it does with syscalls (I consider the architecture-specific definition of syscall numbers and similar constants in Linux to be unfortunate, especially for tools such as strace or QEMU's linux-user emulation). Unlike syscalls different arches have very different ideas what hypercalls they need to implement, so while with unified syscall space I can see how it may benefit (very) small number of tools, I do not see what advantage it will give us. The disadvantage is one more global name space to manage. Keeping it in a common place also makes it more visible to people looking to add new hcalls, which could cut down on reinventing the wheel. 
I do not want other arches to start using hypercalls in the way powerpc started to use them: separate device io space, so it is better to hide this as far away from common code as possible :) But on a more serious note hypercalls should be a last resort and added only when no other possibility exists, so people should not look what hcalls others implemented, so they can add them to their favorite arch, but they should have a problem at hand that they cannot solve without hcall, but at this point they will have pretty good idea what this hcall should do. Why are hcalls such a bad thing? Because they often used to do non architectural things making OSes behave different from how they runs on real HW and real HW is what OSes are designed and tested for. Example: there once was a KVM (XEN have/had similar one) hypercall to accelerate MMU operation. One thing it allowed is to to flush tlb without doing IPI if vcpu is not running. Later optimization was added to Linux MMU code that _relies_ on those IPIs for synchronisation. Good that at that point those hypercalls were already deprecated on KVM (IIRC XEN was broke for some time in that regard). Which brings me to another point: they often get obsoleted by code improvement and HW advancement (happened to aforementioned MMU hypercalls), but they hard to deprecate if hypervisor supports live migration, without live migration it is less of a problem. Next point is that people often try to use them instead of emulate PV or real device just because they think it is easier, but it is often not so. Example: pvpanic device was initially proposed as hypercall, so lets say we would implement it as such. It would have been KVM specific, implementation would touch core guest KVM code and would have been Linux guest specific. Instead it was implemented as platform device with very small platform driver confined in drivers/ directory, immediately usable by XEN and QEMU tcg in addition This is actually a very good point. 
How do we support reboot and shutdown for TCG guests? We surely don't want to expose TCG as KVM hypervisor. Hmm...so are you proposing that we abandon the current approach, and switch to a device-based mechanism for reboot/shutdown? Reading Gleb's email it sounds like the more future proof approach, yes. I'm not quite sure yet where we should plug this though. What do you mean...where the paravirt device would go in the physical address map?? Right. Either we - let the guest decide (PCI) - let QEMU decide, but potentially break the SoC layout (SysBus) - let QEMU decide, but only for the virt machine so that we don't break anyone (PlatBus) Can you please elaborate above two points ? If we emulate an MPC8544DS, we need to emulate an MPC8544DS. Any time we diverge from the layout of the original chip, things can break. However, for our PV machine (-M ppce500 /
RE: [PATCH 1/5] powerpc: define ePAPR hcall exit interface
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 4:51 PM To: Bhushan Bharat-R65777 Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777 Subject: Re: [PATCH 1/5] powerpc: define ePAPR hcall exit interface On 15.07.2013, at 13:11, Bharat Bhushan wrote: This patch defines the ePAPR hcall exit interface to guest user space. The subject line is misleading. This is a kvm patch. Same applies for most other patches. Ok, will make this kvm: powerpc: define ePAPR hcall exit interface Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- Documentation/virtual/kvm/api.txt | 20 include/uapi/linux/kvm.h |7 +++ 2 files changed, 27 insertions(+), 0 deletions(-) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 66dd2aa..054f2f4 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2597,6 +2597,26 @@ The possible hypercalls are defined in the Power Architecture Platform Requirements (PAPR) document available from www.power.org (free developer registration required to access it). + /* KVM_EXIT_EPAPR_HCALL */ + struct { + __u64 nr; + __u64 ret; + __u64 args[8]; + } epapr_hcall; + +This is used on PowerPC platforms that support ePAPR hcalls. +It occurs when a guest does a hypercall (as defined in the ePAPR 1.1) +and the hcall is not handled by the kernel. + +The 'nr' field contains the hypercall number (from the guest R11), +and 'args' contains the arguments (from the guest R3 - R10). +Userspace should put the return code in 'ret' and any extra returned +values in args[]. If the VM is not in 64-bit mode KVM zeros the +upper half of each field in the struct. + +As per the ePAPR hcall ABI, the return value is returned to the guest +in R3 and output return values in R4 - R10. 
+ /* KVM_EXIT_S390_TSCH */ struct { __u16 subchannel_id; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index acccd08..01ee50e 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -171,6 +171,7 @@ struct kvm_pit_config { #define KVM_EXIT_WATCHDOG 21 #define KVM_EXIT_S390_TSCH 22 #define KVM_EXIT_EPR 23 +#define KVM_EXIT_EPAPR_HCALL 24 /* For KVM_EXIT_INTERNAL_ERROR */ /* Emulate instruction failed. */ @@ -288,6 +289,12 @@ struct kvm_run { __u64 ret; __u64 args[9]; } papr_hcall; + /* KVM_EXIT_EPAPR_HCALL */ + struct { + __u64 nr; + __u64 ret; + __u64 args[8]; + } epapr_hcall; This should be at the end of the union. Ok. -Bharat Alex /* KVM_EXIT_S390_TSCH */ struct { __u16 subchannel_id; -- 1.7.0.4
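As an illustration of how user space might consume the proposed exit, here is a rough sketch. Everything prefixed `demo_`/`DEMO_` is hypothetical and not part of the patch: the struct only mirrors the proposed `epapr_hcall` member of `struct kvm_run`, and the hcall number and return codes are made-up values for illustration.

```c
#include <stdint.h>

/* Mirror of the proposed epapr_hcall member of struct kvm_run. */
struct demo_epapr_hcall {
	uint64_t nr;      /* hypercall number, from guest R11 */
	uint64_t ret;     /* return code, delivered to the guest in R3 */
	uint64_t args[8]; /* arguments, from guest R3 - R10 */
};

/* Hypothetical values, for illustration only. */
#define DEMO_HCALL_RESET 1u
#define DEMO_RET_SUCCESS 0u
#define DEMO_RET_UNIMPL  12u

/*
 * Handle one hcall exit in user space: act on the calls we know,
 * and report everything else back to the guest as unimplemented.
 */
static void demo_handle_epapr_hcall(struct demo_epapr_hcall *h,
				    int *reset_requested)
{
	switch (h->nr) {
	case DEMO_HCALL_RESET:
		*reset_requested = 1;
		h->ret = DEMO_RET_SUCCESS;
		break;
	default:
		h->ret = DEMO_RET_UNIMPL;
		break;
	}
}
```

A VMM run loop would call something like this when the exit reason is KVM_EXIT_EPAPR_HCALL, then re-enter the guest so that the kernel can copy `ret` and any output values back into the guest registers.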
RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 5:02 PM To: Bhushan Bharat-R65777 Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm On 15.07.2013, at 13:11, Bharat Bhushan wrote: Exit to guest user space if kvm does not implement the hcall. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/booke.c | 47 +-- arch/powerpc/kvm/powerpc.c |1 + include/uapi/linux/kvm.h |1 + 3 files changed, 42 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 17722d8..c8b41b4 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, break; #ifdef CONFIG_KVM_BOOKE_HV - case BOOKE_INTERRUPT_HV_SYSCALL: + case BOOKE_INTERRUPT_HV_SYSCALL: { This is getting large. Please extract hcall handling into its own function. Maybe you can merge the HV and non-HV case then too. 
+ int i; if (!(vcpu->arch.shared->msr & MSR_PR)) { - kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu)); + r = kvmppc_kvm_pv(vcpu); + if (r != EV_UNIMPLEMENTED) { + /* except unimplemented return to guest */ + kvmppc_set_gpr(vcpu, 3, r); + kvmppc_account_exit(vcpu, SYSCALL_EXITS); + r = RESUME_GUEST; + break; + } + /* Exit to userspace for unimplemented hcalls in kvm */ + run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11); + run->epapr_hcall.ret = 0; + for (i = 0; i < 8; i++) + run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i); + vcpu->arch.hcall_needed = 1; + kvmppc_account_exit(vcpu, SYSCALL_EXITS); + r = RESUME_HOST; } else { /* * hcall from guest userspace -- send privileged @@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, kvmppc_core_queue_program(vcpu, ESR_PPR); } - r = RESUME_GUEST; + run->exit_reason = KVM_EXIT_EPAPR_HCALL; Oops, what have I done; I wanted this to be kvmppc_account_exit(vcpu, SYSCALL_EXITS); s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/ kvmppc_account_exit(vcpu, SYSCALL_EXITS); -Bharat This looks odd. Your exit reason only changes when you do the hcall exiting, right? You also need to guard user space hcall exits with an ENABLE_CAP. Otherwise older user space will break, as it doesn't know about the exit type yet. So user space should also call enable_cap?
-Bharat Alex break; + } #else - case BOOKE_INTERRUPT_SYSCALL: + case BOOKE_INTERRUPT_SYSCALL: { + int i; + r = RESUME_GUEST; if (!(vcpu->arch.shared->msr & MSR_PR) && (((u32)kvmppc_get_gpr(vcpu, 0)) == KVM_SC_MAGIC_R0)) { /* KVM PV hypercalls */ - kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu)); - r = RESUME_GUEST; + r = kvmppc_kvm_pv(vcpu); + if (r != EV_UNIMPLEMENTED) { + /* except unimplemented return to guest */ + kvmppc_set_gpr(vcpu, 3, r); + kvmppc_account_exit(vcpu, SYSCALL_EXITS); + r = RESUME_GUEST; + break; + } + /* Exit to userspace for unimplemented hcalls in kvm */ + run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11); + run->epapr_hcall.ret = 0; + for (i = 0; i < 8; i++) + run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i); + vcpu->arch.hcall_needed = 1; + run->exit_reason = KVM_EXIT_EPAPR_HCALL; + r = RESUME_HOST; } else { /* Guest syscalls */ kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_SYSCALL); } kvmppc_account_exit(vcpu, SYSCALL_EXITS); - r = RESUME_GUEST; break; + } #endif case BOOKE_INTERRUPT_DTLB_MISS: { diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 4e05f8c..6c6199d 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch
RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 5:16 PM To: Bhushan Bharat-R65777 Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 5:02 PM To: Bhushan Bharat-R65777 Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm On 15.07.2013, at 13:11, Bharat Bhushan wrote: Exit to guest user space if kvm does not implement the hcall. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/booke.c | 47 +- - arch/powerpc/kvm/powerpc.c |1 + include/uapi/linux/kvm.h |1 + 3 files changed, 42 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 17722d8..c8b41b4 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, break; #ifdef CONFIG_KVM_BOOKE_HV - case BOOKE_INTERRUPT_HV_SYSCALL: + case BOOKE_INTERRUPT_HV_SYSCALL: { This is getting large. Please extract hcall handling into its own function. Maybe you can merge the HV and non-HV case then too. 
+ int i; if (!(vcpu->arch.shared->msr & MSR_PR)) { - kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu)); + r = kvmppc_kvm_pv(vcpu); + if (r != EV_UNIMPLEMENTED) { + /* except unimplemented return to guest */ + kvmppc_set_gpr(vcpu, 3, r); + kvmppc_account_exit(vcpu, SYSCALL_EXITS); + r = RESUME_GUEST; + break; + } + /* Exit to userspace for unimplemented hcalls in kvm */ + run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11); + run->epapr_hcall.ret = 0; + for (i = 0; i < 8; i++) + run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i); + vcpu->arch.hcall_needed = 1; + kvmppc_account_exit(vcpu, SYSCALL_EXITS); + r = RESUME_HOST; } else { /* * hcall from guest userspace -- send privileged @@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, kvmppc_core_queue_program(vcpu, ESR_PPR); } - r = RESUME_GUEST; + run->exit_reason = KVM_EXIT_EPAPR_HCALL; Oops, what have I done; I wanted this to be kvmppc_account_exit(vcpu, SYSCALL_EXITS); s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/ kvmppc_account_exit(vcpu, SYSCALL_EXITS); -Bharat This looks odd. Your exit reason only changes when you do the hcall exiting, right? You also need to guard user space hcall exits with an ENABLE_CAP. Otherwise older user space will break, as it doesn't know about the exit type yet. So user space should also call enable_cap? User space needs to call enable_cap on this cap, yes. Otherwise a guest can confuse user space with an hcall exit it can't handle. We do not have enable_cap for book3s; is there any specific reason why? -Bharat Alex
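The two review requests above (pull the hcall handling into its own function, and only exit to user space once it has opted in) could look roughly like the following. This is a stand-alone model under assumptions, not kernel code: `struct demo_vcpu`, `demo_handle_hcall()`, the `epapr_hcall_exit_enabled` flag and the value of `DEMO_EV_UNIMPLEMENTED` are all hypothetical, and `pv_result` stands in for the return value of kvmppc_kvm_pv().

```c
#include <stdbool.h>
#include <stdint.h>

enum demo_resume { DEMO_RESUME_GUEST, DEMO_RESUME_HOST };

#define DEMO_EV_UNIMPLEMENTED 12u /* assumed value, for illustration */

struct demo_vcpu {
	uint64_t gpr[32];
	bool epapr_hcall_exit_enabled; /* set when user space enables the capability */
	/* exit record that would live in struct kvm_run */
	bool hcall_needed;
	uint64_t nr;
	uint64_t args[8];
};

/*
 * One handler shared by the HV and non-HV paths.
 * pv_result is what the in-kernel hcall handler returned.
 */
static enum demo_resume demo_handle_hcall(struct demo_vcpu *v, uint64_t pv_result)
{
	int i;

	/*
	 * Handled in the kernel, or user space never opted in:
	 * hand the result to the guest and keep running it.
	 */
	if (pv_result != DEMO_EV_UNIMPLEMENTED || !v->epapr_hcall_exit_enabled) {
		v->gpr[3] = pv_result;
		return DEMO_RESUME_GUEST;
	}

	/* Unimplemented in the kernel and user space opted in: exit. */
	v->nr = v->gpr[11];
	for (i = 0; i < 8; i++)
		v->args[i] = v->gpr[3 + i];
	v->hcall_needed = true;
	return DEMO_RESUME_HOST;
}
```

With the flag clear, an older user space never sees the new exit type and the guest simply gets "unimplemented" back, which is the compatibility concern raised in the review.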
RE: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Monday, July 15, 2013 5:20 PM
To: Bhushan Bharat-R65777
Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777
Subject: Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset

On 15.07.2013, at 13:11, Bharat Bhushan wrote:

Detect the availability of the reset hcalls by looking at kvm,has-reset property on the /hypervisor node in the device tree passed to the VM and patches the reset mechanism to use reset hcall.

This patch uses the reser hcall when kvm,has-reset is there in

Your patch description is pretty broken :).

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/kernel/epapr_paravirt.c | 12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/epapr_paravirt.c b/arch/powerpc/kernel/epapr_paravirt.c
index d44a571..651d701 100644
--- a/arch/powerpc/kernel/epapr_paravirt.c
+++ b/arch/powerpc/kernel/epapr_paravirt.c
@@ -22,6 +22,8 @@
 #include <asm/cacheflush.h>
 #include <asm/code-patching.h>
 #include <asm/machdep.h>
+#include <asm/kvm_para.h>
+#include <asm/kvm_host.h>

Why would we need kvm_host.h? This is guest code.

 #if !defined(CONFIG_64BIT) || defined(CONFIG_PPC_BOOK3E_64)
 extern void epapr_ev_idle(void);
@@ -30,6 +32,14 @@ extern u32 epapr_ev_idle_start[];

 bool epapr_paravirt_enabled;

+void epapr_hypercall_reset(char *cmd)
+{
+	long ret;
+	ret = kvm_hypercall0(KVM_HC_VM_RESET);

Is this available without CONFIG_KVM_GUEST?

kvm_hypercall() simply returns unimplemented for everything when that config option is not set.

We are here because we patched the ppc_md.restart to point to new handler. So I think we should patch the ppc_md.restart only if CONFIG_KVM_GUEST is true.

+	printk("error: system reset returned with error %ld\n", ret);

So we should fall back to the normal reset handler here.

Do you mean return normally from here, no BUG() etc?

-Bharat

Alex

+	BUG();
+}
+
 static int __init epapr_paravirt_init(void)
 {
 	struct device_node *hyper_node;
@@ -58,6 +68,8 @@ static int __init epapr_paravirt_init(void)
 	if (of_get_property(hyper_node, "has-idle", NULL))
 		ppc_md.power_save = epapr_ev_idle;
 #endif
+	if (of_get_property(hyper_node, "kvm,has-reset", NULL))
+		ppc_md.restart = epapr_hypercall_reset;

 	epapr_paravirt_enabled = true;
--
1.7.0.4
RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Monday, July 15, 2013 8:27 PM
To: Bhushan Bharat-R65777
Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm

On 15.07.2013, at 16:50, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Monday, July 15, 2013 5:16 PM
To: Bhushan Bharat-R65777
Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm

On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Monday, July 15, 2013 5:02 PM
To: Bhushan Bharat-R65777
Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm

On 15.07.2013, at 13:11, Bharat Bhushan wrote:

Exit to guest user space if kvm does not implement the hcall.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/kvm/booke.c   | 47 +-
 arch/powerpc/kvm/powerpc.c |  1 +
 include/uapi/linux/kvm.h   |  1 +
 3 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 17722d8..c8b41b4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		break;

 #ifdef CONFIG_KVM_BOOKE_HV
-	case BOOKE_INTERRUPT_HV_SYSCALL:
+	case BOOKE_INTERRUPT_HV_SYSCALL: {

This is getting large. Please extract hcall handling into its own function. Maybe you can merge the HV and non-HV case then too.

+		int i;
 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
-			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
+			r = kvmppc_kvm_pv(vcpu);
+			if (r != EV_UNIMPLEMENTED) {
+				/* except unimplemented return to guest */
+				kvmppc_set_gpr(vcpu, 3, r);
+				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
+				r = RESUME_GUEST;
+				break;
+			}
+			/* Exit to userspace for unimplemented hcalls in kvm */
+			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
+			run->epapr_hcall.ret = 0;
+			for (i = 0; i < 8; i++)
+				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
+			vcpu->arch.hcall_needed = 1;
+			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
+			r = RESUME_HOST;
 		} else {
 			/*
 			 * hcall from guest userspace -- send privileged
@@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 			kvmppc_core_queue_program(vcpu, ESR_PPR);
 		}

-		r = RESUME_GUEST;
+		run->exit_reason = KVM_EXIT_EPAPR_HCALL;

Oops, my mistake. I wanted this to be kvmppc_account_exit(vcpu, SYSCALL_EXITS);

s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/ kvmppc_account_exit(vcpu, SYSCALL_EXITS);

-Bharat

This looks odd. Your exit reason only changes when you do the hcall exiting, right?

You also need to guard user space hcall exits with an ENABLE_CAP. Otherwise older user space will break, as it doesn't know about the exit type yet.

So user space has to call enable_cap as well?

User space needs to call enable_cap on this cap, yes. Otherwise a guest can confuse user space with an hcall exit it can't handle.

We do not have enable_cap for book3s. Is there any specific reason why?

We do. If you enable PAPR, you get PAPR hcalls. If you enable OSI, you get OSI hcalls.

Oh, we check this on book3s_PR and book3s_HV.

KVM hcalls on book3s don't return to user space.

It does exit, doesn't it? arch/powerpc/kvm/book3s_pr.c exits with KVM_EXIT_PAPR_HCALL, and the same happens in book3s_hv.

Btw, adding this on booke is not in question. I am just trying to understand book3s.

-Bharat

Which is something we probably want to change along with this patch set.

Alex
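To make the ENABLE_CAP point above concrete, here is a minimal, self-contained C sketch of the gating being asked for. It is not kernel code: `struct sketch_vcpu`, `sketch_handle_hcall()` and the `SKETCH_*` constants are hypothetical stand-ins invented for illustration, and the value chosen for the "unimplemented" code is a placeholder. The idea it models is that an unimplemented hcall is only forwarded to user space if user space opted in, and is otherwise reported back to the guest as before.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real constants live in kernel headers. */
enum sketch_resume { SKETCH_RESUME_GUEST, SKETCH_RESUME_HOST };
#define SKETCH_EV_UNIMPLEMENTED 12L	/* placeholder value */

struct sketch_vcpu {
	bool hcall_cap_enabled;	/* set once user space enabled the cap */
	long r3;		/* models the guest return register */
	bool exit_is_hcall;	/* models exit_reason being the hcall exit */
};

/*
 * pv_result models the value returned by the in-kernel hcall handler.
 * Only an unimplemented hcall on a vcpu whose user space opted in is
 * forwarded to user space; everything else goes back to the guest.
 */
static enum sketch_resume sketch_handle_hcall(struct sketch_vcpu *vcpu,
					      long pv_result)
{
	assert(vcpu != NULL);
	vcpu->exit_is_hcall = false;

	if (pv_result != SKETCH_EV_UNIMPLEMENTED || !vcpu->hcall_cap_enabled) {
		vcpu->r3 = pv_result;
		return SKETCH_RESUME_GUEST;
	}

	vcpu->exit_is_hcall = true;
	return SKETCH_RESUME_HOST;
}
```

With the flag clear, an older user space never sees the new exit type and the guest simply gets the "unimplemented" code in r3, which is the compatibility property the review comment is after.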
RE: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Monday, July 15, 2013 8:40 PM
To: Bhushan Bharat-R65777
Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
Subject: Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset

On 15.07.2013, at 17:05, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Monday, July 15, 2013 5:20 PM
To: Bhushan Bharat-R65777
Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777
Subject: Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset

On 15.07.2013, at 13:11, Bharat Bhushan wrote:

Detect the availability of the reset hcalls by looking at kvm,has-reset property on the /hypervisor node in the device tree passed to the VM and patches the reset mechanism to use reset hcall.

This patch uses the reser hcall when kvm,has-reset is there in

Your patch description is pretty broken :).

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/kernel/epapr_paravirt.c | 12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/epapr_paravirt.c b/arch/powerpc/kernel/epapr_paravirt.c
index d44a571..651d701 100644
--- a/arch/powerpc/kernel/epapr_paravirt.c
+++ b/arch/powerpc/kernel/epapr_paravirt.c
@@ -22,6 +22,8 @@
 #include <asm/cacheflush.h>
 #include <asm/code-patching.h>
 #include <asm/machdep.h>
+#include <asm/kvm_para.h>
+#include <asm/kvm_host.h>

Why would we need kvm_host.h? This is guest code.

 #if !defined(CONFIG_64BIT) || defined(CONFIG_PPC_BOOK3E_64)
 extern void epapr_ev_idle(void);
@@ -30,6 +32,14 @@ extern u32 epapr_ev_idle_start[];

 bool epapr_paravirt_enabled;

+void epapr_hypercall_reset(char *cmd)
+{
+	long ret;
+	ret = kvm_hypercall0(KVM_HC_VM_RESET);

Is this available without CONFIG_KVM_GUEST?

kvm_hypercall() simply returns unimplemented for everything when that config option is not set.

We are here because we patched the ppc_md.restart to point to new handler. So I think we should patch the ppc_md.restart only if CONFIG_KVM_GUEST is true.

We should only patch it if kvm_para_available(). That should guard us against everything.

+	printk("error: system reset returned with error %ld\n", ret);

So we should fall back to the normal reset handler here.

Do you mean return normally from here, no BUG() etc?

If we guard the patching against everything, we can treat a broken hcall as BUG. However, if we don't we want to fall back to the normal guts based reset.

Will let Scott comment on this?

But ppc_md.restart can point to only one handler and during paravirt patching we changed this to new handler. So we cannot jump back to guts type handler

-Bharat
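The single-handler constraint raised above can be pictured with a small, self-contained C sketch. This is purely an illustration and not something proposed in the thread: `md_sketch`, `saved_restart`, `patch_restart_sketch()` and the other names are hypothetical, and the hcall is replaced by a stub that always fails. The sketch assumes the patching code remembers the previous hook before overwriting it, which is one conceivable way a paravirt handler could still reach the platform reset if the hcall returns.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*restart_fn)(char *cmd);

/* Hypothetical stand-in for the machine-description restart hook. */
struct machdep_sketch {
	restart_fn restart;
};

static struct machdep_sketch md_sketch;
static restart_fn saved_restart;	/* hook that was installed before patching */
static int hcall_attempts;
static int fallback_calls;

/* Stub for the reset hcall; it always "fails" so the fallback path runs. */
static long reset_hcall_sketch(void)
{
	hcall_attempts++;
	return -1;
}

/* Stub for the platform (for example guts based) reset handler. */
static void platform_restart_sketch(char *cmd)
{
	(void)cmd;
	fallback_calls++;
}

static void paravirt_restart_sketch(char *cmd)
{
	long ret = reset_hcall_sketch();

	(void)ret;	/* a real handler would log the error code here */

	/* Only reached if the hcall returned instead of resetting the VM. */
	assert(saved_restart != paravirt_restart_sketch);
	if (saved_restart)
		saved_restart(cmd);
}

/* Install the paravirt handler, remembering the hook it replaces. */
static void patch_restart_sketch(int hcall_available)
{
	if (!hcall_available || md_sketch.restart == paravirt_restart_sketch)
		return;
	saved_restart = md_sketch.restart;
	md_sketch.restart = paravirt_restart_sketch;
}
```

Whether a fallback like this is preferable to guarding the patching and treating a returning hcall as a bug is exactly the open question in the exchange above; the sketch takes no side on it.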
RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Monday, July 15, 2013 8:59 PM
To: Bhushan Bharat-R65777
Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm

On 15.07.2013, at 17:13, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Monday, July 15, 2013 8:27 PM
To: Bhushan Bharat-R65777
Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm

On 15.07.2013, at 16:50, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Monday, July 15, 2013 5:16 PM
To: Bhushan Bharat-R65777
Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm

On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Monday, July 15, 2013 5:02 PM
To: Bhushan Bharat-R65777
Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm

On 15.07.2013, at 13:11, Bharat Bhushan wrote:

Exit to guest user space if kvm does not implement the hcall.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/kvm/booke.c   | 47 +-
 arch/powerpc/kvm/powerpc.c |  1 +
 include/uapi/linux/kvm.h   |  1 +
 3 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 17722d8..c8b41b4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		break;

 #ifdef CONFIG_KVM_BOOKE_HV
-	case BOOKE_INTERRUPT_HV_SYSCALL:
+	case BOOKE_INTERRUPT_HV_SYSCALL: {

This is getting large. Please extract hcall handling into its own function. Maybe you can merge the HV and non-HV case then too.

+		int i;
 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
-			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
+			r = kvmppc_kvm_pv(vcpu);
+			if (r != EV_UNIMPLEMENTED) {
+				/* except unimplemented return to guest */
+				kvmppc_set_gpr(vcpu, 3, r);
+				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
+				r = RESUME_GUEST;
+				break;
+			}
+			/* Exit to userspace for unimplemented hcalls in kvm */
+			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
+			run->epapr_hcall.ret = 0;
+			for (i = 0; i < 8; i++)
+				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
+			vcpu->arch.hcall_needed = 1;
+			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
+			r = RESUME_HOST;
 		} else {
 			/*
 			 * hcall from guest userspace -- send privileged
@@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 			kvmppc_core_queue_program(vcpu, ESR_PPR);
 		}

-		r = RESUME_GUEST;
+		run->exit_reason = KVM_EXIT_EPAPR_HCALL;

Oops, my mistake. I wanted this to be kvmppc_account_exit(vcpu, SYSCALL_EXITS);

s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/ kvmppc_account_exit(vcpu, SYSCALL_EXITS);

-Bharat

This looks odd. Your exit reason only changes when you do the hcall exiting, right?

You also need to guard user space hcall exits with an ENABLE_CAP. Otherwise older user space will break, as it doesn't know about the exit type yet.

So user space has to call enable_cap as well?

User space needs to call enable_cap on this cap, yes. Otherwise a guest can confuse user space with an hcall exit it can't handle.

We do not have enable_cap for book3s. Is there any specific reason why?

We do. If you enable PAPR, you get PAPR hcalls. If you enable OSI, you get OSI hcalls.

Oh, we check this on book3s_PR and book3s_HV.

KVM hcalls on book3s don't return to user space.

It does exit, doesn't it? arch/powerpc/kvm/book3s_pr.c exits with KVM_EXIT_PAPR_HCALL, and the same happens in book3s_hv.

It doesn't even start
RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
-----Original Message-----
From: Wood Scott-B07421
Sent: Monday, July 15, 2013 11:38 PM
To: Bhushan Bharat-R65777
Cc: kvm@vger.kernel.org; kvm-...@vger.kernel.org; ag...@suse.de; Yoder Stuart-B08248; Bhushan Bharat-R65777; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm

On 07/15/2013 06:11:16 AM, Bharat Bhushan wrote:

Exit to guest user space if kvm does not implement the hcall.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/kvm/booke.c   | 47 +-
 arch/powerpc/kvm/powerpc.c |  1 +
 include/uapi/linux/kvm.h   |  1 +
 3 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 17722d8..c8b41b4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		break;

 #ifdef CONFIG_KVM_BOOKE_HV
-	case BOOKE_INTERRUPT_HV_SYSCALL:
+	case BOOKE_INTERRUPT_HV_SYSCALL: {
+		int i;
 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
-			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
+			r = kvmppc_kvm_pv(vcpu);
+			if (r != EV_UNIMPLEMENTED) {
+				/* except unimplemented return to guest */
+				kvmppc_set_gpr(vcpu, 3, r);
+				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
+				r = RESUME_GUEST;
+				break;
+			}
+			/* Exit to userspace for unimplemented hcalls in kvm */
+			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
+			run->epapr_hcall.ret = 0;
+			for (i = 0; i < 8; i++)
+				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);

You need to clear the upper half of each register if CONFIG_PPC64=y and MSR_CM is not set.

+			vcpu->arch.hcall_needed = 1;

The existing code for hcall_needed restores 9 return arguments, rather than the 8 that are defined for this interface. Thus, you'll be restoring one word of padding into the guest -- which could be arbitrary userspace data that shouldn't be leaked. r12 is volatile in the ePAPR hcall ABI so simply clobbering it isn't a problem, though.

Oops; Not just that, currently this uses struct type papr_hcall while on booke we should use epapr_hcall. I will make a function which will be defined in book3s.c and booke.c to setup hcall return registers accordingly.

-Bharat

-Scott
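Scott's two points (truncate registers when the guest is not in 64-bit mode, and copy exactly the 8 arguments the interface defines) can be sketched in standalone C as follows. This is an illustrative model only: `struct epapr_hcall_sketch`, `guest_reg()` and `fill_epapr_hcall()` are hypothetical names, the guest register file is modelled as a plain array, and the `mode_64bit` flag stands in for the "CONFIG_PPC64=y and MSR_CM set" condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EPAPR_HCALL_NARGS 8	/* r3..r10, as defined for this interface */

/* Hypothetical stand-in for the epapr_hcall member of struct kvm_run. */
struct epapr_hcall_sketch {
	uint64_t nr;
	uint64_t ret;
	uint64_t args[EPAPR_HCALL_NARGS];
};

/* Keep the full value in 64-bit mode, otherwise only the low 32 bits. */
static uint64_t guest_reg(uint64_t raw, bool mode_64bit)
{
	return mode_64bit ? raw : (raw & 0xffffffffULL);
}

/*
 * gpr[] models the 32 guest GPRs: r11 carries the hcall number and
 * r3..r10 carry the arguments. Exactly EPAPR_HCALL_NARGS values are
 * copied, so nothing beyond the defined interface is touched.
 */
static void fill_epapr_hcall(struct epapr_hcall_sketch *h,
			     const uint64_t gpr[32], bool mode_64bit)
{
	size_t i;

	assert(h != NULL && gpr != NULL);

	h->nr = guest_reg(gpr[11], mode_64bit);
	h->ret = 0;
	for (i = 0; i < EPAPR_HCALL_NARGS; i++)
		h->args[i] = guest_reg(gpr[3 + i], mode_64bit);
}
```

The return path would need the mirror image of this helper with the same argument count, which is the mismatch (9 restored versus 8 defined) called out in the review.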
RE: [PATCH 1/5] powerpc: define ePAPR hcall exit interface
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Monday, July 15, 2013 4:51 PM
To: Bhushan Bharat-R65777
Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777
Subject: Re: [PATCH 1/5] powerpc: define ePAPR hcall exit interface

On 15.07.2013, at 13:11, Bharat Bhushan wrote:

This patch defines the ePAPR hcall exit interface to guest user space.

The subject line is misleading. This is a kvm patch. Same applies for most other patches.

Ok, will make this "kvm: powerpc: define ePAPR hcall exit interface"

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 Documentation/virtual/kvm/api.txt | 20 ++++++++++++++++++++
 include/uapi/linux/kvm.h          |  7 +++++++
 2 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 66dd2aa..054f2f4 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2597,6 +2597,26 @@ The possible hypercalls are defined in the Power Architecture Platform
 Requirements (PAPR) document available from www.power.org (free
 developer registration required to access it).

+		/* KVM_EXIT_EPAPR_HCALL */
+		struct {
+			__u64 nr;
+			__u64 ret;
+			__u64 args[8];
+		} epapr_hcall;
+
+This is used on PowerPC platforms that support ePAPR hcalls.
+It occurs when a guest does a hypercall (as defined in the ePAPR 1.1)
+and the hcall is not handled by the kernel.
+
+The 'nr' field contains the hypercall number (from the guest R11),
+and 'args' contains the arguments (from the guest R3 - R10).
+Userspace should put the return code in 'ret' and any extra returned
+values in args[]. If the VM is not in 64-bit mode KVM zeros the
+upper half of each field in the struct.
+
+As per the ePAPR hcall ABI, the return value is returned to the guest
+in R3 and output return values in R4 - R10.
+
 		/* KVM_EXIT_S390_TSCH */
 		struct {
 			__u16 subchannel_id;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index acccd08..01ee50e 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -171,6 +171,7 @@ struct kvm_pit_config {
 #define KVM_EXIT_WATCHDOG         21
 #define KVM_EXIT_S390_TSCH        22
 #define KVM_EXIT_EPR              23
+#define KVM_EXIT_EPAPR_HCALL      24

 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -288,6 +289,12 @@ struct kvm_run {
 			__u64 ret;
 			__u64 args[9];
 		} papr_hcall;
+		/* KVM_EXIT_EPAPR_HCALL */
+		struct {
+			__u64 nr;
+			__u64 ret;
+			__u64 args[8];
+		} epapr_hcall;

This should be at the end of the union.

Ok.

-Bharat

Alex

 		/* KVM_EXIT_S390_TSCH */
 		struct {
 			__u16 subchannel_id;
--
1.7.0.4
RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Monday, July 15, 2013 5:02 PM
To: Bhushan Bharat-R65777
Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm

On 15.07.2013, at 13:11, Bharat Bhushan wrote:

Exit to guest user space if kvm does not implement the hcall.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/kvm/booke.c   | 47 +-
 arch/powerpc/kvm/powerpc.c |  1 +
 include/uapi/linux/kvm.h   |  1 +
 3 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 17722d8..c8b41b4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		break;

 #ifdef CONFIG_KVM_BOOKE_HV
-	case BOOKE_INTERRUPT_HV_SYSCALL:
+	case BOOKE_INTERRUPT_HV_SYSCALL: {

This is getting large. Please extract hcall handling into its own function. Maybe you can merge the HV and non-HV case then too.

+		int i;
 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
-			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
+			r = kvmppc_kvm_pv(vcpu);
+			if (r != EV_UNIMPLEMENTED) {
+				/* except unimplemented return to guest */
+				kvmppc_set_gpr(vcpu, 3, r);
+				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
+				r = RESUME_GUEST;
+				break;
+			}
+			/* Exit to userspace for unimplemented hcalls in kvm */
+			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
+			run->epapr_hcall.ret = 0;
+			for (i = 0; i < 8; i++)
+				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
+			vcpu->arch.hcall_needed = 1;
+			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
+			r = RESUME_HOST;
 		} else {
 			/*
 			 * hcall from guest userspace -- send privileged
@@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 			kvmppc_core_queue_program(vcpu, ESR_PPR);
 		}

-		r = RESUME_GUEST;
+		run->exit_reason = KVM_EXIT_EPAPR_HCALL;

Oops, my mistake. I wanted this to be kvmppc_account_exit(vcpu, SYSCALL_EXITS);

s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/ kvmppc_account_exit(vcpu, SYSCALL_EXITS);

-Bharat

This looks odd. Your exit reason only changes when you do the hcall exiting, right?

You also need to guard user space hcall exits with an ENABLE_CAP. Otherwise older user space will break, as it doesn't know about the exit type yet.

So user space has to call enable_cap as well?

-Bharat

Alex

 		break;
+	}
 #else
-	case BOOKE_INTERRUPT_SYSCALL:
+	case BOOKE_INTERRUPT_SYSCALL: {
+		int i;
+		r = RESUME_GUEST;
 		if (!(vcpu->arch.shared->msr & MSR_PR) &&
 		    (((u32)kvmppc_get_gpr(vcpu, 0)) == KVM_SC_MAGIC_R0)) {
 			/* KVM PV hypercalls */
-			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
-			r = RESUME_GUEST;
+			r = kvmppc_kvm_pv(vcpu);
+			if (r != EV_UNIMPLEMENTED) {
+				/* except unimplemented return to guest */
+				kvmppc_set_gpr(vcpu, 3, r);
+				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
+				r = RESUME_GUEST;
+				break;
+			}
+			/* Exit to userspace for unimplemented hcalls in kvm */
+			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
+			run->epapr_hcall.ret = 0;
+			for (i = 0; i < 8; i++)
+				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
+			vcpu->arch.hcall_needed = 1;
+			run->exit_reason = KVM_EXIT_EPAPR_HCALL;
+			r = RESUME_HOST;
 		} else {
 			/* Guest syscalls */
 			kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_SYSCALL);
 		}
 		kvmppc_account_exit(vcpu, SYSCALL_EXITS);
-		r = RESUME_GUEST;
 		break;
+	}
 #endif
 	case BOOKE_INTERRUPT_DTLB_MISS: {
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 4e05f8c..6c6199d 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch
RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 5:16 PM To: Bhushan Bharat-R65777 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 5:02 PM To: Bhushan Bharat-R65777 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm On 15.07.2013, at 13:11, Bharat Bhushan wrote: Exit to guest user space if kvm does not implement the hcall. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/booke.c | 47 +- - arch/powerpc/kvm/powerpc.c |1 + include/uapi/linux/kvm.h |1 + 3 files changed, 42 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 17722d8..c8b41b4 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, break; #ifdef CONFIG_KVM_BOOKE_HV - case BOOKE_INTERRUPT_HV_SYSCALL: + case BOOKE_INTERRUPT_HV_SYSCALL: { This is getting large. Please extract hcall handling into its own function. Maybe you can merge the HV and non-HV case then too. 
+ int i; if (!(vcpu-arch.shared-msr MSR_PR)) { - kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu)); + r = kvmppc_kvm_pv(vcpu); + if (r != EV_UNIMPLEMENTED) { + /* except unimplemented return to guest */ + kvmppc_set_gpr(vcpu, 3, r); + kvmppc_account_exit(vcpu, SYSCALL_EXITS); + r = RESUME_GUEST; + break; + } + /* Exit to userspace for unimplemented hcalls in kvm */ + run-epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11); + run-epapr_hcall.ret = 0; + for (i = 0; i 8; i++) + run-epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i); + vcpu-arch.hcall_needed = 1; + kvmppc_account_exit(vcpu, SYSCALL_EXITS); + r = RESUME_HOST; } else { /* * hcall from guest userspace -- send privileged @@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct +kvm_vcpu *vcpu, kvmppc_core_queue_program(vcpu, ESR_PPR); } - r = RESUME_GUEST; + run-exit_reason = KVM_EXIT_EPAPR_HCALL; Oops, what I have done, I wanted this to be kvmppc_account_exit(vcpu, SYSCALL_EXITS); s/ run-exit_reason = KVM_EXIT_EPAPR_HCALL;/ kvmppc_account_exit(vcpu, SYSCALL_EXITS); -Bharat This looks odd. Your exit reason only changes when you do the hcall exiting, right? You also need to guard user space hcall exits with an ENABLE_CAP. Otherwise older user space will break, as it doesn't know about the exit type yet. So the user space so make enable_cap also? User space needs to call enable_cap on this cap, yes. Otherwise a guest can confuse user space with an hcall exit it can't handle. We do not have enable_cap for book3s, any specific reason why ? -Bharat Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 5:20 PM To: Bhushan Bharat-R65777 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777 Subject: Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset On 15.07.2013, at 13:11, Bharat Bhushan wrote: Detect the availability of the reset hcalls by looking at kvm,has-reset property on the /hypervisor node in the device tree passed to the VM and patches the reset mechanism to use reset hcall. This patch uses the reser hcall when kvm,has-reset is there in Your patch description is pretty broken :). Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kernel/epapr_paravirt.c | 12 1 files changed, 12 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/kernel/epapr_paravirt.c b/arch/powerpc/kernel/epapr_paravirt.c index d44a571..651d701 100644 --- a/arch/powerpc/kernel/epapr_paravirt.c +++ b/arch/powerpc/kernel/epapr_paravirt.c @@ -22,6 +22,8 @@ #include asm/cacheflush.h #include asm/code-patching.h #include asm/machdep.h +#include asm/kvm_para.h +#include asm/kvm_host.h Why would we need kvm_host.h? This is guest code. #if !defined(CONFIG_64BIT) || defined(CONFIG_PPC_BOOK3E_64) extern void epapr_ev_idle(void); @@ -30,6 +32,14 @@ extern u32 epapr_ev_idle_start[]; bool epapr_paravirt_enabled; +void epapr_hypercall_reset(char *cmd) { + long ret; + ret = kvm_hypercall0(KVM_HC_VM_RESET); Is this available without CONFIG_KVM_GUEST? kvm_hypercall() simply returns unimplemented for everything when that config option is not set. We are here because we patched the ppc_md.restart to point to new handler. So I think we should patch the ppc_md.restart only if CONFIG_KVM_GUEST is true. + printk(error: system reset returned with error %ld\n, ret); So we should fall back to the normal reset handler here. Do you mean return normally from here, no BUG() etc? 
-Bharat Alex + BUG(); +} + static int __init epapr_paravirt_init(void) { struct device_node *hyper_node; @@ -58,6 +68,8 @@ static int __init epapr_paravirt_init(void) if (of_get_property(hyper_node, has-idle, NULL)) ppc_md.power_save = epapr_ev_idle; #endif + if (of_get_property(hyper_node, kvm,has-reset, NULL)) + ppc_md.restart = epapr_hypercall_reset; epapr_paravirt_enabled = true; -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 8:27 PM To: Bhushan Bharat-R65777 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm On 15.07.2013, at 16:50, Bhushan Bharat-R65777 wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 5:16 PM To: Bhushan Bharat-R65777 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 5:02 PM To: Bhushan Bharat-R65777 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm On 15.07.2013, at 13:11, Bharat Bhushan wrote: Exit to guest user space if kvm does not implement the hcall. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/booke.c | 47 +--- -- - arch/powerpc/kvm/powerpc.c |1 + include/uapi/linux/kvm.h |1 + 3 files changed, 42 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 17722d8..c8b41b4 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, break; #ifdef CONFIG_KVM_BOOKE_HV - case BOOKE_INTERRUPT_HV_SYSCALL: + case BOOKE_INTERRUPT_HV_SYSCALL: { This is getting large. Please extract hcall handling into its own function. Maybe you can merge the HV and non-HV case then too. 
+ int i; if (!(vcpu-arch.shared-msr MSR_PR)) { - kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu)); + r = kvmppc_kvm_pv(vcpu); + if (r != EV_UNIMPLEMENTED) { + /* except unimplemented return to guest */ + kvmppc_set_gpr(vcpu, 3, r); + kvmppc_account_exit(vcpu, SYSCALL_EXITS); + r = RESUME_GUEST; + break; + } + /* Exit to userspace for unimplemented hcalls in kvm */ + run-epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11); + run-epapr_hcall.ret = 0; + for (i = 0; i 8; i++) + run-epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i); + vcpu-arch.hcall_needed = 1; + kvmppc_account_exit(vcpu, SYSCALL_EXITS); + r = RESUME_HOST; } else { /* * hcall from guest userspace -- send privileged @@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct +kvm_vcpu *vcpu, kvmppc_core_queue_program(vcpu, ESR_PPR); } - r = RESUME_GUEST; + run-exit_reason = KVM_EXIT_EPAPR_HCALL; Oops, what I have done, I wanted this to be kvmppc_account_exit(vcpu, SYSCALL_EXITS); s/ run-exit_reason = KVM_EXIT_EPAPR_HCALL;/ kvmppc_account_exit(vcpu, SYSCALL_EXITS); -Bharat This looks odd. Your exit reason only changes when you do the hcall exiting, right? You also need to guard user space hcall exits with an ENABLE_CAP. Otherwise older user space will break, as it doesn't know about the exit type yet. So the user space so make enable_cap also? User space needs to call enable_cap on this cap, yes. Otherwise a guest can confuse user space with an hcall exit it can't handle. We do not have enable_cap for book3s, any specific reason why ? We do. If you enable PAPR, you get PAPR hcalls. If you enable OSI, you get OSI hcalls. Oh, We check this on book3s_PR and book3s_HV. KVM hcalls on book3s don't return to user space. It exits, is not it? arch/powerpc/kvm/book3s_pr.c exits with KVM_EXIT_PAPR_HCALL. And same in book3s_pv. Btw, Adding this on booke is not a question. I am just understanding book3s. -Bharat Which is something we probably want to change along with this patch set. 
Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
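The control flow discussed in this message (handle the hcall in the kernel when possible, exit to user space only for unimplemented hcalls, and only when user space has opted in through ENABLE_CAP) can be sketched as a small user-space model. This is not kernel code: `model_vcpu`, `model_run`, `model_kvm_pv()`, `model_handle_hcall()` and the `hcall_exit_enabled` flag are hypothetical names, and the constant values are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins, not the kernel's definitions. */
enum { RESUME_GUEST = 0, RESUME_HOST = 1 };
#define EV_SUCCESS        0u
#define EV_UNIMPLEMENTED  12u
#define EPAPR_HCALL_NARGS 8

static_assert(EPAPR_HCALL_NARGS == 8, "the patch in this thread copies 8 arguments");

struct model_vcpu {
    uint64_t gpr[32];
    bool guest_user_mode;     /* models MSR_PR being set */
    bool hcall_exit_enabled;  /* models a capability enabled via ENABLE_CAP */
    bool program_irq_queued;  /* models kvmppc_core_queue_program() */
};

struct model_run {
    bool hcall_exit;          /* models exit_reason == KVM_EXIT_EPAPR_HCALL */
    uint64_t nr;
    uint64_t ret;
    uint64_t args[EPAPR_HCALL_NARGS];
};

/* Stand-in for kvmppc_kvm_pv(): pretend only hcall number 1 is implemented. */
static uint64_t model_kvm_pv(const struct model_vcpu *vcpu)
{
    return vcpu->gpr[11] == 1 ? EV_SUCCESS : EV_UNIMPLEMENTED;
}

/* Hcall handling pulled out of the exit switch into its own function. */
int model_handle_hcall(struct model_run *run, struct model_vcpu *vcpu)
{
    if (vcpu->guest_user_mode) {
        /* hcall from guest userspace: queue a privileged-instruction fault */
        vcpu->program_irq_queued = true;
        return RESUME_GUEST;
    }

    uint64_t r = model_kvm_pv(vcpu);
    if (r != EV_UNIMPLEMENTED || !vcpu->hcall_exit_enabled) {
        /* Handled in kernel, or user space never opted in: back to guest. */
        vcpu->gpr[3] = r;
        return RESUME_GUEST;
    }

    /* Unimplemented in kernel and user space opted in: hand it over. */
    run->hcall_exit = true;
    run->nr = vcpu->gpr[11];
    run->ret = 0;
    for (int i = 0; i < EPAPR_HCALL_NARGS; i++)
        run->args[i] = vcpu->gpr[3 + i];
    return RESUME_HOST;
}
```

With the flag clear, an unimplemented hcall simply returns EV_UNIMPLEMENTED to the guest, which is the behaviour older user space already expects.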
RE: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 8:40 PM To: Bhushan Bharat-R65777 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248 Subject: Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset On 15.07.2013, at 17:05, Bhushan Bharat-R65777 wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 5:20 PM To: Bhushan Bharat-R65777 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777 Subject: Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset On 15.07.2013, at 13:11, Bharat Bhushan wrote:

Detect the availability of the reset hcalls by looking at kvm,has-reset property on the /hypervisor node in the device tree passed to the VM and patches the reset mechanism to use reset hcall. This patch uses the reser hcall when kvm,has-reset is there in

Your patch description is pretty broken :).

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/kernel/epapr_paravirt.c | 12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/epapr_paravirt.c b/arch/powerpc/kernel/epapr_paravirt.c
index d44a571..651d701 100644
--- a/arch/powerpc/kernel/epapr_paravirt.c
+++ b/arch/powerpc/kernel/epapr_paravirt.c
@@ -22,6 +22,8 @@
 #include <asm/cacheflush.h>
 #include <asm/code-patching.h>
 #include <asm/machdep.h>
+#include <asm/kvm_para.h>
+#include <asm/kvm_host.h>

Why would we need kvm_host.h? This is guest code.

 #if !defined(CONFIG_64BIT) || defined(CONFIG_PPC_BOOK3E_64)
 extern void epapr_ev_idle(void);
@@ -30,6 +32,14 @@ extern u32 epapr_ev_idle_start[];
 bool epapr_paravirt_enabled;
+void epapr_hypercall_reset(char *cmd)
+{
+    long ret;
+    ret = kvm_hypercall0(KVM_HC_VM_RESET);

Is this available without CONFIG_KVM_GUEST?

kvm_hypercall() simply returns unimplemented for everything when that config option is not set.

We are here because we patched the ppc_md.restart to point to new handler. So I think we should patch the ppc_md.restart only if CONFIG_KVM_GUEST is true.

We should only patch it if kvm_para_available(). That should guard us against everything.

+    printk("error: system reset returned with error %ld\n", ret);

So we should fall back to the normal reset handler here.

Do you mean return normally from here, no BUG() etc?

If we guard the patching against everything, we can treat a broken hcall as BUG. However, if we don't we want to fall back to the normal guts based reset.

Will let Scott comment on this? But ppc_md.restart can point to only one handler and during paravirt patching we changed this to new handler. So we cannot jump back to guts type handler

-Bharat
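One way to reconcile the two positions above (patch the restart hook only when paravirt is available, yet still be able to fall back if the hcall fails) is to remember the previous handler at patch time. The following is only a user-space sketch of that idea, not something proposed in the thread: `md_restart`, `saved_restart`, `para_available`, `reset_hcall_result`, `platform_restart()`, `hypercall_restart()` and `patch_restart()` are all hypothetical stand-ins for `ppc_md.restart`, `kvm_para_available()` and `kvm_hypercall0(KVM_HC_VM_RESET)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*restart_fn)(char *cmd);

restart_fn md_restart;            /* models ppc_md.restart */
static restart_fn saved_restart;  /* handler that was installed before patching */
bool para_available;              /* models kvm_para_available() */
long reset_hcall_result;          /* models the value the reset hcall returns */
int platform_restart_calls;       /* counts calls to the platform handler */

/* Stand-in for the platform (guts based) reset handler. */
void platform_restart(char *cmd)
{
    (void)cmd;
    platform_restart_calls++;
}

/* Paravirt handler: try the hcall, fall back to the saved handler on error. */
static void hypercall_restart(char *cmd)
{
    long ret = reset_hcall_result;  /* a real reset hcall would not return on success */

    if (ret != 0 && saved_restart != NULL)
        saved_restart(cmd);
}

/* Patch the hook only when the hypervisor interface is usable. */
bool patch_restart(bool has_reset_property)
{
    if (!para_available || !has_reset_property)
        return false;

    /* Patching twice would make the fallback point at ourselves. */
    assert(md_restart != hypercall_restart);
    saved_restart = md_restart;
    md_restart = hypercall_restart;
    return true;
}
```

Whether a fallback is wanted at all is exactly the open question in the thread; if the patching is fully guarded, treating a failed hcall as a bug is the simpler choice.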
RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 8:59 PM To: Bhushan Bharat-R65777 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm On 15.07.2013, at 17:13, Bhushan Bharat-R65777 wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 8:27 PM To: Bhushan Bharat-R65777 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm On 15.07.2013, at 16:50, Bhushan Bharat-R65777 wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 5:16 PM To: Bhushan Bharat-R65777 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, July 15, 2013 5:02 PM To: Bhushan Bharat-R65777 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm On 15.07.2013, at 13:11, Bharat Bhushan wrote: Exit to guest user space if kvm does not implement the hcall. 
Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/kvm/booke.c   | 47
 arch/powerpc/kvm/powerpc.c |  1 +
 include/uapi/linux/kvm.h   |  1 +
 3 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 17722d8..c8b41b4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
         break;
 #ifdef CONFIG_KVM_BOOKE_HV
-    case BOOKE_INTERRUPT_HV_SYSCALL:
+    case BOOKE_INTERRUPT_HV_SYSCALL: {

This is getting large. Please extract hcall handling into its own function. Maybe you can merge the HV and non-HV case then too.

+        int i;
         if (!(vcpu->arch.shared->msr & MSR_PR)) {
-            kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
+            r = kvmppc_kvm_pv(vcpu);
+            if (r != EV_UNIMPLEMENTED) {
+                /* except unimplemented return to guest */
+                kvmppc_set_gpr(vcpu, 3, r);
+                kvmppc_account_exit(vcpu, SYSCALL_EXITS);
+                r = RESUME_GUEST;
+                break;
+            }
+            /* Exit to userspace for unimplemented hcalls in kvm */
+            run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
+            run->epapr_hcall.ret = 0;
+            for (i = 0; i < 8; i++)
+                run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
+            vcpu->arch.hcall_needed = 1;
+            kvmppc_account_exit(vcpu, SYSCALL_EXITS);
+            r = RESUME_HOST;
         } else {
             /*
              * hcall from guest userspace -- send privileged
@@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
             kvmppc_core_queue_program(vcpu, ESR_PPR);
         }
-        r = RESUME_GUEST;
+        run->exit_reason = KVM_EXIT_EPAPR_HCALL;

Oops, what I have done, I wanted this to be kvmppc_account_exit(vcpu, SYSCALL_EXITS);

s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/ kvmppc_account_exit(vcpu, SYSCALL_EXITS);

-Bharat

This looks odd. Your exit reason only changes when you do the hcall exiting, right? You also need to guard user space hcall exits with an ENABLE_CAP. Otherwise older user space will break, as it doesn't know about the exit type yet.

So the user space so make enable_cap also?

User space needs to call enable_cap on this cap, yes. Otherwise a guest can confuse user space with an hcall exit it can't handle.

We do not have enable_cap for book3s, any specific reason why ?

We do. If you enable PAPR, you get PAPR hcalls. If you enable OSI, you get OSI hcalls.

Oh, We check this on book3s_PR and book3s_HV.

KVM hcalls on book3s don't return to user space.

It exits, is not it? arch/powerpc/kvm/book3s_pr.c exits with KVM_EXIT_PAPR_HCALL. And same in book3s_pv.

It doesn't even start
RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
-Original Message- From: Wood Scott-B07421 Sent: Monday, July 15, 2013 11:38 PM To: Bhushan Bharat-R65777 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; ag...@suse.de; Yoder Stuart- B08248; Bhushan Bharat-R65777; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm On 07/15/2013 06:11:16 AM, Bharat Bhushan wrote:

Exit to guest user space if kvm does not implement the hcall.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/kvm/booke.c   | 47
 arch/powerpc/kvm/powerpc.c |  1 +
 include/uapi/linux/kvm.h   |  1 +
 3 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 17722d8..c8b41b4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
         break;
 #ifdef CONFIG_KVM_BOOKE_HV
-    case BOOKE_INTERRUPT_HV_SYSCALL:
+    case BOOKE_INTERRUPT_HV_SYSCALL: {
+        int i;
         if (!(vcpu->arch.shared->msr & MSR_PR)) {
-            kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
+            r = kvmppc_kvm_pv(vcpu);
+            if (r != EV_UNIMPLEMENTED) {
+                /* except unimplemented return to guest */
+                kvmppc_set_gpr(vcpu, 3, r);
+                kvmppc_account_exit(vcpu, SYSCALL_EXITS);
+                r = RESUME_GUEST;
+                break;
+            }
+            /* Exit to userspace for unimplemented hcalls in kvm */
+            run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
+            run->epapr_hcall.ret = 0;
+            for (i = 0; i < 8; i++)
+                run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);

You need to clear the upper half of each register if CONFIG_PPC64=y and MSR_CM is not set.

+            vcpu->arch.hcall_needed = 1;

The existing code for hcall_needed restores 9 return arguments, rather than the 8 that are defined for this interface. Thus, you'll be restoring one word of padding into the guest -- which could be arbitrary userspace data that shouldn't be leaked. r12 is volatile in the ePAPR hcall ABI so simply clobbering it isn't a problem, though.

Oops; Not just that, currently this uses struct type papr_hcall while on booke we should use epapr_hcall. I will make a function which will be defined in book3s.c and booke.c to setup hcall return registers accordingly.

-Bharat

-Scott
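Scott's two points (truncate argument registers when the guest is not in 64-bit mode, and restore exactly 8 return values so that no padding reaches the guest) can be illustrated with a small user-space model. The names below (`model_vcpu`, `is_64bit_mode`, `model_save_hcall_args()`, `model_restore_hcall_rets()`) are hypothetical, and the choice of r3 for the return code and r4..r11 for the returned values is an assumption made for the sketch, not something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EPAPR_HCALL_NARGS 8

static_assert(EPAPR_HCALL_NARGS == 8, "this interface defines 8 arguments, not 9");

struct model_vcpu {
    uint64_t gpr[32];
    bool is_64bit_mode;  /* models MSR_CM being set on a CONFIG_PPC64 host */
};

/* Read a GPR as the guest sees it: only the low 32 bits in 32-bit mode. */
static uint64_t model_read_hcall_reg(const struct model_vcpu *vcpu, int n)
{
    uint64_t val = vcpu->gpr[n];

    if (!vcpu->is_64bit_mode)
        val &= 0xffffffffull;  /* clear the upper half */
    return val;
}

/* Copy the 8 hcall arguments (r3..r10) out of the vcpu. */
void model_save_hcall_args(const struct model_vcpu *vcpu,
                           uint64_t args[EPAPR_HCALL_NARGS])
{
    for (int i = 0; i < EPAPR_HCALL_NARGS; i++)
        args[i] = model_read_hcall_reg(vcpu, 3 + i);
}

/*
 * Write the results back: return code in r3, then exactly 8 values in
 * r4..r11 (assumed mapping). r12 is left untouched, so nothing beyond
 * the defined fields is copied into the guest.
 */
void model_restore_hcall_rets(struct model_vcpu *vcpu, uint64_t ret,
                              const uint64_t rets[EPAPR_HCALL_NARGS])
{
    vcpu->gpr[3] = ret;
    for (int i = 0; i < EPAPR_HCALL_NARGS; i++)
        vcpu->gpr[4 + i] = rets[i];
}
```

Keeping the restore loop bounded by the same constant as the save loop is what avoids the off-by-one that a shared PAPR-style restore path would introduce.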
RE: [PATCH 3/6 v5] powerpc: export debug register save function for KVM
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, June 24, 2013 3:03 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; tiejun.c...@windriver.com; Bhushan Bharat-R65777 Subject: Re: [PATCH 3/6 v5] powerpc: export debug register save function for KVM On 24.06.2013, at 11:08, Bharat Bhushan wrote:

KVM need this function when switching from vcpu to user-space thread. My subsequent patch will use this function.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/include/asm/switch_to.h | 4 ++++
 arch/powerpc/kernel/process.c        | 3 ++-
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h
index 200d763..50b357f 100644
--- a/arch/powerpc/include/asm/switch_to.h
+++ b/arch/powerpc/include/asm/switch_to.h
@@ -30,6 +30,10 @@ extern void enable_kernel_spe(void);
 extern void giveup_spe(struct task_struct *);
 extern void load_up_spe(struct task_struct *);
+#ifdef CONFIG_PPC_ADV_DEBUG_REGS
+extern void switch_booke_debug_regs(struct thread_struct *new_thread);
+#endif
+
 #ifndef CONFIG_SMP
 extern void discard_lazy_cpu_state(void);
 #else
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 01ff496..3375cb7 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -362,12 +362,13 @@ static void prime_debug_regs(struct thread_struct *thread)
  * debug registers, set the debug registers from the values
  * stored in the new thread.
  */
-static void switch_booke_debug_regs(struct thread_struct *new_thread)
+void switch_booke_debug_regs(struct thread_struct *new_thread)
 {
     if ((current->thread.debug.dbcr0 & DBCR0_IDM)
         || (new_thread->debug.dbcr0 & DBCR0_IDM))
             prime_debug_regs(new_thread);
 }
+EXPORT_SYMBOL(switch_booke_debug_regs);

EXPORT_SYMBOL_GPL?

Oops, I missed this comment. Will correct in next version.

-Bharat

Alex

 #else /* !CONFIG_PPC_ADV_DEBUG_REGS */
 #ifndef CONFIG_HAVE_HW_BREAKPOINT
 static void set_debug_reg_defaults(struct thread_struct *thread)
--
1.7.0.4
RE: [PATCH 6/6 v5] KVM: PPC: Add userspace debug stub support
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, June 24, 2013 4:13 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; tiejun.c...@windriver.com; Bhushan Bharat-R65777 Subject: Re: [PATCH 6/6 v5] KVM: PPC: Add userspace debug stub support On 24.06.2013, at 11:08, Bharat Bhushan wrote:

This patch adds the debug stub support on booke/bookehv. Now QEMU debug stub can use hw breakpoint, watchpoint and software breakpoint to debug guest.

This is how we save/restore debug register context when switching between guest, userspace and kernel user-process:

When QEMU is running
 -> thread->debug_reg == QEMU debug register context.
 -> Kernel will handle switching the debug register on context switch.
 -> no vcpu_load() called

QEMU makes ioctls (except RUN)
 -> This will call vcpu_load()
 -> should not change context.
 -> Some ioctls can change vcpu debug register, context saved in vcpu->debug_regs

QEMU Makes RUN ioctl
 -> Save thread->debug_reg on STACK
 -> Store thread->debug_reg == vcpu->debug_reg
 -> load thread->debug_reg
 -> RUN VCPU ( So thread points to vcpu context )

Context switch happens When VCPU running
 -> makes vcpu_load() should not load any context
 -> kernel loads the vcpu context as thread->debug_regs points to vcpu context.

On heavyweight_exit
 -> Load the context saved on stack in thread->debug_reg

Currently we do not support debug resource emulation to guest, On debug exception, always exit to user space irrespective of user space is expecting the debug exception or not. If this is unexpected exception (breakpoint/watchpoint event not set by userspace) then let us leave the action on user space. This is similar to what it was before, only thing is that now we have proper exit state available to user space.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/include/asm/kvm_host.h |   3 +
 arch/powerpc/include/uapi/asm/kvm.h |   1 +
 arch/powerpc/kvm/booke.c            | 233
 arch/powerpc/kvm/booke.h            |   5 +
 4 files changed, 224 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 838a577..aeb490d 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -524,7 +524,10 @@ struct kvm_vcpu_arch {
     u32 eptcfg;
     u32 epr;
     u32 crit_save;
+    /* guest debug registers*/
     struct debug_reg dbg_reg;
+    /* hardware visible debug registers when in guest state */
+    struct debug_reg shadow_dbg_reg;
 #endif
     gpa_t paddr_accessed;
     gva_t vaddr_accessed;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index ded0607..f5077c2 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -27,6 +27,7 @@
 #define __KVM_HAVE_PPC_SMT
 #define __KVM_HAVE_IRQCHIP
 #define __KVM_HAVE_IRQ_LINE
+#define __KVM_HAVE_GUEST_DEBUG
 struct kvm_regs {
     __u64 pc;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 3e9fc1d..8be3502 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -133,6 +133,29 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
 #endif
 }
+static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
+{
+    /* Synchronize guest's desire to get debug interrupts into shadow MSR */
+#ifndef CONFIG_KVM_BOOKE_HV
+    vcpu->arch.shadow_msr &= ~MSR_DE;
+    vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
+#endif
+
+    /* Force enable debug interrupts when user space wants to debug */
+    if (vcpu->guest_debug) {
+#ifdef CONFIG_KVM_BOOKE_HV
+        /*
+         * Since there is no shadow MSR, sync MSR_DE into the guest
+         * visible MSR.
+         */
+        vcpu->arch.shared->msr |= MSR_DE;
+#else
+        vcpu->arch.shadow_msr |= MSR_DE;
+        vcpu->arch.shared->msr &= ~MSR_DE;
+#endif
+    }
+}
+
 /*
  * Helper function for full MSR writes. No need to call this if only
  * EE/CE/ME/DE/RI are changing.
@@ -150,6 +173,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
     kvmppc_mmu_msr_notify(vcpu, old_msr);
     kvmppc_vcpu_sync_spe(vcpu);
     kvmppc_vcpu_sync_fpu(vcpu);
+    kvmppc_vcpu_sync_debug(vcpu);
 }
 static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu,
@@ -655,6 +679,7 @@ int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
 int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 {
     int ret, s;
+    struct thread_struct thread;
 #ifdef CONFIG_PPC_FPU
     unsigned int fpscr;
     int fpexc_mode;
@@ -698,12 +723,21 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
     kvmppc_load_guest_fp(vcpu);
 #endif
+    /* Switch
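The MSR_DE handling in `kvmppc_vcpu_sync_debug()` for the non-HV build (the branch that has a separate shadow MSR) can be exercised with a small user-space model. This is a sketch under stated assumptions: `model_vcpu` and `model_sync_debug()` are hypothetical names, only the `#ifndef CONFIG_KVM_BOOKE_HV` branch of the quoted function is modelled, and the `MSR_DE` value is illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_DE 0x200u  /* illustrative value for the debug-enable bit */

static_assert((MSR_DE & (MSR_DE - 1)) == 0, "MSR_DE must be a single bit");

struct model_vcpu {
    uint32_t shadow_msr;  /* MSR the hardware runs the guest with */
    uint32_t shared_msr;  /* MSR the guest believes it has */
    bool guest_debug;     /* user space asked to debug the guest */
};

/* Non-HV variant: mirror the guest's MSR_DE, then let user space override. */
void model_sync_debug(struct model_vcpu *vcpu)
{
    /* Synchronize the guest's desire to get debug interrupts. */
    vcpu->shadow_msr &= ~MSR_DE;
    vcpu->shadow_msr |= vcpu->shared_msr & MSR_DE;

    /* Force-enable debug interrupts when user space wants to debug. */
    if (vcpu->guest_debug) {
        vcpu->shadow_msr |= MSR_DE;
        vcpu->shared_msr &= ~MSR_DE;
    }
}
```

The compound assignments matter here: a plain `=` in place of `&=` would wipe every other MSR bit rather than clearing only MSR_DE.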
RE: [PATCH] KVM: PPC: Add userspace debug stub support
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Friday, May 10, 2013 11:14 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; tiejun.c...@windriver.com Subject: Re: [PATCH] KVM: PPC: Add userspace debug stub support On 10.05.2013, at 19:31, Bhushan Bharat-R65777 wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Friday, May 10, 2013 3:48 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; tiejun.c...@windriver.com; Bhushan Bharat-R65777 Subject: Re: [PATCH] KVM: PPC: Add userspace debug stub support On 07.05.2013, at 11:40, Bharat Bhushan wrote:

This patch adds the debug stub support on booke/bookehv. Now QEMU debug stub can use hw breakpoint, watchpoint and software breakpoint to debug guest.

This is how we save/restore debug register context when switching between guest, userspace and kernel user-process:

When QEMU is running
 -> thread->debug_reg == QEMU debug register context.
 -> Kernel will handle switching the debug register on context switch.
 -> no vcpu_load() called

QEMU makes ioctls (except RUN)
 -> This will call vcpu_load()
 -> should not change context.
 -> Some ioctls can change vcpu debug register, context saved in vcpu->debug_regs

QEMU Makes RUN ioctl
 -> Save thread->debug_reg on STACK
 -> Store thread->debug_reg == vcpu->debug_reg
 -> load thread->debug_reg
 -> RUN VCPU ( So thread points to vcpu context )

Context switch happens When VCPU running
 -> makes vcpu_load() should not load any context
 -> kernel loads the vcpu context as thread->debug_regs points to vcpu context.

On heavyweight_exit
 -> Load the context saved on stack in thread->debug_reg

Currently we do not support debug resource emulation to guest, On debug exception, always exit to user space irrespective of user space is expecting the debug exception or not. If this is unexpected exception (breakpoint/watchpoint event not set by userspace) then let us leave the action on user space. This is similar to what it was before, only thing is that now we have proper exit state available to user space.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/include/asm/kvm_host.h |   3 +
 arch/powerpc/include/uapi/asm/kvm.h |   1 +
 arch/powerpc/kvm/booke.c            | 242
 arch/powerpc/kvm/booke.h            |   5 +
 4 files changed, 233 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 838a577..1b29945 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -524,7 +524,10 @@ struct kvm_vcpu_arch {
     u32 eptcfg;
     u32 epr;
     u32 crit_save;
+    /* guest debug registers*/
     struct debug_reg dbg_reg;
+    /* shadow debug registers */

Please be more verbose here. What exactly does this contain? Why do we need shadow and non-shadow registers? The comment as it is reads like

    /* Add one plus one */
    x = 1 + 1;

    /*
     * Shadow debug registers hold the debug register content
     * to be written in h/w debug register on behalf of guest
     * written value or user space written value.
     */

    /* hardware visible debug registers when in guest state */

+    struct debug_reg shadow_dbg_reg;
 #endif
     gpa_t paddr_accessed;
     gva_t vaddr_accessed;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index ded0607..f5077c2 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -27,6 +27,7 @@
 #define __KVM_HAVE_PPC_SMT
 #define __KVM_HAVE_IRQCHIP
 #define __KVM_HAVE_IRQ_LINE
+#define __KVM_HAVE_GUEST_DEBUG
 struct kvm_regs {
     __u64 pc;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index ef99536..6a44ad4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -133,6 +133,29 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
 #endif
 }
+static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
+{
+    /* Synchronize guest's desire to get debug interrupts into shadow MSR */
+#ifndef CONFIG_KVM_BOOKE_HV
+    vcpu->arch.shadow_msr &= ~MSR_DE;
+    vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
+#endif
+
+    /* Force enable debug interrupts when user space wants to debug */
+    if (vcpu->guest_debug) {
+#ifdef CONFIG_KVM_BOOKE_HV
+        /*
+         * Since there is no shadow MSR, sync MSR_DE into the guest
+         * visible MSR.
+         */
+        vcpu->arch.shared->msr |= MSR_DE;
+#else
+        vcpu->arch.shadow_msr |= MSR_DE;
+        vcpu->arch.shared->msr &= ~MSR_DE;
+#endif
+    }
+}
+
 /*
  * Helper function for full MSR writes. No need to call this if only
  * EE/CE/ME/DE/RI are changing
RE: [PATCH] KVM: PPC: Add userspace debug stub support
> -----Original Message-----
> From: Alexander Graf [mailto:ag...@suse.de]
> Sent: Friday, May 10, 2013 3:48 PM
> To: Bhushan Bharat-R65777
> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421;
> tiejun.c...@windriver.com; Bhushan Bharat-R65777
> Subject: Re: [PATCH] KVM: PPC: Add userspace debug stub support
>
> On 07.05.2013, at 11:40, Bharat Bhushan wrote:
>
> > This patch adds the debug stub support on booke/bookehv.
> > Now QEMU debug stub can use hw breakpoint, watchpoint and
> > software breakpoint to debug guest.
> >
> > This is how we save/restore debug register context when switching
> > between guest, userspace and kernel user-process:
> >
> > When QEMU is running
> >  -> thread->debug_reg == QEMU debug register context.
> >  -> Kernel will handle switching the debug register on context switch.
> >  -> no vcpu_load() called
> >
> > QEMU makes ioctls (except RUN)
> >  -> This will call vcpu_load()
> >  -> should not change context.
> >  -> Some ioctls can change vcpu debug register, context saved in
> >     vcpu->debug_regs
> >
> > QEMU Makes RUN ioctl
> >  -> Save thread->debug_reg on STACK
> >  -> Store thread->debug_reg == vcpu->debug_reg
> >  -> load thread->debug_reg
> >  -> RUN VCPU ( So thread points to vcpu context )
> >
> > Context switch happens When VCPU running
> >  -> makes vcpu_load() should not load any context
> >  -> kernel loads the vcpu context as thread->debug_regs points to
> >     vcpu context.
> >
> > On heavyweight_exit
> >  -> Load the context saved on stack in thread->debug_reg
> >
> > Currently we do not support debug resource emulation to guest,
> > On debug exception, always exit to user space irrespective of
> > user space is expecting the debug exception or not. If this is
> > unexpected exception (breakpoint/watchpoint event not set by
> > userspace) then let us leave the action on user space. This
> > is similar to what it was before, only thing is that now we
> > have proper exit state available to user space.
> >
> > Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
> > ---
> >  arch/powerpc/include/asm/kvm_host.h |    3 +
> >  arch/powerpc/include/uapi/asm/kvm.h |    1 +
> >  arch/powerpc/kvm/booke.c            |  242 ---
> >  arch/powerpc/kvm/booke.h            |    5 +
> >  4 files changed, 233 insertions(+), 18 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
> > index 838a577..1b29945 100644
> > --- a/arch/powerpc/include/asm/kvm_host.h
> > +++ b/arch/powerpc/include/asm/kvm_host.h
> > @@ -524,7 +524,10 @@ struct kvm_vcpu_arch {
> >  	u32 eptcfg;
> >  	u32 epr;
> >  	u32 crit_save;
> > +	/* guest debug registers*/
> >  	struct debug_reg dbg_reg;
> > +	/* shadow debug registers */
>
> Please be more verbose here. What exactly does this contain? Why do we need
> shadow and non-shadow registers? The comment as it is reads like
>
>   /* Add one plus one */
>   x = 1 + 1;

/*
 * Shadow debug registers hold the debug register content
 * to be written in h/w debug register on behalf of guest
 * written value or user space written value.
 */

> > +	struct debug_reg shadow_dbg_reg;
> >  #endif
> >  	gpa_t paddr_accessed;
> >  	gva_t vaddr_accessed;
> > diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
> > index ded0607..f5077c2 100644
> > --- a/arch/powerpc/include/uapi/asm/kvm.h
> > +++ b/arch/powerpc/include/uapi/asm/kvm.h
> > @@ -27,6 +27,7 @@
> >  #define __KVM_HAVE_PPC_SMT
> >  #define __KVM_HAVE_IRQCHIP
> >  #define __KVM_HAVE_IRQ_LINE
> > +#define __KVM_HAVE_GUEST_DEBUG
> >
> >  struct kvm_regs {
> >  	__u64 pc;
> > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> > index ef99536..6a44ad4 100644
> > --- a/arch/powerpc/kvm/booke.c
> > +++ b/arch/powerpc/kvm/booke.c
> > @@ -133,6 +133,29 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
> >  #endif
> >  }
> >
> > +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
> > +{
> > +	/* Synchronize guest's desire to get debug interrupts into shadow MSR */
> > +#ifndef CONFIG_KVM_BOOKE_HV
> > +	vcpu->arch.shadow_msr &= ~MSR_DE;
> > +	vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
> > +#endif
> > +
> > +	/* Force enable debug interrupts when user space wants to debug */
> > +	if (vcpu->guest_debug) {
> > +#ifdef CONFIG_KVM_BOOKE_HV
> > +		/*
> > +		 * Since there is no shadow MSR, sync MSR_DE into the guest
> > +		 * visible MSR.
> > +		 */
> > +		vcpu->arch.shared->msr |= MSR_DE;
> > +#else
> > +		vcpu->arch.shadow_msr |= MSR_DE;
> > +		vcpu->arch.shared->msr &= ~MSR_DE;
> > +#endif
> > +	}
> > +}
> > +
> >  /*
> >   * Helper function for full MSR writes. No need to call this if only
> >   * EE/CE/ME/DE/RI are changing.
> > @@ -150,6 +173,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
> >  	kvmppc_mmu_msr_notify(vcpu, old_msr);
> >  	kvmppc_vcpu_sync_spe(vcpu);
> >  	kvmppc_vcpu_sync_fpu(vcpu);
> > +	kvmppc_vcpu_sync_debug(vcpu);
> >  }
> >
> >  static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu,
> > @@ -655,6 +679,7 @@ int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
> >
> >  int kvmppc_vcpu_run(struct
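
The MSR_DE handling in kvmppc_vcpu_sync_debug() quoted above can be tried
outside the kernel. What follows is a minimal standalone sketch, not the
kernel code: struct toy_vcpu, TOY_MSR_DE and the booke_hv argument (standing
in for CONFIG_KVM_BOOKE_HV) are names invented for this illustration, and the
bit value is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit value, used only by this illustration. */
#define TOY_MSR_DE 0x200u

/* Hypothetical stand-in for the few vcpu fields the patch touches. */
struct toy_vcpu {
	uint32_t shared_msr;	/* MSR as the guest sees it */
	uint32_t shadow_msr;	/* MSR in effect while the guest runs (non-HV) */
	bool guest_debug;	/* user space asked to debug the guest */
};

/*
 * Mirrors the control flow of the quoted kvmppc_vcpu_sync_debug(), with the
 * compile-time CONFIG_KVM_BOOKE_HV choice turned into a run-time flag so
 * that both variants can be exercised from one binary.
 */
static void toy_sync_debug(struct toy_vcpu *v, bool booke_hv)
{
	if (!booke_hv) {
		/* Copy the guest's own MSR_DE wish into the shadow MSR. */
		v->shadow_msr &= ~TOY_MSR_DE;
		v->shadow_msr |= v->shared_msr & TOY_MSR_DE;
	}

	if (v->guest_debug) {
		if (booke_hv) {
			/* No shadow MSR: force DE in the guest-visible MSR. */
			v->shared_msr |= TOY_MSR_DE;
		} else {
			/* Force DE in hardware, hide it from the guest. */
			v->shadow_msr |= TOY_MSR_DE;
			v->shared_msr &= ~TOY_MSR_DE;
		}
	}
}
```

With guest_debug set, the non-HV variant leaves DE set only in the shadow
MSR, while the HV variant sets it in the guest-visible MSR, which matches the
two branches of the patch.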
RE: [PATCH v2 2/4] kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit()
> -----Original Message-----
> From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
> Behalf Of Scott Wood
> Sent: Friday, May 10, 2013 8:40 AM
> To: Alexander Graf; Benjamin Herrenschmidt
> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; linuxppc-...@lists.ozlabs.org;
> Wood Scott-B07421
> Subject: [PATCH v2 2/4] kvm/ppc/booke64: Fix lazy ee handling in
> kvmppc_handle_exit()
>
> EE is hard-disabled on entry to kvmppc_handle_exit(), so call
> hard_irq_disable() so that PACA_IRQ_HARD_DIS is set, and soft_enabled
> is unset.
>
> Without this, we get warnings such as arch/powerpc/kernel/time.c:300,
> and sometimes host kernel hangs.
>
> Signed-off-by: Scott Wood <scottw...@freescale.com>
> ---
>  arch/powerpc/kvm/booke.c |    5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 1020119..705fc5c 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -833,6 +833,11 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>  	int r = RESUME_HOST;
>  	int s;
>
> +#ifdef CONFIG_PPC64
> +	WARN_ON(local_paca->irq_happened != 0);
> +#endif
> +	hard_irq_disable();

It is not actually to hard disable as EE is already clear but to make it looks
like hard_disable to host. Right?
If so, should we write a comment here on why we are doing this?

-Bharat

> +
>  	/* update before a new last_exit_type is rewritten */
>  	kvmppc_update_timing_stats(vcpu);
>
> --
> 1.7.10.4
RE: [PATCH v2 4/4] kvm/ppc: IRQ disabling cleanup
> -----Original Message-----
> From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
> Behalf Of Scott Wood
> Sent: Friday, May 10, 2013 8:40 AM
> To: Alexander Graf; Benjamin Herrenschmidt
> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; linuxppc-...@lists.ozlabs.org;
> Wood Scott-B07421
> Subject: [PATCH v2 4/4] kvm/ppc: IRQ disabling cleanup
>
> Simplify the handling of lazy EE by going directly from fully-enabled
> to hard-disabled. This replaces the lazy_irq_pending() check
> (including its misplaced kvm_guest_exit() call).
>
> As suggested by Tiejun Chen, move the interrupt disabling into
> kvmppc_prepare_to_enter() rather than have each caller do it. Also
> move the IRQ enabling on heavyweight exit into
> kvmppc_prepare_to_enter().
>
> Don't move kvmppc_fix_ee_before_entry() into kvmppc_prepare_to_enter(),
> so that the caller can avoid marking interrupts enabled earlier than
> necessary (e.g. book3s_pr waits until after FP save/restore is done).
>
> Signed-off-by: Scott Wood <scottw...@freescale.com>
> ---
>  arch/powerpc/include/asm/kvm_ppc.h |    6 ++
>  arch/powerpc/kvm/book3s_pr.c       |   12 +++-
>  arch/powerpc/kvm/booke.c           |    9 ++---
>  arch/powerpc/kvm/powerpc.c         |   21 -
>  4 files changed, 19 insertions(+), 29 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
> index 6885846..e4474f8 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -404,6 +404,12 @@ static inline void kvmppc_fix_ee_before_entry(void)
>  	trace_hardirqs_on();
>
>  #ifdef CONFIG_PPC64
> +	/*
> +	 * To avoid races, the caller must have gone directly from having
> +	 * interrupts fully-enabled to hard-disabled.
> +	 */
> +	WARN_ON(local_paca->irq_happened != PACA_IRQ_HARD_DIS);
> +
>  	/* Only need to enable IRQs by hard enabling them after this */
>  	local_paca->irq_happened = 0;
>  	local_paca->soft_enabled = 1;
> diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
> index 0b97ce4..e61e39e 100644
> --- a/arch/powerpc/kvm/book3s_pr.c
> +++ b/arch/powerpc/kvm/book3s_pr.c
> @@ -884,14 +884,11 @@ program_interrupt:
>  		 * and if we really did time things so badly, then we just exit
>  		 * again due to a host external interrupt.
>  		 */
> -		local_irq_disable();
>  		s = kvmppc_prepare_to_enter(vcpu);
> -		if (s <= 0) {
> -			local_irq_enable();
> +		if (s <= 0)
>  			r = s;
> -		} else {
> +		else
>  			kvmppc_fix_ee_before_entry();
> -		}
>  	}
>
>  	trace_kvm_book3s_reenter(r, vcpu);
> @@ -1121,12 +1118,9 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>  	 * really did time things so badly, then we just exit again due to
>  	 * a host external interrupt.
>  	 */
> -	local_irq_disable();
>  	ret = kvmppc_prepare_to_enter(vcpu);
> -	if (ret <= 0) {
> -		local_irq_enable();
> +	if (ret <= 0)
>  		goto out;
> -	}
>
>  	/* Save FPU state in stack */
>  	if (current->thread.regs->msr & MSR_FP)
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index eb89b83..f7c0111 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -666,10 +666,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>  		return -EINVAL;
>  	}
>
> -	local_irq_disable();
>  	s = kvmppc_prepare_to_enter(vcpu);
>  	if (s <= 0) {
> -		local_irq_enable();
>  		ret = s;
>  		goto out;
>  	}
> @@ -1148,14 +1146,11 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>  	 * aren't already exiting to userspace for some other reason.
>  	 */
>  	if (!(r & RESUME_HOST)) {
> -		local_irq_disable();

Ok, now we do not soft disable before kvmppc_prepare_to_enter().

>  		s = kvmppc_prepare_to_enter(vcpu);
> -		if (s <= 0) {
> -			local_irq_enable();
> +		if (s <= 0)
>  			r = (s << 2) | RESUME_HOST | (r & RESUME_FLAG_NV);
> -		} else {
> +		else
>  			kvmppc_fix_ee_before_entry();
> -		}
>  	}
>
>  	return r;
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 4e05f8c..f8659aa 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -64,12 +64,14 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>  {
>  	int r = 1;
>
> -	WARN_ON_ONCE(!irqs_disabled());
> +	WARN_ON(irqs_disabled());
> +	hard_irq_disable();

Here we hard disable in kvmppc_prepare_to_enter(), so my comment in the other
patch about interrupt loss is no longer valid.

So here
  MSR.EE = 0
  local_paca->soft_enabled = 0
  local_paca->irq_happened |= PACA_IRQ_HARD_DIS;

> +
>  	while (true) {
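
The difference between the old sequence (soft disable first) and the new one
(fully-enabled straight to hard-disabled) can be seen in a small model of the
lazy EE bookkeeping. This is a sketch under stated assumptions, not kernel
code: struct toy_paca, the toy_* helpers and the flag values are invented for
illustration, and the interrupt-arrival rule is a simplification of what the
64-bit host does when an interrupt hits while soft-disabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag values, used only by this illustration. */
#define TOY_IRQ_HARD_DIS 0x01u
#define TOY_IRQ_EE       0x04u

/* Hypothetical stand-in for the per-CPU lazy EE state. */
struct toy_paca {
	bool msr_ee;		/* hardware interrupt enable (MSR.EE) */
	bool soft_enabled;	/* what generic kernel code believes */
	uint8_t irq_happened;	/* events recorded while disabled */
};

/* Soft disable: only the software flag changes, MSR.EE stays set. */
static void toy_local_irq_disable(struct toy_paca *p)
{
	p->soft_enabled = false;
}

/* Hard disable, as described in the reply above. */
static void toy_hard_irq_disable(struct toy_paca *p)
{
	p->msr_ee = false;
	p->soft_enabled = false;
	p->irq_happened |= TOY_IRQ_HARD_DIS;
}

/* Simplified model of an external interrupt arriving. */
static void toy_external_irq(struct toy_paca *p)
{
	if (!p->msr_ee)
		return;		/* masked in hardware, nothing is recorded */
	if (!p->soft_enabled) {
		/* Soft-disabled: record it for replay and mask in hardware. */
		p->irq_happened |= TOY_IRQ_EE;
		p->msr_ee = false;
	}
	/* Fully enabled: handled at once, nothing to record. */
}

/* The condition the new WARN_ON in kvmppc_fix_ee_before_entry() expects. */
static bool toy_entry_state_ok(const struct toy_paca *p)
{
	return p->irq_happened == TOY_IRQ_HARD_DIS;
}
```

In this model, an interrupt that lands between toy_local_irq_disable() and
toy_hard_irq_disable() leaves an extra bit in irq_happened, so clearing the
field before guest entry would drop a recorded event. Going directly to
toy_hard_irq_disable() leaves no such window.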
RE: [PATCH v2 3/4] kvm/ppc: Call trace_hardirqs_on before entry
> -----Original Message-----
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf Of
> Scott Wood
> Sent: Friday, May 10, 2013 8:40 AM
> To: Alexander Graf; Benjamin Herrenschmidt
> Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; linuxppc-...@lists.ozlabs.org;
> Wood Scott-B07421
> Subject: [PATCH v2 3/4] kvm/ppc: Call trace_hardirqs_on before entry
>
> Currently this is only being done on 64-bit. Rather than just move it
> out of the 64-bit ifdef, move it to kvm_lazy_ee_enable() so that it is
> consistent with lazy ee state, and so that we don't track more host
> code as interrupts-enabled than necessary.
>
> Rename kvm_lazy_ee_enable() to kvm_fix_ee_before_entry() to reflect
> that this function now has a role on 32-bit as well.
>
> Signed-off-by: Scott Wood <scottw...@freescale.com>
> ---
>  arch/powerpc/include/asm/kvm_ppc.h |   11 ---
>  arch/powerpc/kvm/book3s_pr.c       |    4 ++--
>  arch/powerpc/kvm/booke.c           |    4 ++--
>  arch/powerpc/kvm/powerpc.c         |    2 --
>  4 files changed, 12 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
> index a5287fe..6885846 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -394,10 +394,15 @@ static inline void kvmppc_mmu_flush_icache(pfn_t pfn)
>  	}
>  }
>
> -/* Please call after prepare_to_enter. This function puts the lazy ee state
> -   back to normal mode, without actually enabling interrupts. */
> -static inline void kvmppc_lazy_ee_enable(void)
> +/*
> + * Please call after prepare_to_enter. This function puts the lazy ee and irq
> + * disabled tracking state back to normal mode, without actually enabling
> + * interrupts.
> + */
> +static inline void kvmppc_fix_ee_before_entry(void)
>  {
> +	trace_hardirqs_on();
> +
>  #ifdef CONFIG_PPC64
>  	/* Only need to enable IRQs by hard enabling them after this */
>  	local_paca->irq_happened = 0;
> diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
> index bdc40b8..0b97ce4 100644
> --- a/arch/powerpc/kvm/book3s_pr.c
> +++ b/arch/powerpc/kvm/book3s_pr.c
> @@ -890,7 +890,7 @@ program_interrupt:
>  			local_irq_enable();
>  			r = s;
>  		} else {
> -			kvmppc_lazy_ee_enable();
> +			kvmppc_fix_ee_before_entry();
>  		}
>  	}
>
> @@ -1161,7 +1161,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>  	if (vcpu->arch.shared->msr & MSR_FP)
>  		kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
>
> -	kvmppc_lazy_ee_enable();
> +	kvmppc_fix_ee_before_entry();
>
>  	ret = __kvmppc_vcpu_run(kvm_run, vcpu);
>
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 705fc5c..eb89b83 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -673,7 +673,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>  		ret = s;
>  		goto out;
>  	}
> -	kvmppc_lazy_ee_enable();
> +	kvmppc_fix_ee_before_entry();

local_irq_disable() is called before kvmppc_prepare_to_enter().
Now we put irq_happened and soft_enabled back to their previous state without
checking whether any interrupt happened in between. If an interrupt happens in
between, will it not be lost?

-Bharat

>
>  	kvm_guest_enter();
>
> @@ -1154,7 +1154,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>  			local_irq_enable();
>  			r = (s << 2) | RESUME_HOST | (r & RESUME_FLAG_NV);
>  		} else {
> -			kvmppc_lazy_ee_enable();
> +			kvmppc_fix_ee_before_entry();
>  		}
>  	}
>
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 6316ee3..4e05f8c 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -117,8 +117,6 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>  			kvm_guest_exit();
>  			continue;
>  		}
> -
> -		trace_hardirqs_on();
>  #endif
>
>  		kvm_guest_enter();
> --
> 1.7.10.4
RE: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support
-----Original Message-----
From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf Of
Alexander Graf
Sent: Friday, May 03, 2013 6:48 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support

On 03.05.2013, at 15:11, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Friday, May 03, 2013 6:00 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support

On 03.05.2013, at 13:08, Alexander Graf wrote:

On 03.05.2013, at 12:48, Bhushan Bharat-R65777 <r65...@freescale.com> wrote:

+static void kvmppc_booke_vcpu_load_debug_regs(struct kvm_vcpu *vcpu)
+{
+	if (!vcpu->arch.debug_active)
+		return;
+
+	/* Disable all debug events and clead pending debug events */
+	mtspr(SPRN_DBCR0, 0x0);
+	kvmppc_clear_dbsr();
+
+	/*
+	 * Check whether guest still need debug resource, if not then there
+	 * is no need to restore guest context.
+	 */
+	if (!vcpu->arch.shadow_dbg_reg.dbcr0)
+		return;
+
+	/* Load Guest Context */
+	mtspr(SPRN_DBCR1, vcpu->arch.shadow_dbg_reg.dbcr1);
+	mtspr(SPRN_DBCR2, vcpu->arch.shadow_dbg_reg.dbcr2);
+#ifdef CONFIG_KVM_E500MC
+	mtspr(SPRN_DBCR4, vcpu->arch.shadow_dbg_reg.dbcr4);

You need to make sure DBCR4 is 0 when you leave things back to normal user
space. Otherwise guest debug can interfere with host debug.

ok

+#endif
+	mtspr(SPRN_IAC1, vcpu->arch.shadow_dbg_reg.iac[0]);
+	mtspr(SPRN_IAC2, vcpu->arch.shadow_dbg_reg.iac[1]);
+#if CONFIG_PPC_ADV_DEBUG_IACS > 2
+	mtspr(SPRN_IAC3, vcpu->arch.shadow_dbg_reg.iac[2]);
+	mtspr(SPRN_IAC4, vcpu->arch.shadow_dbg_reg.iac[3]);
+#endif
+	mtspr(SPRN_DAC1, vcpu->arch.shadow_dbg_reg.dac[0]);
+	mtspr(SPRN_DAC2, vcpu->arch.shadow_dbg_reg.dac[1]);
+
+	/* Enable debug events after other debug registers restored */
+	mtspr(SPRN_DBCR0, vcpu->arch.shadow_dbg_reg.dbcr0);
+}

All of the code above looks suspiciously similar to prime_debug_regs();.
Can't we somehow reuse that?

I think we can if
- Save thread->debug_regs in local data structure

Yes, it can even be on the stack.

- Load vcpu->arch->debug_regs in thread->debug_regs
- Call prime_debug_regs();
- Restore thread->debug_regs from local save values in first step

On heavyweight exit, based on the values on stack, yes.

This is how I think we can save/restore debug context. Please correct if I am
missing something.

Sounds about right :)

Actually, what happens if a guest breakpoint is set to a kernel address that
happens to be within the scope of kvm code?

You mean address of kvm code in guest or host?
If host, we already mentioned that we do not support that. Right?

QEMU wants to debug the guest at address 0xc123. kvm_run happens to be at that
address. We switch the debug registers through prime_debug_regs. Will the host
kernel receive a debug interrupt when it runs kvm_run()?

No,
On e500v2, we use DBCR1 and DBCR2 to not allow debug events when MSR.PR = 0.
On e500mc+, we use EPCR.DUVD to not allow debug events when in hypervisor mode.

-Bharat

Alex
RE: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support
+static void kvmppc_booke_vcpu_load_debug_regs(struct kvm_vcpu *vcpu)
+{
+	if (!vcpu->arch.debug_active)
+		return;
+
+	/* Disable all debug events and clead pending debug events */
+	mtspr(SPRN_DBCR0, 0x0);
+	kvmppc_clear_dbsr();
+
+	/*
+	 * Check whether guest still need debug resource, if not then there
+	 * is no need to restore guest context.
+	 */
+	if (!vcpu->arch.shadow_dbg_reg.dbcr0)
+		return;
+
+	/* Load Guest Context */
+	mtspr(SPRN_DBCR1, vcpu->arch.shadow_dbg_reg.dbcr1);
+	mtspr(SPRN_DBCR2, vcpu->arch.shadow_dbg_reg.dbcr2);
+#ifdef CONFIG_KVM_E500MC
+	mtspr(SPRN_DBCR4, vcpu->arch.shadow_dbg_reg.dbcr4);

You need to make sure DBCR4 is 0 when you leave things back to normal user
space. Otherwise guest debug can interfere with host debug.

ok

+#endif
+	mtspr(SPRN_IAC1, vcpu->arch.shadow_dbg_reg.iac[0]);
+	mtspr(SPRN_IAC2, vcpu->arch.shadow_dbg_reg.iac[1]);
+#if CONFIG_PPC_ADV_DEBUG_IACS > 2
+	mtspr(SPRN_IAC3, vcpu->arch.shadow_dbg_reg.iac[2]);
+	mtspr(SPRN_IAC4, vcpu->arch.shadow_dbg_reg.iac[3]);
+#endif
+	mtspr(SPRN_DAC1, vcpu->arch.shadow_dbg_reg.dac[0]);
+	mtspr(SPRN_DAC2, vcpu->arch.shadow_dbg_reg.dac[1]);
+
+	/* Enable debug events after other debug registers restored */
+	mtspr(SPRN_DBCR0, vcpu->arch.shadow_dbg_reg.dbcr0);
+}

All of the code above looks suspiciously similar to prime_debug_regs();.
Can't we somehow reuse that?

I think we can if
- Save thread->debug_regs in local data structure

Yes, it can even be on the stack.

- Load vcpu->arch->debug_regs in thread->debug_regs
- Call prime_debug_regs();
- Restore thread->debug_regs from local save values in first step

On heavyweight exit, based on the values on stack, yes.

This is how I think we can save/restore debug context. Please correct if I am
missing something.

1) When QEMU is running
 - thread->debug_reg == QEMU debug register context.
 - Kernel will handle switching the debug register on context switch.
 - no vcpu_load() called

2) QEMU makes ioctls (except RUN)
 - This will call vcpu_load()
 - should not change context.
 - Some ioctls can change vcpu debug register, context saved in
   vcpu->debug_regs

3) QEMU Makes RUN ioctl
 - Save thread->debug_reg on STACK
 - Store thread->debug_reg == vcpu->debug_reg
 - load thread->debug_reg
 - RUN VCPU ( So thread points to vcpu context )

4) Context switch happens When VCPU running
 - makes vcpu_load() should not load any context
 - kernel loads the vcpu context as thread->debug_regs points to vcpu context.

5) On heavyweight_exit
 - Load the context saved on stack in thread->debug_reg

Thanks
-Bharat
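
Steps 3) and 5) of the list above can be sketched as a standalone C model.
This is only an illustration under stated assumptions: struct toy_debug_regs,
struct toy_thread, struct toy_vcpu, toy_hw and the toy_* functions are
invented here, toy_prime_debug_regs() merely stands in for the host's
prime_debug_regs(), and "hardware" is a plain global variable.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced debug register set. */
struct toy_debug_regs {
	uint32_t dbcr0;
	uint32_t iac[2];
	uint32_t dac[2];
};

struct toy_thread { struct toy_debug_regs debug; };	/* like thread->debug_reg */
struct toy_vcpu   { struct toy_debug_regs debug_regs; };	/* like vcpu->debug_regs */

static struct toy_debug_regs toy_hw;		/* models the h/w registers */
static struct toy_debug_regs toy_guest_saw;	/* snapshot taken "in guest" */

/* Stand-in for prime_debug_regs(): load a context into the hardware. */
static void toy_prime_debug_regs(const struct toy_debug_regs *regs)
{
	toy_hw = *regs;
}

/* Stand-in for running the guest: record what the hardware holds. */
static void toy_enter_guest(void)
{
	toy_guest_saw = toy_hw;
}

static void toy_vcpu_run(struct toy_thread *thread, struct toy_vcpu *vcpu)
{
	/* Step 3: save the user space (QEMU) context on the stack. */
	struct toy_debug_regs saved = thread->debug;

	/* Step 3: make the thread context point at the vcpu context, load it. */
	thread->debug = vcpu->debug_regs;
	toy_prime_debug_regs(&thread->debug);

	/*
	 * Step 4 needs nothing extra: a context switch while the guest runs
	 * reloads thread->debug, which is the vcpu context already.
	 */
	toy_enter_guest();

	/* Step 5: heavyweight exit, restore what was saved on the stack. */
	thread->debug = saved;
	toy_prime_debug_regs(&thread->debug);
}
```

The point of the design is that no separate guest load/restore routine is
needed: the thread's own debug context is swapped, so the existing host
context-switch path keeps the right values in hardware.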
RE: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Friday, May 03, 2013 6:00 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support

On 03.05.2013, at 13:08, Alexander Graf wrote:

On 03.05.2013, at 12:48, Bhushan Bharat-R65777 <r65...@freescale.com> wrote:

+static void kvmppc_booke_vcpu_load_debug_regs(struct kvm_vcpu *vcpu)
+{
+	if (!vcpu->arch.debug_active)
+		return;
+
+	/* Disable all debug events and clead pending debug events */
+	mtspr(SPRN_DBCR0, 0x0);
+	kvmppc_clear_dbsr();
+
+	/*
+	 * Check whether guest still need debug resource, if not then there
+	 * is no need to restore guest context.
+	 */
+	if (!vcpu->arch.shadow_dbg_reg.dbcr0)
+		return;
+
+	/* Load Guest Context */
+	mtspr(SPRN_DBCR1, vcpu->arch.shadow_dbg_reg.dbcr1);
+	mtspr(SPRN_DBCR2, vcpu->arch.shadow_dbg_reg.dbcr2);
+#ifdef CONFIG_KVM_E500MC
+	mtspr(SPRN_DBCR4, vcpu->arch.shadow_dbg_reg.dbcr4);

You need to make sure DBCR4 is 0 when you leave things back to normal user
space. Otherwise guest debug can interfere with host debug.

ok

+#endif
+	mtspr(SPRN_IAC1, vcpu->arch.shadow_dbg_reg.iac[0]);
+	mtspr(SPRN_IAC2, vcpu->arch.shadow_dbg_reg.iac[1]);
+#if CONFIG_PPC_ADV_DEBUG_IACS > 2
+	mtspr(SPRN_IAC3, vcpu->arch.shadow_dbg_reg.iac[2]);
+	mtspr(SPRN_IAC4, vcpu->arch.shadow_dbg_reg.iac[3]);
+#endif
+	mtspr(SPRN_DAC1, vcpu->arch.shadow_dbg_reg.dac[0]);
+	mtspr(SPRN_DAC2, vcpu->arch.shadow_dbg_reg.dac[1]);
+
+	/* Enable debug events after other debug registers restored */
+	mtspr(SPRN_DBCR0, vcpu->arch.shadow_dbg_reg.dbcr0);
+}

All of the code above looks suspiciously similar to prime_debug_regs();.
Can't we somehow reuse that?

I think we can if
- Save thread->debug_regs in local data structure

Yes, it can even be on the stack.

- Load vcpu->arch->debug_regs in thread->debug_regs
- Call prime_debug_regs();
- Restore thread->debug_regs from local save values in first step

On heavyweight exit, based on the values on stack, yes.

This is how I think we can save/restore debug context. Please correct if I am
missing something.

Sounds about right :)

Actually, what happens if a guest breakpoint is set to a kernel address that
happens to be within the scope of kvm code?

You mean address of kvm code in guest or host?
If host, we already mentioned that we do not support that. Right?

-Bharat

We do accept debug events between vcpu_run and the assembly code, right?

Alex
RE: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support
+static void kvmppc_booke_vcpu_load_debug_regs(struct kvm_vcpu *vcpu)
+{
+	if (!vcpu->arch.debug_active)
+		return;
+
+	/* Disable all debug events and clear pending debug events */
+	mtspr(SPRN_DBCR0, 0x0);
+	kvmppc_clear_dbsr();
+
+	/*
+	 * Check whether guest still need debug resource, if not then there
+	 * is no need to restore guest context.
+	 */
+	if (!vcpu->arch.shadow_dbg_reg.dbcr0)
+		return;
+
+	/* Load Guest Context */
+	mtspr(SPRN_DBCR1, vcpu->arch.shadow_dbg_reg.dbcr1);
+	mtspr(SPRN_DBCR2, vcpu->arch.shadow_dbg_reg.dbcr2);
+#ifdef CONFIG_KVM_E500MC
+	mtspr(SPRN_DBCR4, vcpu->arch.shadow_dbg_reg.dbcr4);

You need to make sure DBCR4 is 0 when you leave things back to normal user space. Otherwise guest debug can interfere with host debug.

ok

+#endif
+	mtspr(SPRN_IAC1, vcpu->arch.shadow_dbg_reg.iac[0]);
+	mtspr(SPRN_IAC2, vcpu->arch.shadow_dbg_reg.iac[1]);
+#if CONFIG_PPC_ADV_DEBUG_IACS > 2
+	mtspr(SPRN_IAC3, vcpu->arch.shadow_dbg_reg.iac[2]);
+	mtspr(SPRN_IAC4, vcpu->arch.shadow_dbg_reg.iac[3]);
+#endif
+	mtspr(SPRN_DAC1, vcpu->arch.shadow_dbg_reg.dac[0]);
+	mtspr(SPRN_DAC2, vcpu->arch.shadow_dbg_reg.dac[1]);
+
+	/* Enable debug events after other debug registers restored */
+	mtspr(SPRN_DBCR0, vcpu->arch.shadow_dbg_reg.dbcr0);
+}

All of the code above looks suspiciously similar to prime_debug_regs();. Can't we somehow reuse that?

I think we can if
- Save thread->debug_regs in local data structure

Yes, it can even be on the stack.

- Load vcpu->arch->debug_regs in thread->debug_regs
- Call prime_debug_regs();
- Restore thread->debug_regs from local save values in first step

On heavyweight exit, based on the values on stack, yes.

This is how I think we can save/restore debug context. Please correct if I am missing something.

1) When QEMU is running
   - thread->debug_reg == QEMU debug register context.
   - Kernel will handle switching the debug register on context switch.
   - no vcpu_load() called

2) QEMU makes ioctls (except RUN)
   - This will call vcpu_load()
   - should not change context.
   - Some ioctls can change vcpu debug register, context saved in vcpu->debug_regs

3) QEMU makes RUN ioctl
   - Save thread->debug_reg on STACK
   - Store thread->debug_reg == vcpu->debug_reg
   - load thread->debug_reg
   - RUN VCPU (so thread points to vcpu context)

4) Context switch happens when VCPU running
   - makes vcpu_load() should not load any context
   - kernel loads the vcpu context as thread->debug_regs points to vcpu context.

5) On heavyweight_exit
   - Load the context saved on stack in thread->debug_reg

Thanks
-Bharat
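The save/swap/restore sequence proposed in steps 3 and 5 can be modelled in plain C. This is only a user-space sketch under stated assumptions: `struct debug_regs`, `hw_regs`, `thread_regs` and every function name below are hypothetical stand-ins, not the kernel's types or the real prime_debug_regs(), and the "hardware" is just a global variable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-thread debug register block. */
struct debug_regs {
	uint32_t dbcr0, dbcr1, dbcr2;
	uint32_t iac[4];
	uint32_t dac[2];
};

static struct debug_regs hw_regs;     /* models what the CPU currently holds */
static struct debug_regs thread_regs; /* models thread->debug_regs */

/* Models prime_debug_regs(): load the thread's values into "hardware". */
static void prime_debug_regs_model(const struct debug_regs *r)
{
	hw_regs = *r;
}

/* Step 3: save the host context, then make the thread carry the vcpu context. */
void debug_enter_guest(struct debug_regs *saved, const struct debug_regs *vcpu_regs)
{
	assert(saved && vcpu_regs);
	*saved = thread_regs;
	thread_regs = *vcpu_regs;
	prime_debug_regs_model(&thread_regs);
}

/* Step 5: on heavyweight exit, restore the context saved on the stack. */
void debug_exit_guest(const struct debug_regs *saved)
{
	assert(saved);
	thread_regs = *saved;
	prime_debug_regs_model(&thread_regs);
}

/* Runs one enter/exit cycle; returns 1 if the guest value was loaded while
 * "running" and the host value is back in place afterwards. */
int debug_roundtrip_ok(uint32_t host_dbcr0, uint32_t guest_dbcr0)
{
	struct debug_regs saved; /* lives on the stack, as discussed */
	struct debug_regs vcpu = { .dbcr0 = guest_dbcr0 };
	int guest_loaded;

	thread_regs = (struct debug_regs){ .dbcr0 = host_dbcr0 };
	prime_debug_regs_model(&thread_regs);

	debug_enter_guest(&saved, &vcpu);
	guest_loaded = (hw_regs.dbcr0 == guest_dbcr0);
	debug_exit_guest(&saved);

	return guest_loaded &&
	       hw_regs.dbcr0 == host_dbcr0 &&
	       thread_regs.dbcr0 == host_dbcr0;
}
```

Because the thread structure holds the vcpu context for the whole RUN ioctl, an ordinary context switch in step 4 reloads the guest values without any extra work in vcpu_load(), which is the point of the proposal.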
RE: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Friday, April 26, 2013 4:46 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support

On 08.04.2013, at 12:32, Bharat Bhushan wrote:

From: Bharat Bhushan <bharat.bhus...@freescale.com>

This patch adds the debug stub support on booke/bookehv. Now QEMU debug stub can use hw breakpoint, watchpoint and software breakpoint to debug guest.

Debug registers are saved/restored on vcpu_put()/vcpu_get(). Also the debug registers are saved restored only if guest is using debug resources.

Currently we do not support debug resource emulation to guest, so always exit to user space irrespective of user space is expecting the debug exception or not. This is unexpected event and let us leave the action on user space. This is similar to what it was before, only thing is that now we have proper exit state available to user space.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/include/asm/kvm_host.h |    8 +
 arch/powerpc/include/uapi/asm/kvm.h |   22 +++-
 arch/powerpc/kvm/booke.c            |  242 ---
 arch/powerpc/kvm/booke.h            |    5 +
 4 files changed, 255 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index e34f8fe..b9ad20f 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -505,7 +505,15 @@ struct kvm_vcpu_arch {
 	u32 mmucfg;
 	u32 epr;
 	u32 crit_save;
+
+	/* Flag indicating that debug registers are used by guest */
+	bool debug_active;
+	/* for save/restore thread->dbcr0 on vcpu run/heavyweight_exit */
+	u32 saved_dbcr0;
+	/* guest debug registers*/
 	struct kvmppc_booke_debug_reg dbg_reg;
+	/* shadow debug registers */
+	struct kvmppc_booke_debug_reg shadow_dbg_reg;
 #endif
 	gpa_t paddr_accessed;
 	gva_t vaddr_accessed;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index c0c38ed..d7ce449 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -25,6 +25,7 @@
 /* Select powerpc specific features in <linux/kvm.h> */
 #define __KVM_HAVE_SPAPR_TCE
 #define __KVM_HAVE_PPC_SMT
+#define __KVM_HAVE_GUEST_DEBUG

 struct kvm_regs {
 	__u64 pc;
@@ -267,7 +268,24 @@ struct kvm_fpu {
 	__u64 fpr[32];
 };

+/*
+ * Defines for h/w breakpoint, watchpoint (read, write or both) and
+ * software breakpoint.
+ * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
+ * for KVM_DEBUG_EXIT.
+ */
+#define KVMPPC_DEBUG_NONE		0x0
+#define KVMPPC_DEBUG_BREAKPOINT		(1UL << 1)
+#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
+#define KVMPPC_DEBUG_WATCH_READ		(1UL << 3)
 struct kvm_debug_exit_arch {
+	__u64 address;
+	/*
+	 * exiting to userspace because of h/w breakpoint, watchpoint
+	 * (read, write or both) and software breakpoint.
+	 */
+	__u32 status;
+	__u32 reserved;
 };

 /* for KVM_SET_GUEST_DEBUG */
@@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
 		 * Type denotes h/w breakpoint, read watchpoint, write
 		 * watchpoint or watchpoint (both read and write).
 		 */
-#define KVMPPC_DEBUG_NONE		0x0
-#define KVMPPC_DEBUG_BREAKPOINT		(1UL << 1)
-#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
-#define KVMPPC_DEBUG_WATCH_READ		(1UL << 3)
 		__u32 type;
 		__u32 reserved;
 	} bp[16];
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 97ae158..0e93416 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -133,6 +133,29 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
 #endif
 }

+static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
+{
+	/* Synchronize guest's desire to get debug interrupts into shadow MSR */
+#ifndef CONFIG_KVM_BOOKE_HV
+	vcpu->arch.shadow_msr &= ~MSR_DE;
+	vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
+#endif
+
+	/* Force enable debug interrupts when user space wants to debug */
+	if (vcpu->guest_debug) {
+#ifdef CONFIG_KVM_BOOKE_HV
+		/*
+		 * Since there is no shadow MSR, sync MSR_DE into the guest
+		 * visible MSR.
+		 */
+		vcpu->arch.shared->msr |= MSR_DE;
+#else
+		vcpu->arch.shadow_msr |= MSR_DE;
+		vcpu->arch.shared->msr &= ~MSR_DE;
+#endif
+	}
+}
+
 /*
  * Helper function for full MSR writes. No need to call this if only
  * EE/CE/ME/DE/RI are changing.
@@ -150,6 +173,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr
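As an illustration of how user space might interpret the `status` field of the debug exit, here is a small sketch. It assumes the shift operators dropped from the archived text are `<<`, giving bits 1, 2 and 3. `struct debug_exit`, `enum debug_kind` and both functions are hypothetical names that only mirror the quoted layout; the patch does not say how a software breakpoint is told apart from a hardware one, so the sketch does not try to.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as quoted in the patch (shift operators assumed to be <<). */
#define KVMPPC_DEBUG_NONE         0x0
#define KVMPPC_DEBUG_BREAKPOINT   (1UL << 1)
#define KVMPPC_DEBUG_WATCH_WRITE  (1UL << 2)
#define KVMPPC_DEBUG_WATCH_READ   (1UL << 3)

/* Local mirror of the quoted kvm_debug_exit_arch layout (hypothetical name). */
struct debug_exit {
	uint64_t address;
	uint32_t status;
	uint32_t reserved;
};

/* Hypothetical classification used only by this sketch. */
enum debug_kind {
	DEBUG_KIND_NONE,
	DEBUG_KIND_BREAKPOINT,
	DEBUG_KIND_WATCH_WRITE,
	DEBUG_KIND_WATCH_READ,
	DEBUG_KIND_WATCH_ACCESS /* both read and write bits set */
};

/* Classifies a raw status word by testing the flag bits. */
enum debug_kind debug_status_kind(uint32_t status)
{
	unsigned long watch = status &
		(KVMPPC_DEBUG_WATCH_WRITE | KVMPPC_DEBUG_WATCH_READ);

	if (status & KVMPPC_DEBUG_BREAKPOINT)
		return DEBUG_KIND_BREAKPOINT;
	if (watch == (KVMPPC_DEBUG_WATCH_WRITE | KVMPPC_DEBUG_WATCH_READ))
		return DEBUG_KIND_WATCH_ACCESS;
	if (watch == KVMPPC_DEBUG_WATCH_WRITE)
		return DEBUG_KIND_WATCH_WRITE;
	if (watch == KVMPPC_DEBUG_WATCH_READ)
		return DEBUG_KIND_WATCH_READ;
	return DEBUG_KIND_NONE;
}

/* Convenience wrapper for a whole exit record. */
enum debug_kind debug_exit_kind(const struct debug_exit *e)
{
	assert(e);
	return debug_status_kind(e->status);
}
```

Sharing one set of flags between the `type` of a requested breakpoint and the `status` of an exit is what lets the defines move out of struct kvm_guest_debug_arch in the patch.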
RE: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, May 02, 2013 4:35 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support

On 02.05.2013, at 11:46, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Friday, April 26, 2013 4:46 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support

On 08.04.2013, at 12:32, Bharat Bhushan wrote:

From: Bharat Bhushan <bharat.bhus...@freescale.com>

This patch adds the debug stub support on booke/bookehv. Now QEMU debug stub can use hw breakpoint, watchpoint and software breakpoint to debug guest.

Debug registers are saved/restored on vcpu_put()/vcpu_get(). Also the debug registers are saved restored only if guest is using debug resources.

Currently we do not support debug resource emulation to guest, so always exit to user space irrespective of user space is expecting the debug exception or not. This is unexpected event and let us leave the action on user space. This is similar to what it was before, only thing is that now we have proper exit state available to user space.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/include/asm/kvm_host.h |    8 +
 arch/powerpc/include/uapi/asm/kvm.h |   22 +++-
 arch/powerpc/kvm/booke.c            |  242 ---
 arch/powerpc/kvm/booke.h            |    5 +
 4 files changed, 255 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index e34f8fe..b9ad20f 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -505,7 +505,15 @@ struct kvm_vcpu_arch {
 	u32 mmucfg;
 	u32 epr;
 	u32 crit_save;
+
+	/* Flag indicating that debug registers are used by guest */
+	bool debug_active;
+	/* for save/restore thread->dbcr0 on vcpu run/heavyweight_exit */
+	u32 saved_dbcr0;
+	/* guest debug registers*/
 	struct kvmppc_booke_debug_reg dbg_reg;
+	/* shadow debug registers */
+	struct kvmppc_booke_debug_reg shadow_dbg_reg;
 #endif
 	gpa_t paddr_accessed;
 	gva_t vaddr_accessed;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index c0c38ed..d7ce449 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -25,6 +25,7 @@
 /* Select powerpc specific features in <linux/kvm.h> */
 #define __KVM_HAVE_SPAPR_TCE
 #define __KVM_HAVE_PPC_SMT
+#define __KVM_HAVE_GUEST_DEBUG

 struct kvm_regs {
 	__u64 pc;
@@ -267,7 +268,24 @@ struct kvm_fpu {
 	__u64 fpr[32];
 };

+/*
+ * Defines for h/w breakpoint, watchpoint (read, write or both) and
+ * software breakpoint.
+ * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
+ * for KVM_DEBUG_EXIT.
+ */
+#define KVMPPC_DEBUG_NONE		0x0
+#define KVMPPC_DEBUG_BREAKPOINT		(1UL << 1)
+#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
+#define KVMPPC_DEBUG_WATCH_READ		(1UL << 3)
 struct kvm_debug_exit_arch {
+	__u64 address;
+	/*
+	 * exiting to userspace because of h/w breakpoint, watchpoint
+	 * (read, write or both) and software breakpoint.
+	 */
+	__u32 status;
+	__u32 reserved;
 };

 /* for KVM_SET_GUEST_DEBUG */
@@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
 		 * Type denotes h/w breakpoint, read watchpoint, write
 		 * watchpoint or watchpoint (both read and write).
 		 */
-#define KVMPPC_DEBUG_NONE		0x0
-#define KVMPPC_DEBUG_BREAKPOINT		(1UL << 1)
-#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
-#define KVMPPC_DEBUG_WATCH_READ		(1UL << 3)
 		__u32 type;
 		__u32 reserved;
 	} bp[16];
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 97ae158..0e93416 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -133,6 +133,29 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
 #endif
 }

+static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
+{
+	/* Synchronize guest's desire to get debug interrupts into shadow MSR */
+#ifndef CONFIG_KVM_BOOKE_HV
+	vcpu->arch.shadow_msr &= ~MSR_DE;
+	vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
+#endif
+
+	/* Force enable debug interrupts when user space wants to debug */
+	if (vcpu->guest_debug) {
+#ifdef CONFIG_KVM_BOOKE_HV
+		/*
+		 * Since there is no shadow MSR, sync MSR_DE into the guest
+		 * visible MSR.
+		 */
+		vcpu->arch.shared->msr |= MSR_DE;
+#else
+		vcpu->arch.shadow_msr |= MSR_DE
RE: [PATCH] ppc: initialize GPRs as per epapr
This was supposed to go to qemu-devel. Please ignore this patch.

Thanks
-Bharat

-----Original Message-----
From: Bhushan Bharat-R65777
Sent: Friday, April 26, 2013 11:44 AM
To: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-B07421
Cc: Bhushan Bharat-R65777; Bhushan Bharat-R65777; Yoder Stuart-B08248
Subject: [PATCH] ppc: initialize GPRs as per epapr

ePAPR defines the initial values of cpu registers. This patch initializes the GPRs as per the ePAPR specification. This resolves the issue of guest reboot/reset (guest hang on reboot).

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
Signed-off-by: Stuart Yoder <stuart.yo...@freescale.com>
---
 hw/ppc/e500.c |    7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
index c1bdb6b..a47f976 100644
--- a/hw/ppc/e500.c
+++ b/hw/ppc/e500.c
@@ -37,6 +37,7 @@
 #include "qemu/host-utils.h"
 #include "hw/pci-host/ppce500.h"

+#define EPAPR_MAGIC                (0x45504150)
 #define BINARY_DEVICE_TREE_FILE    "mpc8544ds.dtb"
 #define UIMAGE_LOAD_BASE           0
 #define DTC_LOAD_PAD               0x180
@@ -444,6 +445,12 @@ static void ppce500_cpu_reset(void *opaque)
     cs->halted = 0;
     env->gpr[1] = (16<<20) - 8;
     env->gpr[3] = bi->dt_base;
+    env->gpr[4] = 0;
+    env->gpr[5] = 0;
+    env->gpr[6] = EPAPR_MAGIC;
+    env->gpr[7] = (64 * 1024 * 1024);
+    env->gpr[8] = 0;
+    env->gpr[9] = 0;
     env->nip = bi->entry;
     mmubooke_create_initial_mapping(env);
 }
--
1.7.0.4
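The register values set at reset by the patch can be summarised in a stand-alone sketch. Everything here is illustrative: `struct cpu_state`, `epapr_init_gprs()` and `epapr_reset_gpr()` are hypothetical names rather than QEMU APIs, and the 64 MiB value for r7 is simply the constant used in the patch, taken here to be the size of the initially mapped area.

```c
#include <assert.h>
#include <stdint.h>

#define EPAPR_MAGIC       0x45504150u           /* bytes 'E','P','A','P' */
#define INITIAL_MAP_SIZE  (64u * 1024 * 1024)   /* r7 value used in the patch */

/* Hypothetical, minimal CPU state: only the general-purpose registers. */
struct cpu_state {
	uint64_t gpr[32];
};

/* Sets r3..r9 the way the patch does at reset.
 * dt_base is the address of the device tree blob (bi->dt_base in the patch). */
void epapr_init_gprs(struct cpu_state *cpu, uint64_t dt_base)
{
	assert(cpu);
	cpu->gpr[3] = dt_base;
	cpu->gpr[4] = 0;
	cpu->gpr[5] = 0;
	cpu->gpr[6] = EPAPR_MAGIC;
	cpu->gpr[7] = INITIAL_MAP_SIZE;
	cpu->gpr[8] = 0;
	cpu->gpr[9] = 0;
}

/* Returns the value register n holds after reset, starting from stale
 * contents, to show that a reboot no longer inherits old register values. */
uint64_t epapr_reset_gpr(unsigned int n, uint64_t dt_base)
{
	struct cpu_state cpu;
	unsigned int i;

	assert(n < 32);
	for (i = 0; i < 32; i++)
		cpu.gpr[i] = 0xdeadbeefu; /* leftovers from the previous boot */
	epapr_init_gprs(&cpu, dt_base);
	return cpu.gpr[n];
}
```

Clearing r4, r5, r8 and r9 explicitly matters on reset rather than first boot, since the registers may still hold whatever the guest left in them; that fits the reboot hang the patch description mentions.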
RE: [PATCH] KVM : PPC : cache flush for kernel managed pages
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, April 25, 2013 8:36 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH] KVM : PPC : cache flush for kernel managed pages

On 23.04.2013, at 08:39, Bharat Bhushan wrote:

The kernel should only try flushing pages which are managed by the kernel. pfn_to_page() returns a junk struct page for pages not managed by the kernel, so if the kernel tries to flush direct mapped memory or a direct assigned device mapping, it will work on a junk struct page.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/kvm/e500_mmu_host.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..e07da21 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -455,7 +455,8 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 				ref, gvaddr, stlbe);

 	/* Clear i-cache for new pages */
-	kvmppc_mmu_flush_icache(pfn);
+	if (pfn_valid(pfn))
+		kvmppc_mmu_flush_icache(pfn);

Could you please move the check into kvmppc_mmu_flush_icache()? That way we're guaranteed we can't screw up cache flushes ever :). Also, please add a comment saying why we need this.

Ok

-Bharat

Alex

 	/* Drop refcount on page, so that mmu notifiers can clear it */
 	kvm_release_pfn_clean(pfn);
--
1.7.0.4
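The review request above, moving the validity check into the flush helper so that no call site can forget it, might look roughly like this. It is a user-space model only: `model_pfn_valid()`, `model_flush_page()`, `MODEL_MAX_RAM_PFN` and the flush counter are hypothetical stand-ins for pfn_valid() and the real cache maintenance, and nothing here is the code that was eventually merged.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long pfn_t;

/* Hypothetical model: only frames below this number have a struct page. */
#define MODEL_MAX_RAM_PFN 0x1000UL

static unsigned long flush_count; /* number of flushes actually performed */

/* Stand-in for the kernel's pfn_valid(). */
static bool model_pfn_valid(pfn_t pfn)
{
	return pfn < MODEL_MAX_RAM_PFN;
}

/* Stand-in for the real cache maintenance on a kernel managed page. */
static void model_flush_page(pfn_t pfn)
{
	/* Reaching this with an unmanaged frame would mean touching junk. */
	assert(model_pfn_valid(pfn));
	flush_count++;
}

/* The check lives inside the helper, so every caller is covered. */
void flush_icache_checked(pfn_t pfn)
{
	/*
	 * Frames without a struct page (direct mapped memory, direct assigned
	 * device mappings) are not managed by the kernel; skip them.
	 */
	if (!model_pfn_valid(pfn))
		return;
	model_flush_page(pfn);
}

/* Returns how many flushes a single call performed for the given frame. */
unsigned long flushes_for(pfn_t pfn)
{
	flush_count = 0;
	flush_icache_checked(pfn);
	return flush_count;
}
```

With the guard in the helper, the call site in kvmppc_e500_shadow_map() could stay a plain unconditional call, and the comment requested in the review has one obvious home.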
RE: [PATCH] KVM/PPC: emulate ehpriv
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Friday, April 19, 2013 5:44 PM
To: Tiejun Chen
Cc: kvm@vger.kernel.org mailing list; kvm-...@vger.kernel.org; Bhushan Bharat-R65777
Subject: Re: [PATCH] KVM/PPC: emulate ehpriv

On 19.04.2013, at 04:44, Tiejun Chen wrote:

We can provide this emulation to simplify more extension later.

Works for me, but this should really be part of a series that makes use of ehpriv.

Alex, this already planned to be in my debug patches. I know you are busy and I am just waiting for other patches to be reviewed :)

-Bharat

Alex

Signed-off-by: Tiejun Chen <tiejun.c...@windriver.com>
---
 arch/powerpc/include/asm/disassemble.h |    4
 arch/powerpc/kvm/e500_emulate.c        |   17 +
 2 files changed, 21 insertions(+)

diff --git a/arch/powerpc/include/asm/disassemble.h b/arch/powerpc/include/asm/disassemble.h
index 9b198d1..856f8de 100644
--- a/arch/powerpc/include/asm/disassemble.h
+++ b/arch/powerpc/include/asm/disassemble.h
@@ -77,4 +77,8 @@ static inline unsigned int get_d(u32 inst)
 	return inst & 0x;
 }

+static inline unsigned int get_oc(u32 inst)
+{
+	return (inst >> 11) & 0x7fff;
+}
 #endif /* __ASM_PPC_DISASSEMBLE_H__ */
diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c
index e78f353..36492cf 100644
--- a/arch/powerpc/kvm/e500_emulate.c
+++ b/arch/powerpc/kvm/e500_emulate.c
@@ -26,6 +26,7 @@
 #define XOP_TLBRE   946
 #define XOP_TLBWE   978
 #define XOP_TLBILX  18
+#define XOP_EHPRIV  270

 #ifdef CONFIG_KVM_E500MC
 static int dbell2prio(ulong param)
@@ -80,6 +81,18 @@ static int kvmppc_e500_emul_msgsnd(struct kvm_vcpu *vcpu, int rb)

 	return EMULATE_DONE;
 }
+
+static int kvmppc_e500_emul_ehpriv(struct kvm_run *run, struct kvm_vcpu *vcpu,
+				   unsigned int inst)
+{
+	int emulated = EMULATE_DONE;
+
+	switch (get_oc(inst)) {
+	default:
+		emulated = EMULATE_FAIL;
+	}
+	return emulated;
+}
 #endif

 int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
@@ -130,6 +143,10 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		emulated = kvmppc_e500_emul_tlbivax(vcpu, ea);
 		break;

+	case XOP_EHPRIV:
+		emulated = kvmppc_e500_emul_ehpriv(run, vcpu, inst);
+		break;
+
 	default:
 		emulated = EMULATE_FAIL;
 	}
--
1.7.9.5
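To make the field extraction concrete, the sketch below builds an ehpriv instruction word and decodes it again. It assumes the operators stripped from the archived text are `>>` and `&`, and that ehpriv sits under primary opcode 31 like the other XOP_* cases handled in this file. `make_ehpriv()` is a hypothetical helper for illustration, not a kernel function, and the local `get_op()`/`get_xop()` are written here only to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>

#define XOP_EHPRIV 270

/* Primary opcode: the top 6 bits of the instruction word. */
unsigned int get_op(uint32_t inst)
{
	return inst >> 26;
}

/* Extended opcode: 10 bits just above the lowest bit. */
unsigned int get_xop(uint32_t inst)
{
	return (inst >> 1) & 0x3ff;
}

/* OC field as extracted by the patch: 15 bits starting at bit 11. */
unsigned int get_oc(uint32_t inst)
{
	return (inst >> 11) & 0x7fff;
}

/* Builds an ehpriv word for a given OC value (illustrative helper only). */
uint32_t make_ehpriv(unsigned int oc)
{
	assert(oc <= 0x7fff);
	return (31u << 26) | ((uint32_t)oc << 11) | ((uint32_t)XOP_EHPRIV << 1);
}
```

Since the OC field does not overlap the primary or extended opcode bits, a later series can presumably assign individual OC values to separate services by adding cases to the currently empty switch in kvmppc_e500_emul_ehpriv().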
RE: RFC: vfio API changes needed for powerpc (v3)
So now the sequence would be something like:

1) VFIO_GROUP_SET_CONTAINER              // add groups to the container
2) VFIO_SET_IOMMU(VFIO_FSL_PAMU)         // set iommu model
3) count = VFIO_IOMMU_GET_MSI_BANK_COUNT // returns max # of MSI banks
4) VFIO_IOMMU_SET_ATTR(ATTR_GEOMETRY)    // set overall aperture
5) VFIO_IOMMU_SET_ATTR(ATTR_WINDOWS)     // set # of windows, including MSI banks
6) for (i = 0; i < count; i++)
       VFIO_IOMMU_PAMU_MAP_MSI_BANK()    // map the MSI banks, do not enable aperture here
7) Memory listener will call VFIO_IOMMU_MAP_DMA // map the guest's memory
   ---> kernel enables the aperture here, on the first VFIO_IOMMU_MAP_DMA
8) VFIO_DEVICE_SET_IRQS
   ---> VFIO in the kernel makes the pci_enable_msix()/pci_enable_msi_block() calls; this sets the actual MSI addr/data in the physical device.
   ---> As the address set by the above APIs is not what we want:
        - if using MSI-X, VFIO will update the address in the MSI-X table
        - if using MSI, VFIO will update the MSI address in PCI configuration space

Thanks
-Bharat

-----Original Message-----
From: Yoder Stuart-B08248
Sent: Friday, April 05, 2013 3:40 AM
To: Alex Williamson
Cc: Wood Scott-B07421; ag...@suse.de; Bhushan Bharat-R65777; Sethi Varun-B16395; kvm@vger.kernel.org; qemu-de...@nongnu.org; io...@lists.linux-foundation.org
Subject: RFC: vfio API changes needed for powerpc (v3)

-v3 updates
   -made vfio_pamu_attr a union, added flags
   -s/VFIO_PAMU_/VFIO_IOMMU_PAMU_/ for the ioctls to make it more clear which fd is being operated on
   -added flags to vfio_pamu_msi_bank_map/umap
   -VFIO_PAMU_GET_MSI_BANK_COUNT now just returns a __u32 not a struct
   -fixed some typos

The Freescale PAMU is an aperture-based IOMMU with the following characteristics. Each device has an entry in a table in memory describing the iova->phys mapping.
The mapping has:
   -an overall aperture that is power of 2 sized, and has a start iova that is naturally aligned
   -has 1 or more windows within the aperture
      -number of windows must be power of 2, max is 256
      -size of each window is determined by aperture size / # of windows
      -iova of each window is determined by aperture start iova / # of windows
      -the mapped region in each window can be different than the window size...mapping must be a power of 2
      -physical address of the mapping must be naturally aligned with the mapping size

These ioctls operate on the VFIO file descriptor (/dev/vfio/vfio).

/*
 * VFIO_IOMMU_PAMU_GET_ATTR
 *
 * Gets the iommu attributes for the current vfio container.  This
 * ioctl is applicable to an iommu type of VFIO_PAMU only.
 * Caller sets argsz and attribute.  The ioctl fills in
 * the provided struct vfio_pamu_attr based on the attribute
 * value that was set.
 * Return: 0 on success, -errno on failure
 */
struct vfio_pamu_attr {
        __u32   argsz;
        __u32   flags;      /* no flags currently */
        __u32   attribute;

        union {
                /* VFIO_ATTR_GEOMETRY */
                struct {
                        __u64 aperture_start; /* first addr that can be mapped */
                        __u64 aperture_end;   /* last addr that can be mapped */
                } attr;

                /* VFIO_ATTR_WINDOWS */
                __u32 windows;  /* number of windows in the aperture */
                                /* initially this will be the max number
                                 * of windows that can be set */

                /* VFIO_ATTR_PAMU_STASH */
                struct {
                        __u32 cpu;      /* CPU number for stashing */
                        __u32 cache;    /* cache ID for stashing */
                } stash;
        };
};
#define VFIO_IOMMU_PAMU_GET_ATTR  _IO(VFIO_TYPE, VFIO_BASE + x, struct vfio_pamu_attr)

/*
 * VFIO_IOMMU_PAMU_SET_ATTR
 *
 * Sets the iommu attributes for the current vfio container.  This
 * ioctl is applicable to an iommu type of VFIO_PAMU only.
 * Caller sets struct vfio_pamu_attr, including argsz and attribute, and
 * sets any fields that are valid for the attribute.
 * Return: 0 on success, -errno on failure
 */
#define VFIO_IOMMU_PAMU_SET_ATTR  _IO(VFIO_TYPE, VFIO_BASE + x, struct vfio_pamu_attr)

/*
 * VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT
 *
 * Returns the number of MSI banks for this platform.  This tells user space
 * how many aperture windows should be reserved for MSI banks when setting
 * the PAMU geometry and window count.
 * Return: __u32 bank count on success, -errno on failure
 */
#define VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT _IO(VFIO_TYPE, VFIO_BASE + x, __u32)

/*
 * VFIO_IOMMU_PAMU_MAP_MSI_BANK
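The geometry constraints listed in the RFC (power-of-2 aperture, naturally aligned start, power-of-2 window count of at most 256, equal-sized windows) come down to a few lines of arithmetic. The sketch below is illustrative only: struct pamu_geometry and the three functions are hypothetical names that are not part of the proposed API, and the window start is assumed to be the aperture start plus the window index times the window size.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAMU_MAX_WINDOWS 256u	/* "max is 256" in the RFC text */

/* Hypothetical container for the values set via ATTR_GEOMETRY/ATTR_WINDOWS. */
struct pamu_geometry {
	uint64_t aperture_start;	/* first addr that can be mapped */
	uint64_t aperture_size;		/* aperture_end - aperture_start + 1 */
	uint32_t windows;		/* number of windows in the aperture */
};

static bool is_pow2(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Check the constraints the RFC lists for an aperture and its windows. */
static bool pamu_geometry_valid(const struct pamu_geometry *g)
{
	if (!is_pow2(g->aperture_size))
		return false;
	/* naturally aligned: start is a multiple of the aperture size */
	if (g->aperture_start & (g->aperture_size - 1))
		return false;
	if (!is_pow2(g->windows) || g->windows > PAMU_MAX_WINDOWS)
		return false;
	/* every window must cover at least one byte */
	return g->windows <= g->aperture_size;
}

/* Size of each window: aperture size / # of windows. */
static uint64_t pamu_window_size(const struct pamu_geometry *g)
{
	return g->aperture_size / g->windows;
}

/* Assumption: window i starts i window-sizes above the aperture start. */
static uint64_t pamu_window_start(const struct pamu_geometry *g, uint32_t i)
{
	return g->aperture_start + (uint64_t)i * pamu_window_size(g);
}
```

For example, a 1 GiB aperture at 0x40000000 split into 4 windows gives 256 MiB windows, the last of which would start at 0x70000000 under the assumption stated above.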
RE: [PATCH] bookehv: Handle debug exception on guest exit
Hi Kumar/Benh,

After looking further into the code, I think that if we correct the vector range check below in the DebugDebug handler then we do not need the change I provided in this patch.

Here is the snapshot for 32 bit (head_booke.h; the same will be true for 64 bit):

#define DEBUG_DEBUG_EXCEPTION \
	START_EXCEPTION(DebugDebug); \
	DEBUG_EXCEPTION_PROLOG; \
	\
	/* \
	 * If there is a single step or branch-taken exception in an \
	 * exception entry sequence, it was probably meant to apply to \
	 * the code where the exception occurred (since exception entry \
	 * doesn't turn off DE automatically).  We simulate the effect \
	 * of turning off DE on entry to an exception handler by turning \
	 * off DE in the DSRR1 value and clearing the debug status. \
	 */ \
	mfspr	r10,SPRN_DBSR;		/* check single-step/branch taken */ \
	andis.	r10,r10,(DBSR_IC|DBSR_BT)@h; \
	beq+	2f; \
	\
	lis	r10,KERNELBASE@h;	/* check if exception in vectors */ \
	ori	r10,r10,KERNELBASE@l; \
	cmplw	r12,r10; \
	blt+	2f;			/* addr below exception vectors */ \
	\
	lis	r10,DebugDebug@h; \
	ori	r10,r10,DebugDebug@l; \

Here we assume that all exception vectors end at DebugDebug, which is not correct. We should probably get the proper end by using some start_vector and end_vector labels, or at least use the end at Ehvpriv (which is the last one defined in head_fsl_booke.S for PowerPC). Is that correct?

	cmplw	r12,r10; \
	bgt+	2f;			/* addr above exception vectors */ \

Thanks
-Bharat

-----Original Message-----
From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Bhushan Bharat-R65777
Sent: Thursday, April 04, 2013 8:29 PM
To: Alexander Graf
Cc: linuxppc-...@lists.ozlabs.org; kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421
Subject: RE: [PATCH] bookehv: Handle debug exception on guest exit

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, April 04, 2013 6:55 PM
To: Bhushan Bharat-R65777
Cc: linuxppc-...@lists.ozlabs.org; kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH] bookehv: Handle debug exception on guest exit

On 20.03.2013, at 18:45, Bharat Bhushan wrote:

EPCR.DUVD controls whether the debug events can come in hypervisor mode or not. When KVM guest is using the debug resource then we do not want debug events to be captured in guest entry/exit path. So we set EPCR.DUVD when entering and clears EPCR.DUVD when exiting from guest.

Debug instruction complete is a post-completion debug exception but debug event gets posted on the basis of MSR before the instruction is executed. Now if the instruction switches the context from guest mode (MSR.GS = 1) to hypervisor mode (MSR.GS = 0) then the xSRR0 points to first instruction of KVM handler and xSRR1 points that MSR.GS is clear (hypervisor context). Now as xSRR1.GS is used to decide whether KVM handler will be invoked to handle the exception or host host kernel debug handler will be invoked to handle the exception. This leads to host kernel debug handler handling the exception which should either be handled by KVM.

This is tested on e500mc in 32 bit mode

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v0:
 - Do not apply this change for debug_crit as we do not know those chips have issue or not.
 - corrected 64bit case branching

 arch/powerpc/kernel/exceptions-64e.S | 29 ++++++++++++++++++++++++++++-
 arch/powerpc/kernel/head_booke.h     | 26 ++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 4684e33..8b26294 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel
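The check under discussion in this message is a plain address-range test on the interrupted PC. As a rough illustration, assuming the vector area is delimited by two bounds (the parameter names vectors_start and vectors_end are hypothetical stand-ins for the start_vector/end_vector labels suggested in the message), the logic the assembly implements looks like this in C:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when the interrupted address lies inside the exception
 * vector area [vectors_start, vectors_end]. The bounds are parameters
 * here; in the kernel they would come from labels placed around the
 * vector code (hypothetical names, not existing symbols).
 */
static bool addr_in_vectors(uintptr_t addr, uintptr_t vectors_start,
			    uintptr_t vectors_end)
{
	if (addr < vectors_start)	/* blt+ 2f: addr below exception vectors */
		return false;
	if (addr > vectors_end)		/* bgt+ 2f: addr above exception vectors */
		return false;
	return true;
}
```

If the upper bound is a handler that is not the last one in the file, as with DebugDebug, an address in any later vector fails this test and the event is not treated as belonging to exception entry.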
RE: [PATCH] bookehv: Handle debug exception on guest exit
> -----Original Message-----
> From: Alexander Graf [mailto:ag...@suse.de]
> Sent: Thursday, April 04, 2013 6:55 PM
> To: Bhushan Bharat-R65777
> Cc: linuxppc-...@lists.ozlabs.org; kvm@vger.kernel.org; kvm-...@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
> Subject: Re: [PATCH] bookehv: Handle debug exception on guest exit
>
> On 20.03.2013, at 18:45, Bharat Bhushan wrote:
>
> > EPCR.DUVD controls whether the debug events can come in hypervisor mode or not. When KVM guest is using the debug resource then we do not want debug events to be captured in guest entry/exit path. So we set EPCR.DUVD when entering and clears EPCR.DUVD when exiting from guest.
> >
> > Debug instruction complete is a post-completion debug exception but debug event gets posted on the basis of MSR before the instruction is executed. Now if the instruction switches the context from guest mode (MSR.GS = 1) to hypervisor mode (MSR.GS = 0) then the xSRR0 points to first instruction of KVM handler and xSRR1 points that MSR.GS is clear (hypervisor context). Now as xSRR1.GS is used to decide whether KVM handler will be invoked to handle the exception or host host kernel debug handler will be invoked to handle the exception. This leads to host kernel debug handler handling the exception which should either be handled by KVM.
> >
> > This is tested on e500mc in 32 bit mode
> >
> > Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
> > ---
> > v0:
> >  - Do not apply this change for debug_crit as we do not know those chips have issue or not.
> >  - corrected 64bit case branching
> >
> >  arch/powerpc/kernel/exceptions-64e.S | 29 ++++++++++++++++++++++++++++-
> >  arch/powerpc/kernel/head_booke.h     | 26 ++++++++++++++++++++++++++
> >  2 files changed, 54 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
> > index 4684e33..8b26294 100644
> > --- a/arch/powerpc/kernel/exceptions-64e.S
> > +++ b/arch/powerpc/kernel/exceptions-64e.S
> > @@ -516,6 +516,33 @@ kernel_dbg_exc:
> >  	andis.	r15,r14,DBSR_IC@h
> >  	beq+	1f
> >
> > +#ifdef CONFIG_KVM_BOOKE_HV
> > +	/*
> > +	 * EPCR.DUVD controls whether the debug events can come in
> > +	 * hypervisor mode or not. When KVM guest is using the debug
> > +	 * resource then we do not want debug events to be captured
> > +	 * in guest entry/exit path. So we set EPCR.DUVD when entering
> > +	 * and clears EPCR.DUVD when exiting from guest.
> > +	 * Debug instruction complete is a post-completion debug
> > +	 * exception but debug event gets posted on the basis of MSR
> > +	 * before the instruction is executed. Now if the instruction
> > +	 * switches the context from guest mode (MSR.GS = 1) to hypervisor
> > +	 * mode (MSR.GS = 0) then the xSRR0 points to first instruction of
>
> Can't we just execute that code path with MSR.DE=0?

Single stepping uses DBCR0.IC (instruction complete). Can you describe how MSR.DE = 0 will work?

> Alex
>
> > +	 * KVM handler and xSRR1 points that MSR.GS is clear
> > +	 * (hypervisor context). Now as xSRR1.GS is used to decide whether
> > +	 * KVM handler will be invoked to handle the exception or host
> > +	 * host kernel debug handler will be invoked to handle the exception.
> > +	 * This leads to host kernel debug handler handling the exception
> > +	 * which should either be handled by KVM.
> > +	 */
> > +	mfspr	r10, SPRN_EPCR
> > +	andis.	r10,r10,SPRN_EPCR_DUVD@h
> > +	beq+	2f
> > +
> > +	andis.	r10,r9,MSR_GS@h
> > +	beq+	3f
> > +2:
> > +#endif
> >  	LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)
> >  	LOAD_REG_IMMEDIATE(r15,interrupt_end_book3e)
> >  	cmpld	cr0,r10,r14
> > @@ -523,7 +550,7 @@ kernel_dbg_exc:
> >  	blt+	cr0,1f
> >  	bge+	cr1,1f
> >
> > -	/* here it looks like we got an inappropriate debug exception. */
> > +3:	/* here it looks like we got an inappropriate debug exception. */
> >  	lis	r14,DBSR_IC@h		/* clear the IC event */
> >  	rlwinm	r11,r11,0,~MSR_DE	/* clear DE in the DSRR1 value */
> >  	mtspr	SPRN_DBSR,r14
> > diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
> > index 5f051ee..edc6a3b 100644
> > --- a/arch/powerpc/kernel/head_booke.h
> > +++ b/arch/powerpc/kernel/head_booke.h
> > @@ -285,7 +285,33 @@ label:
> >  	mfspr	r10,SPRN_DBSR;		/* check single-step/branch taken */ \
> >  	andis.	r10,r10,(DBSR_IC|DBSR_BT)@h; \
> >  	beq+	2f; \
> > +#ifdef CONFIG_KVM_BOOKE_HV \
> > +	/* \
> > +	 * EPCR.DUVD controls whether the debug events can come in \
> > +	 * hypervisor mode or not. When KVM guest is using the debug \
> > +	 * resource then we do not want debug events to be captured \
> > +	 * in guest entry/exit path. So we set EPCR.DUVD when entering \
> > +	 * and clears
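To make the control flow of the assembly added by this patch easier to follow, here is a small C rendering of the decision it appears to make in the 64-bit hunk. It is a sketch, not kernel code: the enum, the function name and the bit values are illustrative assumptions (the masks are meant to mirror SPRN_EPCR_DUVD and MSR_GS and should be checked against reg_booke.h), and the vector range test is reduced to a boolean argument.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative masks; assumed to mirror SPRN_EPCR_DUVD and MSR_GS. */
#define EPCR_DUVD	0x20000000u
#define MSR_GS_BIT	0x10000000u

enum dbg_action {
	DBG_HANDLE,	/* take the normal kernel debug path (label 1) */
	DBG_DISCARD,	/* clear DBSR[IC] and DE in the saved MSR (label 3) */
};

static enum dbg_action classify_ic_event(uint32_t epcr, uint32_t saved_msr,
					 bool pc_in_vectors)
{
	/*
	 * With DUVD set, debug events are not wanted in hypervisor mode.
	 * A saved MSR with GS clear then means the event was posted by
	 * the instruction that left the guest: treat it as inappropriate.
	 */
	if ((epcr & EPCR_DUVD) && !(saved_msr & MSR_GS_BIT))
		return DBG_DISCARD;

	/* Pre-existing check: the event hit inside the exception vectors. */
	if (pc_in_vectors)
		return DBG_DISCARD;

	return DBG_HANDLE;
}
```

Alex's question in the thread is whether the same effect could be had without this extra test, by running the guest exit path with MSR.DE clear.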
RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Tuesday, April 02, 2013 11:30 PM
> To: Bhushan Bharat-R65777
> Cc: Alexander Graf; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
> Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
>
> On 04/02/2013 09:09:34 AM, Bhushan Bharat-R65777 wrote:
> > > -----Original Message-----
> > > From: Alexander Graf [mailto:ag...@suse.de]
> > > Sent: Tuesday, April 02, 2013 1:57 PM
> > > To: Bhushan Bharat-R65777
> > > Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
> > > Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
> > >
> > > On 29.03.2013, at 07:04, Bhushan Bharat-R65777 wrote:
> > > > > -----Original Message-----
> > > > > From: Alexander Graf [mailto:ag...@suse.de]
> > > > > Sent: Thursday, March 28, 2013 10:06 PM
> > > > > To: Bhushan Bharat-R65777
> > > > > Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
> > > > > Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
> > > > >
> > > > > How does the normal debug register switching code work in Linux? Can't we just reuse that? Or rely on it to restore working state when another process gets scheduled in?
> > > >
> > > > Good point, I can see debug registers loading in function __switch_to()->switch_booke_debug_regs() in file arch/powerpc/kernel/process.c. So as long as we assume that the host will not use debug resources we can rely on this restore. But I am not sure that this is a fair assumption. As Scott mentioned earlier, someone can use debug resources for kernel debugging also.
> > >
> > > Someone in the kernel can also use floating point registers. But then it's his responsibility to clean up the mess he leaves behind.
> >
> > I am neither convinced by what you said nor do I have much reason to oppose :)
> >
> > Scott, I remember you mentioned that the host can use debug resources; can you comment on this?
>
> I thought the conclusion we reached was that it was OK as long as KVM waits until it actually needs the debug resources to mess with the registers.

Right. Are we also agreeing that KVM will not save/restore host debug context on vcpu_load/vcpu_put()? KVM will load its context in vcpu_load() if needed, and on vcpu_put() it will clear DBCR0 and DBSR.

Thanks
-Bharat
RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
> -----Original Message-----
> From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
> Sent: Wednesday, April 03, 2013 3:58 PM
> To: Bhushan Bharat-R65777
> Cc: Wood Scott-B07421; kvm-...@vger.kernel.org; kvm@vger.kernel.org
> Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
>
> On 03.04.2013, at 12:03, Bhushan Bharat-R65777 r65...@freescale.com wrote:
> > > -----Original Message-----
> > > From: Wood Scott-B07421
> > > Sent: Tuesday, April 02, 2013 11:30 PM
> > > To: Bhushan Bharat-R65777
> > > Cc: Alexander Graf; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
> > > Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
> > >
> > > On 04/02/2013 09:09:34 AM, Bhushan Bharat-R65777 wrote:
> > > > > -----Original Message-----
> > > > > From: Alexander Graf [mailto:ag...@suse.de]
> > > > > Sent: Tuesday, April 02, 2013 1:57 PM
> > > > > To: Bhushan Bharat-R65777
> > > > > Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
> > > > > Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
> > > > >
> > > > > On 29.03.2013, at 07:04, Bhushan Bharat-R65777 wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Alexander Graf [mailto:ag...@suse.de]
> > > > > > > Sent: Thursday, March 28, 2013 10:06 PM
> > > > > > > To: Bhushan Bharat-R65777
> > > > > > > Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
> > > > > > > Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
> > > > > > >
> > > > > > > How does the normal debug register switching code work in Linux? Can't we just reuse that? Or rely on it to restore working state when another process gets scheduled in?
> > > > > >
> > > > > > Good point, I can see debug registers loading in function __switch_to()->switch_booke_debug_regs() in file arch/powerpc/kernel/process.c. So as long as we assume that the host will not use debug resources we can rely on this restore. But I am not sure that this is a fair assumption. As Scott mentioned earlier, someone can use debug resources for kernel debugging also.
> > > > >
> > > > > Someone in the kernel can also use floating point registers. But then it's his responsibility to clean up the mess he leaves behind.
> > > >
> > > > I am neither convinced by what you said nor do I have much reason to oppose :)
> > > >
> > > > Scott, I remember you mentioned that the host can use debug resources; can you comment on this?
> > >
> > > I thought the conclusion we reached was that it was OK as long as KVM waits until it actually needs the debug resources to mess with the registers.
> >
> > Right. Are we also agreeing that KVM will not save/restore host debug context on vcpu_load/vcpu_put()? KVM will load its context in vcpu_load() if needed, and on vcpu_put() it will clear DBCR0 and DBSR.
>
> That depends on whether the kernel restores the debug registers. Please double-check that.

Currently the kernel code restores the debug state of the newly scheduled process in context_switch(). switch_booke_debug_regs() is called from __switch_to() and is defined as:

/*
 * Unless neither the old or new thread are making use of the
 * debug registers, set the debug registers from the values
 * stored in the new thread.
 */
static void switch_booke_debug_regs(struct thread_struct *new_thread)
{
	if ((current->thread.dbcr0 & DBCR0_IDM)
		|| (new_thread->dbcr0 & DBCR0_IDM))
			prime_debug_regs(new_thread);
}

static void prime_debug_regs(struct thread_struct *thread)
{
	mtspr(SPRN_IAC1, thread->iac1);
	mtspr(SPRN_IAC2, thread->iac2);
#if CONFIG_PPC_ADV_DEBUG_IACS > 2
	mtspr(SPRN_IAC3, thread->iac3);
	mtspr(SPRN_IAC4, thread->iac4);
#endif
	mtspr(SPRN_DAC1, thread->dac1);
	mtspr(SPRN_DAC2, thread->dac2);
#if CONFIG_PPC_ADV_DEBUG_DVCS > 0
	mtspr(SPRN_DVC1, thread->dvc1);
	mtspr(SPRN_DVC2, thread->dvc2);
#endif
	mtspr(SPRN_DBCR0, thread->dbcr0);
	mtspr(SPRN_DBCR1, thread->dbcr1);
#ifdef CONFIG_BOOKE
	mtspr(SPRN_DBCR2, thread->dbcr2);
#endif
}

This is analogous to moving from guest to/from QEMU, so we can make prime_debug_regs() available to KVM code for heavyweight_exit. And vcpu_load() will load guest state and save host state (update thread->debug_registers).

And the kernel exception handling code clears DBSR and loads DBCR0 with 0 (the global_dbcr0 variable, which is zero) in transfer_to_handler in entry_32.S. This is analogous to switching from KVM to kernel. But I do not see the same (clearing DBCR0 and DBSR) in the 64-bit exception handler. Is this a problem, or am I missing something?

Thanks
-Bharat

> Also, someone could want to gdb QEMU, so the debug registers might have to get restored on a heavy weight exit. I'd hope Linux just provides helpers to restore a process's debug state that we can call here.
>
> Alex
>
> > Thanks
> > -Bharat
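The lazy switching rule quoted in this message (touch the debug registers only if the outgoing or the incoming thread has DBCR0[IDM] set) can be tried out in user space. The sketch below is a simulation under stated assumptions: struct dbg_state, the hw variable and the two functions are hypothetical stand-ins for thread_struct, the SPRs and the kernel helpers, and the DBCR0_IDM value is only illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

#define DBCR0_IDM 0x40000000u	/* internal debug mode; illustrative value */

/* Hypothetical, reduced stand-in for the debug part of thread_struct. */
struct dbg_state {
	uint32_t dbcr0;
	uint32_t iac1;
	uint32_t dac1;
};

static struct dbg_state hw;	/* simulated hardware debug registers */

/* Simulated prime_debug_regs(): copy the thread's values into "hardware". */
static void prime_debug_regs_sim(const struct dbg_state *thread)
{
	hw = *thread;
}

/*
 * Simulated switch_booke_debug_regs(): skip the register writes unless
 * the outgoing (prev) or incoming (next) thread uses the debug facility.
 * Returns true if the registers were rewritten.
 */
static bool switch_debug_regs_sim(const struct dbg_state *prev,
				  const struct dbg_state *next)
{
	if ((prev->dbcr0 & DBCR0_IDM) || (next->dbcr0 & DBCR0_IDM)) {
		prime_debug_regs_sim(next);
		return true;
	}
	return false;
}
```

In this model, switching away from a debug user rewrites the registers with the next thread's values only because the outgoing thread's recorded dbcr0 has IDM set. That suggests, without proving anything about the real kernel paths, why the thread discusses keeping thread->debug_registers up to date when KVM changes the hardware state.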
RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Wednesday, April 03, 2013 7:39 PM
To: Bhushan Bharat-R65777
Cc: Wood Scott-B07421; kvm-...@vger.kernel.org; kvm@vger.kernel.org
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

On 03.04.2013, at 15:50, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
Sent: Wednesday, April 03, 2013 3:58 PM
To: Bhushan Bharat-R65777
Cc: Wood Scott-B07421; kvm-...@vger.kernel.org; kvm@vger.kernel.org
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

On 03.04.2013, at 12:03, Bhushan Bharat-R65777 <r65...@freescale.com> wrote:

-----Original Message-----
From: Wood Scott-B07421
Sent: Tuesday, April 02, 2013 11:30 PM
To: Bhushan Bharat-R65777
Cc: Alexander Graf; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

On 04/02/2013 09:09:34 AM, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Tuesday, April 02, 2013 1:57 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

On 29.03.2013, at 07:04, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, March 28, 2013 10:06 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

How does the normal debug register switching code work in Linux? Can't we just reuse that? Or rely on it to restore working state when another process gets scheduled in?

Good point, I can see debug registers loading in function __switch_to() -> switch_booke_debug_regs() in file arch/powerpc/kernel/process.c. So as long as we assume that the host will not use debug resources, we can rely on this restore. But I am not sure that this is a fair assumption. As Scott mentioned earlier, someone can also use debug resources for kernel debugging.

Someone in the kernel can also use floating point registers. But then it's his responsibility to clean up the mess he leaves behind.

I am neither convinced by what you said nor do I have much reason to oppose :)
Scott, I remember you mentioned that the host can use debug resources; can you comment on this?

I thought the conclusion we reached was that it was OK as long as KVM waits until it actually needs the debug resources to mess with the registers.

Right. Are we also agreeing that KVM will not save/restore host debug context on vcpu_load()/vcpu_put()? KVM will load its context in vcpu_load() if needed, and on vcpu_put() it will clear DBCR0 and DBSR.

That depends on whether the kernel restores the debug registers. Please double-check that.

Currently the kernel code restores the debug state of the newly scheduled process in context_switch(). switch_booke_debug_regs() is called from __switch_to() and is defined as:

/*
 * Unless neither the old or new thread are making use of the
 * debug registers, set the debug registers from the values
 * stored in the new thread.
 */
static void switch_booke_debug_regs(struct thread_struct *new_thread)
{
	if ((current->thread.dbcr0 & DBCR0_IDM)
		|| (new_thread->dbcr0 & DBCR0_IDM))
			prime_debug_regs(new_thread);
}

static void prime_debug_regs(struct thread_struct *thread)
{
	mtspr(SPRN_IAC1, thread->iac1);
	mtspr(SPRN_IAC2, thread->iac2);
#if CONFIG_PPC_ADV_DEBUG_IACS > 2
	mtspr(SPRN_IAC3, thread->iac3);
	mtspr(SPRN_IAC4, thread->iac4);
#endif
	mtspr(SPRN_DAC1, thread->dac1);
	mtspr(SPRN_DAC2, thread->dac2);
#if CONFIG_PPC_ADV_DEBUG_DVCS > 0
	mtspr(SPRN_DVC1, thread->dvc1);
	mtspr(SPRN_DVC2, thread->dvc2);
#endif
	mtspr(SPRN_DBCR0, thread->dbcr0);
	mtspr(SPRN_DBCR1, thread->dbcr1);
#ifdef CONFIG_BOOKE
	mtspr(SPRN_DBCR2, thread->dbcr2);
#endif
}

This is analogous to moving from guest to/from QEMU, so we can make prime_debug_regs() available to KVM code for heavyweight_exit. And vcpu_load() will load guest state and save host state (update thread->debug_registers).

I don't think we need to do anything on vcpu_load if we just swap the thread->debug_registers. Just make sure to restore them before you return from a heavyweight exit.

My understanding is:
1) When VCPU is running -> h/w debug registers have vcpu->arch.debug_registers
   Goes for heavyweight_exit -> h/w debug registers are loaded with thread->debug_registers
   Return from heavyweight_exit
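The lazy switching rule quoted above (reload the debug registers only when the outgoing or the incoming thread has DBCR0[IDM] set) can be modelled in plain user-space C. This is only a sketch under stated assumptions: the `model_*` names, the `MODEL_DBCR0_IDM` value and the in-memory `model_hw` stand-in for the SPRs are hypothetical and are not kernel API; the real code uses mtspr() on a thread_struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: placeholder for the kernel's DBCR0_IDM (internal debug mode) bit. */
#define MODEL_DBCR0_IDM 0x40000000u

/* Hypothetical, reduced view of the per-thread debug state. */
struct model_debug_regs {
	uint32_t iac1, iac2;
	uint32_t dac1, dac2;
	uint32_t dbcr0, dbcr1, dbcr2;
};

/* Stand-in for the hardware SPRs; the kernel would write them with mtspr(). */
static struct model_debug_regs model_hw;
static unsigned int model_hw_loads;

/* Unconditionally load one thread's debug state into the "hardware". */
static void model_prime_debug_regs(const struct model_debug_regs *regs)
{
	assert(regs != NULL);
	model_hw = *regs;
	model_hw_loads++;
}

/*
 * Reload only if the outgoing or the incoming thread uses debug
 * resources. Returns true when a reload happened.
 */
static bool model_switch_debug_regs(const struct model_debug_regs *prev,
				    const struct model_debug_regs *next)
{
	assert(prev != NULL && next != NULL);
	if ((prev->dbcr0 & MODEL_DBCR0_IDM) || (next->dbcr0 & MODEL_DBCR0_IDM)) {
		model_prime_debug_regs(next);
		return true;
	}
	return false;
}
```

Switching between two threads that do not use debug resources leaves the modelled hardware untouched, which is the property the thread relies on when it proposes reloading thread state only on a heavyweight exit.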
RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Tuesday, April 02, 2013 9:11 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

On 04/02/2013 04:09 PM, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Tuesday, April 02, 2013 1:57 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

On 29.03.2013, at 07:04, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, March 28, 2013 10:06 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

On 21.03.2013, at 07:25, Bharat Bhushan wrote:

From: Bharat Bhushan <bharat.bhus...@freescale.com>

This patch adds the debug stub support on booke/bookehv. Now QEMU debug stub can use hw breakpoint, watchpoint and software breakpoint to debug guest.

Debug registers are saved/restored on vcpu_put()/vcpu_get(). Also the debug registers are saved restored only if guest is using debug resources.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
v2:
 - save/restore in vcpu_get()/vcpu_put()
 - some more minor cleanup based on review comments.

 arch/powerpc/include/asm/kvm_host.h |  10 ++
 arch/powerpc/include/uapi/asm/kvm.h |  22 +++-
 arch/powerpc/kvm/booke.c            | 252 ---
 arch/powerpc/kvm/e500_emulate.c     |  10 ++
 4 files changed, 272 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index f4ba881..8571952 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -504,7 +504,17 @@ struct kvm_vcpu_arch {
 	u32 mmucfg;
 	u32 epr;
 	u32 crit_save;
+	/* guest debug registers*/
 	struct kvmppc_booke_debug_reg dbg_reg;
+	/* shadow debug registers */
+	struct kvmppc_booke_debug_reg shadow_dbg_reg;
+	/* host debug registers*/
+	struct kvmppc_booke_debug_reg host_dbg_reg;
+	/*
+	 * Flag indicating that debug registers are used by guest
+	 * and requires save restore.
+	 */
+	bool debug_save_restore;
 #endif
 	gpa_t paddr_accessed;
 	gva_t vaddr_accessed;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 15f9a00..d7ce449 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -25,6 +25,7 @@
 /* Select powerpc specific features in <linux/kvm.h> */
 #define __KVM_HAVE_SPAPR_TCE
 #define __KVM_HAVE_PPC_SMT
+#define __KVM_HAVE_GUEST_DEBUG

 struct kvm_regs {
 	__u64 pc;
@@ -267,7 +268,24 @@ struct kvm_fpu {
 	__u64 fpr[32];
 };

+/*
+ * Defines for h/w breakpoint, watchpoint (read, write or both) and
+ * software breakpoint.
+ * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
+ * for KVM_DEBUG_EXIT.
+ */
+#define KVMPPC_DEBUG_NONE		0x0
+#define KVMPPC_DEBUG_BREAKPOINT		(1UL << 1)
+#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
+#define KVMPPC_DEBUG_WATCH_READ		(1UL << 3)
 struct kvm_debug_exit_arch {
+	__u64 address;
+	/*
+	 * exiting to userspace because of h/w breakpoint, watchpoint
+	 * (read, write or both) and software breakpoint.
+	 */
+	__u32 status;
+	__u32 reserved;
 };

 /* for KVM_SET_GUEST_DEBUG */
@@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
 		 * Type denotes h/w breakpoint, read watchpoint, write
 		 * watchpoint or watchpoint (both read and write).
 		 */
-#define KVMPPC_DEBUG_NOTYPE		0x0
-#define KVMPPC_DEBUG_BREAKPOINT		(1UL << 1)
-#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
-#define KVMPPC_DEBUG_WATCH_READ		(1UL << 3)
 		__u32 type;
 		__u32 reserved;
 	} bp[16];
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1de93a8..bf20056 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -133,6 +133,30 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
 #endif
 }

+static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
+{
+	/* Synchronize guest's desire to get debug interrupts into shadow MSR */
+#ifndef CONFIG_KVM_BOOKE_HV
+	vcpu->arch.shadow_msr &= ~MSR_DE;
+	vcpu->arch.shadow_msr |= vcpu->arch.shared->msr
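The status field of kvm_debug_exit_arch reuses the KVMPPC_DEBUG_* bits from the patch, so userspace can tell a breakpoint from a read, write or read/write watchpoint with simple mask tests. The sketch below is a user-space illustration only: it assumes the operator stripped from the archived defines was `<<`, mirrors the struct layout from the diff under the hypothetical name `model_debug_exit`, and `model_debug_exit_kind()` is an invented helper, not part of KVM or QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values as quoted in the patch, assuming the stripped operator was "<<". */
#define KVMPPC_DEBUG_NONE        0x0
#define KVMPPC_DEBUG_BREAKPOINT  (1UL << 1)
#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2)
#define KVMPPC_DEBUG_WATCH_READ  (1UL << 3)

/* Hypothetical mirror of struct kvm_debug_exit_arch from the diff. */
struct model_debug_exit {
	uint64_t address;
	uint32_t status;
	uint32_t reserved;
};

/* Map the status bits to a short description (illustrative helper). */
static const char *model_debug_exit_kind(const struct model_debug_exit *e)
{
	assert(e != NULL);
	bool wr = (e->status & KVMPPC_DEBUG_WATCH_WRITE) != 0;
	bool rd = (e->status & KVMPPC_DEBUG_WATCH_READ) != 0;

	if (e->status & KVMPPC_DEBUG_BREAKPOINT)
		return "breakpoint";
	if (rd && wr)
		return "watchpoint (read/write)";
	if (wr)
		return "watchpoint (write)";
	if (rd)
		return "watchpoint (read)";
	return "none";
}
```

Keeping one set of defines for both the ioctl type and the exit status, as the patch does by moving them out of struct kvm_guest_debug_arch, means the same mask tests work in both directions.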
RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
+	dbg_reg = &(vcpu->arch.shadow_dbg_reg);
+
+	/*
+	 * On BOOKE (e500v2); Set DBCR1 and DBCR2 to allow debug events
+	 * to occur when MSR.PR is set.
+	 * On BOOKE-HV (e500mc+); MSR.PR = 0 when guest is running. So we
+	 * should clear DBCR1 and DBCR2.
+	 */
+#ifdef CONFIG_KVM_BOOKE_HV
+	dbg_reg->dbcr1 = 0;
+	dbg_reg->dbcr2 = 0;

Does that mean we can't debug guest user space?

Yes

This is wrong.

Really? So far I have been assuming that the QEMU debug stub is not meant for debugging guest applications.

Ok, let me rephrase: This is confusing. You do trap in PR mode on e500v2. IIRC x86 also traps in kernel and user space. I don't see why e500 hv should be different.

I am sorry, I think I did not read the document correctly. DBCR1 = 0 means 00: IAC1 debug conditions unaffected by MSR[PR], MSR[GS]. Similarly for DBCR2. So yes, the guest user space can be debugged.

So why is this conditional on BOOKE_HV then? Wouldn't it make things easier to treat HV and PR identical?

On BOOKE-HV we have to keep these at 0 so that both the guest and guest applications can be debugged. Also on HV we have EPCR.DUVD to ensure that debug events do not occur in the hypervisor (GS = 0).

On BOOKE, the guest and guest applications both run with PR = 1 and the hypervisor with PR = 0. So with dbcr1/dbcr2 on BOOKE we keep debug exceptions from occurring in hypervisor mode while still allowing debugging of the guest and its applications.

Thanks
-Bharat

--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
-----Original Message-----
From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf Of Alexander Graf
Sent: Wednesday, April 03, 2013 11:26 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

On 03.04.2013, at 19:47, Bhushan Bharat-R65777 <r65...@freescale.com> wrote:

+	dbg_reg = &(vcpu->arch.shadow_dbg_reg);
+
+	/*
+	 * On BOOKE (e500v2); Set DBCR1 and DBCR2 to allow debug events
+	 * to occur when MSR.PR is set.
+	 * On BOOKE-HV (e500mc+); MSR.PR = 0 when guest is running. So we
+	 * should clear DBCR1 and DBCR2.
+	 */
+#ifdef CONFIG_KVM_BOOKE_HV
+	dbg_reg->dbcr1 = 0;
+	dbg_reg->dbcr2 = 0;

Does that mean we can't debug guest user space?

Yes

This is wrong.

Really? So far I have been assuming that the QEMU debug stub is not meant for debugging guest applications.

Ok, let me rephrase: This is confusing. You do trap in PR mode on e500v2. IIRC x86 also traps in kernel and user space. I don't see why e500 hv should be different.

I am sorry, I think I did not read the document correctly. DBCR1 = 0 means 00: IAC1 debug conditions unaffected by MSR[PR], MSR[GS]. Similarly for DBCR2. So yes, the guest user space can be debugged.

So why is this conditional on BOOKE_HV then? Wouldn't it make things easier to treat HV and PR identical?

On BOOKE-HV we have to keep these at 0 so that both the guest and guest applications can be debugged. Also on HV we have EPCR.DUVD to ensure that debug events do not occur in the hypervisor (GS = 0).

On BOOKE, the guest and guest applications both run with PR = 1 and the hypervisor with PR = 0. So with dbcr1/dbcr2 on BOOKE we keep debug exceptions from occurring in hypervisor mode while still allowing debugging of the guest and its applications.

Ah, can we group these 2 overrides next to each other with an #ifdef ... #else to make this obvious from the code?

I will try :)

Thanks
-Bharat
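The grouping Alexander asks for could look roughly like the sketch below, with both overrides side by side under one #ifdef ... #else. It is a standalone model, not the code that was eventually merged: `MODEL_KVM_BOOKE_HV` stands in for CONFIG_KVM_BOOKE_HV, `struct model_dbg_reg` is a hypothetical reduction of kvmppc_booke_debug_reg, and the two `MODEL_*_USER_ONLY` masks are placeholder values for the "only when MSR.PR is set" field encodings, which should be taken from the core reference manual rather than from here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: placeholder masks for the "PR = 1 only" mode fields. */
#define MODEL_DBCR1_IAC_USER_ONLY 0xC0000000u
#define MODEL_DBCR2_DAC_USER_ONLY 0xC0000000u

/* Hypothetical reduction of the shadow debug register set. */
struct model_dbg_reg {
	uint32_t dbcr1;
	uint32_t dbcr2;
};

/* Apply the mode overrides for the shadow copy the guest runs with. */
static void model_set_mode_overrides(struct model_dbg_reg *dbg_reg)
{
	assert(dbg_reg != NULL);
#ifdef MODEL_KVM_BOOKE_HV
	/*
	 * HV: 0 means debug conditions are unaffected by MSR[PR] and
	 * MSR[GS], so guest kernel and guest user space can both be
	 * debugged; EPCR.DUVD keeps events out of the hypervisor.
	 */
	dbg_reg->dbcr1 = 0;
	dbg_reg->dbcr2 = 0;
#else
	/*
	 * PR: guest and its applications run with MSR.PR = 1 and the
	 * hypervisor with MSR.PR = 0, so restrict events to PR = 1.
	 */
	dbg_reg->dbcr1 |= MODEL_DBCR1_IAC_USER_ONLY;
	dbg_reg->dbcr2 |= MODEL_DBCR2_DAC_USER_ONLY;
#endif
}
```

Putting the two branches next to each other makes it visible that the two configurations reach the same goal (no debug events in the hypervisor, full debugging of the guest) by different means.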
RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
-----Original Message-----
From: Wood Scott-B07421
Sent: Tuesday, April 02, 2013 11:30 PM
To: Bhushan Bharat-R65777
Cc: Alexander Graf; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

On 04/02/2013 09:09:34 AM, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Tuesday, April 02, 2013 1:57 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

On 29.03.2013, at 07:04, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, March 28, 2013 10:06 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

How does the normal debug register switching code work in Linux? Can't we just reuse that? Or rely on it to restore working state when another process gets scheduled in?

Good point, I can see debug registers loading in function __switch_to() -> switch_booke_debug_regs() in file arch/powerpc/kernel/process.c. So as long as we assume that the host will not use debug resources, we can rely on this restore. But I am not sure that this is a fair assumption. As Scott mentioned earlier, someone can also use debug resources for kernel debugging.

Someone in the kernel can also use floating point registers. But then it's his responsibility to clean up the mess he leaves behind.

I am neither convinced by what you said nor do I have much reason to oppose :)
Scott, I remember you mentioned that the host can use debug resources; can you comment on this?

I thought the conclusion we reached was that it was OK as long as KVM waits until it actually needs the debug resources to mess with the registers.

Right. Are we also agreeing that KVM will not save/restore host debug context on vcpu_load()/vcpu_put()? KVM will load its context in vcpu_load() if needed, and on vcpu_put() it will clear DBCR0 and DBSR.

Thanks
-Bharat
RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Tuesday, April 02, 2013 1:57 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

On 29.03.2013, at 07:04, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, March 28, 2013 10:06 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

On 21.03.2013, at 07:25, Bharat Bhushan wrote:

From: Bharat Bhushan <bharat.bhus...@freescale.com>

This patch adds the debug stub support on booke/bookehv. Now QEMU debug stub can use hw breakpoint, watchpoint and software breakpoint to debug guest.

Debug registers are saved/restored on vcpu_put()/vcpu_get(). Also the debug registers are saved restored only if guest is using debug resources.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
v2:
 - save/restore in vcpu_get()/vcpu_put()
 - some more minor cleanup based on review comments.

 arch/powerpc/include/asm/kvm_host.h |  10 ++
 arch/powerpc/include/uapi/asm/kvm.h |  22 +++-
 arch/powerpc/kvm/booke.c            | 252 ---
 arch/powerpc/kvm/e500_emulate.c     |  10 ++
 4 files changed, 272 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index f4ba881..8571952 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -504,7 +504,17 @@ struct kvm_vcpu_arch {
 	u32 mmucfg;
 	u32 epr;
 	u32 crit_save;
+	/* guest debug registers*/
 	struct kvmppc_booke_debug_reg dbg_reg;
+	/* shadow debug registers */
+	struct kvmppc_booke_debug_reg shadow_dbg_reg;
+	/* host debug registers*/
+	struct kvmppc_booke_debug_reg host_dbg_reg;
+	/*
+	 * Flag indicating that debug registers are used by guest
+	 * and requires save restore.
+	 */
+	bool debug_save_restore;
 #endif
 	gpa_t paddr_accessed;
 	gva_t vaddr_accessed;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 15f9a00..d7ce449 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -25,6 +25,7 @@
 /* Select powerpc specific features in <linux/kvm.h> */
 #define __KVM_HAVE_SPAPR_TCE
 #define __KVM_HAVE_PPC_SMT
+#define __KVM_HAVE_GUEST_DEBUG

 struct kvm_regs {
 	__u64 pc;
@@ -267,7 +268,24 @@ struct kvm_fpu {
 	__u64 fpr[32];
 };

+/*
+ * Defines for h/w breakpoint, watchpoint (read, write or both) and
+ * software breakpoint.
+ * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
+ * for KVM_DEBUG_EXIT.
+ */
+#define KVMPPC_DEBUG_NONE		0x0
+#define KVMPPC_DEBUG_BREAKPOINT		(1UL << 1)
+#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
+#define KVMPPC_DEBUG_WATCH_READ		(1UL << 3)
 struct kvm_debug_exit_arch {
+	__u64 address;
+	/*
+	 * exiting to userspace because of h/w breakpoint, watchpoint
+	 * (read, write or both) and software breakpoint.
+	 */
+	__u32 status;
+	__u32 reserved;
 };

 /* for KVM_SET_GUEST_DEBUG */
@@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
 		 * Type denotes h/w breakpoint, read watchpoint, write
 		 * watchpoint or watchpoint (both read and write).
 		 */
-#define KVMPPC_DEBUG_NOTYPE		0x0
-#define KVMPPC_DEBUG_BREAKPOINT		(1UL << 1)
-#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
-#define KVMPPC_DEBUG_WATCH_READ		(1UL << 3)
 		__u32 type;
 		__u32 reserved;
 	} bp[16];
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1de93a8..bf20056 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -133,6 +133,30 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
 #endif
 }

+static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
+{
+	/* Synchronize guest's desire to get debug interrupts into shadow MSR */
+#ifndef CONFIG_KVM_BOOKE_HV
+	vcpu->arch.shadow_msr &= ~MSR_DE;
+	vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
+#endif
+
+	/* Force enable debug interrupts when user space wants to debug */
+	if (vcpu->guest_debug) {
+#ifdef CONFIG_KVM_BOOKE_HV
+		/*
+		 * Since there is no shadow MSR, sync MSR_DE into the guest
+		 * visible MSR. Do not allow guest to change MSR[DE].
+		 */
+		vcpu->arch.shared->msr |= MSR_DE;
+		mtspr(SPRN_MSRP, mfspr(SPRN_MSRP) | MSRP_DEP);

This mtspr should really just be a bit OR into shadow_msrp when guest_debug gets enabled. It should automatically get synchronized as soon as the next
RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, March 28, 2013 10:06 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

On 21.03.2013, at 07:25, Bharat Bhushan wrote:

From: Bharat Bhushan <bharat.bhus...@freescale.com>

This patch adds the debug stub support on booke/bookehv. Now the QEMU debug stub can use hardware breakpoints, watchpoints and software breakpoints to debug the guest.

Debug registers are saved/restored on vcpu_put()/vcpu_get(). The debug registers are saved and restored only if the guest is using debug resources.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
v2:
 - save/restore in vcpu_get()/vcpu_put()
 - some more minor cleanup based on review comments.

 arch/powerpc/include/asm/kvm_host.h | 10 ++
 arch/powerpc/include/uapi/asm/kvm.h | 22 +++-
 arch/powerpc/kvm/booke.c            | 252 ---
 arch/powerpc/kvm/e500_emulate.c     | 10 ++
 4 files changed, 272 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index f4ba881..8571952 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -504,7 +504,17 @@ struct kvm_vcpu_arch {
     u32 mmucfg;
     u32 epr;
     u32 crit_save;
+    /* guest debug registers */
     struct kvmppc_booke_debug_reg dbg_reg;
+    /* shadow debug registers */
+    struct kvmppc_booke_debug_reg shadow_dbg_reg;
+    /* host debug registers */
+    struct kvmppc_booke_debug_reg host_dbg_reg;
+    /*
+     * Flag indicating that debug registers are used by guest
+     * and requires save restore.
+     */
+    bool debug_save_restore;
 #endif
     gpa_t paddr_accessed;
     gva_t vaddr_accessed;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 15f9a00..d7ce449 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -25,6 +25,7 @@
 /* Select powerpc specific features in <linux/kvm.h> */
 #define __KVM_HAVE_SPAPR_TCE
 #define __KVM_HAVE_PPC_SMT
+#define __KVM_HAVE_GUEST_DEBUG

 struct kvm_regs {
     __u64 pc;
@@ -267,7 +268,24 @@ struct kvm_fpu {
     __u64 fpr[32];
 };

+/*
+ * Defines for h/w breakpoint, watchpoint (read, write or both) and
+ * software breakpoint.
+ * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
+ * for KVM_DEBUG_EXIT.
+ */
+#define KVMPPC_DEBUG_NONE        0x0
+#define KVMPPC_DEBUG_BREAKPOINT  (1UL << 1)
+#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2)
+#define KVMPPC_DEBUG_WATCH_READ  (1UL << 3)
 struct kvm_debug_exit_arch {
+    __u64 address;
+    /*
+     * exiting to userspace because of h/w breakpoint, watchpoint
+     * (read, write or both) and software breakpoint.
+     */
+    __u32 status;
+    __u32 reserved;
 };

 /* for KVM_SET_GUEST_DEBUG */
@@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
      * Type denotes h/w breakpoint, read watchpoint, write
      * watchpoint or watchpoint (both read and write).
      */
-#define KVMPPC_DEBUG_NOTYPE      0x0
-#define KVMPPC_DEBUG_BREAKPOINT  (1UL << 1)
-#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2)
-#define KVMPPC_DEBUG_WATCH_READ  (1UL << 3)
     __u32 type;
     __u32 reserved;
 } bp[16];
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1de93a8..bf20056 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -133,6 +133,30 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
 #endif
 }

+static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
+{
+    /* Synchronize guest's desire to get debug interrupts into shadow MSR */
+#ifndef CONFIG_KVM_BOOKE_HV
+    vcpu->arch.shadow_msr &= ~MSR_DE;
+    vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
+#endif
+
+    /* Force enable debug interrupts when user space wants to debug */
+    if (vcpu->guest_debug) {
+#ifdef CONFIG_KVM_BOOKE_HV
+        /*
+         * Since there is no shadow MSR, sync MSR_DE into the guest
+         * visible MSR. Do not allow guest to change MSR[DE].
+         */
+        vcpu->arch.shared->msr |= MSR_DE;
+        mtspr(SPRN_MSRP, mfspr(SPRN_MSRP) | MSRP_DEP);

This mtspr should really just be a bit OR into shadow_msrp when guest_debug gets enabled. It should automatically get synchronized as soon as the next vcpu_load() happens.

I think this is not required here, as shadow_dbsr already has MSRP_DEP set. I will set up shadow_msrp when setting guest_debug and clear shadow_msrp when guest_debug is cleared. But that will also not be sufficient as it not sure when vcpu_load
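As an illustration of the status flags defined in the patch above, here is a minimal userspace-side sketch of how a debug exit might be classified. The flag values are copied from the quoted diff; the struct name `toy_debug_exit` and the helper `toy_describe_debug_exit()` are hypothetical, and treating a zero status as "no hardware debug event" is an assumption, not something the thread states.

```c
#include <stdint.h>

/* Flag values as quoted in the patch. */
#define KVMPPC_DEBUG_NONE        0x0
#define KVMPPC_DEBUG_BREAKPOINT  (1UL << 1)
#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2)
#define KVMPPC_DEBUG_WATCH_READ  (1UL << 3)

/* Hypothetical mirror of struct kvm_debug_exit_arch from the patch. */
struct toy_debug_exit {
    uint64_t address;
    uint32_t status;
    uint32_t reserved;
};

/* Map the status bits to a short description (illustrative only). */
static const char *toy_describe_debug_exit(const struct toy_debug_exit *e)
{
    uint32_t watch = e->status &
        (uint32_t)(KVMPPC_DEBUG_WATCH_WRITE | KVMPPC_DEBUG_WATCH_READ);

    if (e->status & KVMPPC_DEBUG_BREAKPOINT)
        return "hardware breakpoint";
    if (watch == (KVMPPC_DEBUG_WATCH_WRITE | KVMPPC_DEBUG_WATCH_READ))
        return "read/write watchpoint";
    if (watch == KVMPPC_DEBUG_WATCH_WRITE)
        return "write watchpoint";
    if (watch == KVMPPC_DEBUG_WATCH_READ)
        return "read watchpoint";
    return "no hardware debug event";
}
```

A debugger stub would presumably combine such a classification with the reported address to decide which breakpoint or watchpoint fired.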
RE: [PATCH 2/4 v2] KVM: PPC: debug stub interface parameter defined
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Friday, March 29, 2013 7:26 AM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/4 v2] KVM: PPC: debug stub interface parameter defined

On 21.03.2013, at 07:24, Bharat Bhushan wrote:

From: Bharat Bhushan <bharat.bhus...@freescale.com>

This patch defines the interface parameter for KVM_SET_GUEST_DEBUG ioctl support. Follow-up patches will use this for setting up hardware breakpoints, watchpoints and software breakpoints.

Also, kvm_arch_vcpu_ioctl_set_guest_debug() is brought one level below. This is because I am not sure what is required for book3s, so this ioctl's behaviour will not change for book3s.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
v2:
 - No Change

 arch/powerpc/include/uapi/asm/kvm.h | 23 +++
 arch/powerpc/kvm/book3s.c           |  6 ++
 arch/powerpc/kvm/booke.c            |  6 ++
 arch/powerpc/kvm/powerpc.c          |  6 --
 4 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index c2ff99c..15f9a00 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -272,8 +272,31 @@ struct kvm_debug_exit_arch {
 /* for KVM_SET_GUEST_DEBUG */
 struct kvm_guest_debug_arch {
+    struct {
+        /* H/W breakpoint/watchpoint address */
+        __u64 addr;
+        /*
+         * Type denotes h/w breakpoint, read watchpoint, write
+         * watchpoint or watchpoint (both read and write).
+         */
+#define KVMPPC_DEBUG_NOTYPE      0x0
+#define KVMPPC_DEBUG_BREAKPOINT  (1UL << 1)
+#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2)
+#define KVMPPC_DEBUG_WATCH_READ  (1UL << 3)

Are you sure you want to introduce these here, just to remove them again in a later patch?

Up to this patch, the scope of these defines was limited to this structure, so for clarity I defined them here. Later the scope expands, so they are moved out of the structure. I do not think this really matters; let me know how you would like to see it.

-Bharat

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
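To make the per-slot layout of the proposed interface more concrete, the following sketch fills entries of a 16-slot breakpoint table shaped like the `bp[16]` array in the patch. It is a standalone illustration: `toy_guest_debug_arch` and `toy_add_bp()` are hypothetical names, and the "first free slot" policy is an assumption rather than anything the patch or QEMU is stated to do.

```c
#include <stddef.h>
#include <stdint.h>

/* Type values as quoted in the patch. */
#define KVMPPC_DEBUG_NOTYPE      0x0
#define KVMPPC_DEBUG_BREAKPOINT  (1UL << 1)
#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2)
#define KVMPPC_DEBUG_WATCH_READ  (1UL << 3)

#define TOY_MAX_BP 16

/* Hypothetical mirror of the structure added by the patch. */
struct toy_guest_debug_arch {
    struct {
        uint64_t addr;      /* h/w breakpoint/watchpoint address */
        uint32_t type;      /* one or more KVMPPC_DEBUG_* bits */
        uint32_t reserved;
    } bp[TOY_MAX_BP];
};

/*
 * Record a breakpoint or watchpoint in the first unused slot.
 * Returns the slot index, or -1 if the request is invalid or the table is full.
 */
static int toy_add_bp(struct toy_guest_debug_arch *dbg, uint64_t addr,
                      uint32_t type)
{
    size_t i;

    if (type == KVMPPC_DEBUG_NOTYPE)
        return -1;

    for (i = 0; i < TOY_MAX_BP; i++) {
        if (dbg->bp[i].type == KVMPPC_DEBUG_NOTYPE) {
            dbg->bp[i].addr = addr;
            dbg->bp[i].type = type;
            return (int)i;
        }
    }
    return -1;
}
```

A read/write watchpoint would be requested by passing both watch bits ORed together as the type.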
RE: [PATCH 6/7] Rename EMULATE_DO_PAPR to EMULATE_EXIT_USER
-----Original Message-----
From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
Sent: Thursday, March 07, 2013 4:17 PM
To: Wood Scott-B07421
Cc: Bhushan Bharat-R65777; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Bhushan Bharat-R65777
Subject: Re: [PATCH 6/7] Rename EMULATE_DO_PAPR to EMULATE_EXIT_USER

On 28.02.2013, at 17:53, Scott Wood wrote:

On 02/28/2013 10:51:10 AM, Alexander Graf wrote:

On 28.02.2013, at 17:31, Scott Wood wrote:

On 02/27/2013 10:13:15 PM, Bharat Bhushan wrote:

Instruction emulation returns EMULATE_DO_PAPR when it requires an exit to userspace on book3s. A similar return value is required for booke. The name EMULATE_DO_PAPR is confusing, so it is renamed to EMULATE_EXIT_USER.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/include/asm/kvm_ppc.h | 2 +-
 arch/powerpc/kvm/book3s_emulate.c  | 2 +-
 arch/powerpc/kvm/book3s_pr.c       | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 44a657a..8b81468 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -44,7 +44,7 @@ enum emulation_result {
     EMULATE_DO_DCR,       /* kvm_run filled with DCR request */
     EMULATE_FAIL,         /* can't emulate this instruction */
     EMULATE_AGAIN,        /* something went wrong. go again */
-    EMULATE_DO_PAPR,      /* kvm_run filled with PAPR request */
+    EMULATE_EXIT_USER,    /* emulation requires exit to user-space */
 };

 extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index 836c569..cdd19d6 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -194,7 +194,7 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
                 run->papr_hcall.args[i] = gpr;
             }

-            emulated = EMULATE_DO_PAPR;
+            emulated = EMULATE_EXIT_USER;
             break;
         }
 #endif
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 73ed11c..8df2d2d 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -760,7 +760,7 @@ program_interrupt:
             run->exit_reason = KVM_EXIT_MMIO;
             r = RESUME_HOST_NV;
             break;
-        case EMULATE_DO_PAPR:
+        case EMULATE_EXIT_USER:
             run->exit_reason = KVM_EXIT_PAPR_HCALL;
             vcpu->arch.hcall_needed = 1;
             r = RESUME_HOST_NV;

I don't think it makes sense to genericize this.

It makes sense if the run->exit_reason = ... and hcall_needed = ... lines get pulled into the emulator.

That would be fine.

Bharat, did I miss a new patch version with that mess up there fixed?

Do you mean moving run->exit_reason = ... and vcpu->arch.hcall_needed = ... into arch/powerpc/kvm/book3s_emulate.c? If yes, then no, you did not miss it :) as I have not sent it yet. I will send the new patch along with the other patches in the patch set.

-Bharat

Alex
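The restructuring discussed in this thread (let the emulation code fill in the exit reason, so that the generic result handler only sees "exit to userspace") can be sketched with a small standalone model. Everything here is hypothetical: the `toy_` names, the exit-reason constants, and the two emulation helpers are stand-ins for illustration and are not the kernel's definitions.

```c
/* Toy stand-ins for the kernel's enums; values are arbitrary. */
enum toy_emulation_result {
    TOY_EMULATE_DONE,
    TOY_EMULATE_FAIL,
    TOY_EMULATE_EXIT_USER,  /* emulation requires exit to user-space */
};

enum { TOY_EXIT_UNKNOWN = 0, TOY_EXIT_PAPR_HCALL = 1, TOY_EXIT_DEBUG = 2 };
enum { TOY_RESUME_GUEST = 0, TOY_RESUME_HOST = 1 };

struct toy_run {
    int exit_reason;
    int hcall_needed;
};

/* The emulator for a hypercall instruction sets its own exit reason. */
static enum toy_emulation_result toy_emulate_hcall(struct toy_run *run)
{
    run->exit_reason = TOY_EXIT_PAPR_HCALL;
    run->hcall_needed = 1;
    return TOY_EMULATE_EXIT_USER;
}

/* The emulator for a software breakpoint does the same for debug exits. */
static enum toy_emulation_result toy_emulate_sw_breakpoint(struct toy_run *run)
{
    run->exit_reason = TOY_EXIT_DEBUG;
    return TOY_EMULATE_EXIT_USER;
}

/* The generic handler no longer needs to know why userspace is needed. */
static int toy_handle_result(enum toy_emulation_result er)
{
    switch (er) {
    case TOY_EMULATE_EXIT_USER:
        return TOY_RESUME_HOST;
    default:
        return TOY_RESUME_GUEST;
    }
}
```

With this split, adding a new userspace-exit case only touches the instruction-specific emulation code, which appears to be the point of the review comments.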
RE: [PATCH 7/7] KVM: PPC: Add userspace debug stub support
-----Original Message-----
From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
Sent: Thursday, March 14, 2013 5:20 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 7/7] KVM: PPC: Add userspace debug stub support

On 14.03.2013, at 06:18, Bhushan Bharat-R65777 wrote:

-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, March 07, 2013 7:09 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 7/7] KVM: PPC: Add userspace debug stub support

On 28.02.2013, at 05:13, Bharat Bhushan wrote:

This patch adds the debug stub support on booke/bookehv. Now QEMU debug stub can use hw breakpoint, watchpoint and software breakpoint to debug guest.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
 arch/powerpc/include/uapi/asm/kvm.h | 22 +-
 arch/powerpc/kvm/booke.c            | 143 +++-- -
 arch/powerpc/kvm/e500_emulate.c     |  6 ++
 arch/powerpc/kvm/e500mc.c           |  3 +-
 4 files changed, 155 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 15f9a00..d7ce449 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -25,6 +25,7 @@
 /* Select powerpc specific features in <linux/kvm.h> */
 #define __KVM_HAVE_SPAPR_TCE
 #define __KVM_HAVE_PPC_SMT
+#define __KVM_HAVE_GUEST_DEBUG

 struct kvm_regs {
     __u64 pc;
@@ -267,7 +268,24 @@ struct kvm_fpu {
     __u64 fpr[32];
 };

+/*
+ * Defines for h/w breakpoint, watchpoint (read, write or both) and
+ * software breakpoint.
+ * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
+ * for KVM_DEBUG_EXIT.
+ */
+#define KVMPPC_DEBUG_NONE        0x0
+#define KVMPPC_DEBUG_BREAKPOINT  (1UL << 1)
+#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2)
+#define KVMPPC_DEBUG_WATCH_READ  (1UL << 3)
 struct kvm_debug_exit_arch {
+    __u64 address;
+    /*
+     * exiting to userspace because of h/w breakpoint, watchpoint
+     * (read, write or both) and software breakpoint.
+     */
+    __u32 status;
+    __u32 reserved;
 };

 /* for KVM_SET_GUEST_DEBUG */
@@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
      * Type denotes h/w breakpoint, read watchpoint, write
      * watchpoint or watchpoint (both read and write).
      */
-#define KVMPPC_DEBUG_NOTYPE      0x0
-#define KVMPPC_DEBUG_BREAKPOINT  (1UL << 1)
-#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2)
-#define KVMPPC_DEBUG_WATCH_READ  (1UL << 3)
     __u32 type;
     __u32 reserved;
 } bp[16];
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1de93a8..21b0313 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -133,6 +133,30 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
 #endif
 }

+static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
+{
+    /* Synchronize guest's desire to get debug interrupts into shadow MSR */
+#ifndef CONFIG_KVM_BOOKE_HV
+    vcpu->arch.shadow_msr &= ~MSR_DE;
+    vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
+#endif
+
+    /* Force enable debug interrupts when user space wants to debug */
+    if (vcpu->guest_debug) {
+#ifdef CONFIG_KVM_BOOKE_HV
+        /*
+         * Since there is no shadow MSR, sync MSR_DE into the guest
+         * visible MSR. Do not allow guest to change MSR[DE].
+         */
+        vcpu->arch.shared->msr |= MSR_DE;
+        mtspr(SPRN_MSRP, mfspr(SPRN_MSRP) | MSRP_DEP);
+#else
+        vcpu->arch.shadow_msr |= MSR_DE;
+        vcpu->arch.shared->msr &= ~MSR_DE;
+#endif
+    }
+}
+
 /*
  * Helper function for full MSR writes. No need to call this if only
  * EE/CE/ME/DE/RI are changing.
@@ -150,6 +174,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
     kvmppc_mmu_msr_notify(vcpu, old_msr);
     kvmppc_vcpu_sync_spe(vcpu);
     kvmppc_vcpu_sync_fpu(vcpu);
+    kvmppc_vcpu_sync_debug(vcpu);
 }

 static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu,
@@ -736,6 +761,13 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
         run->exit_reason = KVM_EXIT_DCR;
         return RESUME_HOST;

+    case EMULATE_EXIT_USER:
+        run->exit_reason = KVM_EXIT_DEBUG;
+        run->debug.arch.address = vcpu->arch.pc;
+        run->debug.arch.status = 0;
+        kvmppc_account_exit(vcpu, DEBUG_EXITS);

As mentioned previously, this is wrong and needs to go into the instruction emulation code for that opcode.

ok

+        return RESUME_HOST;
+
     case EMULATE_FAIL:
         printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n
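The non-HV branch of kvmppc_vcpu_sync_debug() quoted above is plain bit manipulation, which a standalone model can show. This is a sketch only: `toy_vcpu` and `toy_sync_debug()` are hypothetical, and the `TOY_MSR_DE` bit position is an assumed value chosen for illustration, not taken from the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define TOY_MSR_DE (1u << 9)

struct toy_vcpu {
    uint32_t shadow_msr;  /* MSR value the guest really runs with */
    uint32_t shared_msr;  /* MSR value the guest believes it has */
    bool guest_debug;     /* userspace (e.g. a debug stub) is debugging */
};

/* Model of the non-HV path: mirror the guest's DE bit, then override it. */
static void toy_sync_debug(struct toy_vcpu *v)
{
    /* Propagate the guest's wish for debug interrupts into the shadow MSR. */
    v->shadow_msr &= ~TOY_MSR_DE;
    v->shadow_msr |= v->shared_msr & TOY_MSR_DE;

    /* When userspace debugs, force DE on and hide it from the guest. */
    if (v->guest_debug) {
        v->shadow_msr |= TOY_MSR_DE;
        v->shared_msr &= ~TOY_MSR_DE;
    }
}
```

In this model the guest-visible copy never shows DE set while userspace is debugging, while the shadow copy always has it set.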
RE: [PATCH 4/7] booke: Save and restore debug registers on guest entry and exit
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, March 07, 2013 6:56 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 4/7] booke: Save and restore debug registers on guest entry and exit

On 28.02.2013, at 05:13, Bharat Bhushan wrote:

On guest entry: if the guest wants to use the debug registers, then save the h/w debug registers in host_dbg_reg and load the debug registers with shadow_dbg_reg. Otherwise leave the h/w debug registers as is.

Why can't we switch the majority of registers on vcpu_put/get and only enable or disable debugging on guest entry/exit?

One of the reasons for not doing this is that KVM is a host kernel module, and this lets it be debugged by the host (I do not know how useful this is :))

So I am not able to recall the specific reason; maybe we just coded it this way and tried to keep the overhead as low as possible by switching registers only when they are used.

My point is that the overhead is _higher_ this way, because we need to do checks and switches on every guest entry/exit, which happens a _lot_ more often than a host context switch.

As we discussed before, we can keep this option open for the future.

What future? Just ignore debug events in the entry/exit code path and suddenly a lot of the code becomes a lot easier.

Just to summarize what we agreed upon:

 - Save/restore will happen on vcpu_get()/vcpu_put(). This will happen only if the guest is using debug registers, probably using a flag to indicate that the guest is using the debug APU.
 - On debug register access from QEMU, always set the value in the h/w debug register.
 - On guest access of a debug register, also save the xxx h/w register in vcpu->host_debug_reg.xxx and load the guest-provided value into the h/w debug register. Ensure this happens on first access only, probably for all debug registers once debug events are enabled in dbcr0.

Direct access from the guest was not part of this patchset, and support for this will be done separately.

Thanks
-Bharat

Alex
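The agreed scheme (swap debug registers only at vcpu load/put time, and only when a flag says the guest uses them) can be sketched as a standalone model. All names here are hypothetical: `toy_hw` stands in for the hardware debug registers, `toy_dbg_regs` holds just two example registers, and `toy_vcpu_load()`/`toy_vcpu_put()` are illustrations, not the kernel's functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two example registers only; the real register set is larger. */
struct toy_dbg_regs {
    uint32_t dbcr0;
    uint64_t iac1;
};

/* Stand-in for the hardware debug registers of one CPU. */
static struct toy_dbg_regs toy_hw;

struct toy_vcpu {
    struct toy_dbg_regs shadow;  /* values the guest should run with */
    struct toy_dbg_regs host;    /* host values saved while guest is loaded */
    bool debug_save_restore;     /* guest is using debug resources */
};

/* On load: save host state and install guest state, if needed at all. */
static void toy_vcpu_load(struct toy_vcpu *v)
{
    if (!v->debug_save_restore)
        return;
    v->host = toy_hw;
    toy_hw = v->shadow;
}

/* On put: capture guest state and restore the host state. */
static void toy_vcpu_put(struct toy_vcpu *v)
{
    if (!v->debug_save_restore)
        return;
    v->shadow = toy_hw;
    toy_hw = v->host;
}
```

Because the swap happens per load/put rather than per guest entry/exit, a vcpu that never touches debug resources pays only the cost of one flag check.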
RE: [PATCH 7/7] KVM: PPC: Add userspace debug stub support
-----Original Message-----
From: Wood Scott-B07421
Sent: Thursday, March 14, 2013 9:36 PM
To: Bhushan Bharat-R65777
Cc: Alexander Graf; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 7/7] KVM: PPC: Add userspace debug stub support

On 03/14/2013 08:57:53 AM, Bhushan Bharat-R65777 wrote:

diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 1f89d26..f5fc6f5 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -182,8 +182,7 @@ int kvmppc_core_vcpu_setup(struct kvm_vcpu *vcpu)
 {
     struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);

-    vcpu->arch.shadow_epcr = SPRN_EPCR_DSIGS | SPRN_EPCR_DGTMI | \
-                             SPRN_EPCR_DUVD;
+    vcpu->arch.shadow_epcr = SPRN_EPCR_DSIGS | SPRN_EPCR_DGTMI;

Doesn't this route all debug events through the host?

No; this controls whether debug events can occur in hypervisor state or not.

EPCR.DUVD = 0: debug events can occur in the hypervisor state.
EPCR.DUVD = 1: debug events cannot occur in the hypervisor state.

So we allow debug events to occur in hypervisor state.

Why do we care about debug events in our entry/exit code and didn't care about them before?

We care so that single stepping in the guest does not step into KVM code.

If anything, this is a completely separate patch, orthogonal to this patch series, and requires a good bit of explanation.

Not sure why you think this should be a separate patch; this patch adds support for single stepping and also takes care that debug events do not come into the host when doing single stepping.

How does *removing* DUVD ensure that?

By default we clear DUVD, so debug events can come in hypervisor state. But on lightweight exit, when restoring the guest debug context, we set DUVD so that the debug interrupt will not come in hypervisor state, as the debug resources are taken by the guest. On guest exit, when restoring the host context, we clear DUVD, so the debug resources now hold the host context.

With the proposed change of save and restore on vcpu_get/vcpu_put, this switching will be done in vcpu_get/set().

Thanks
-Bharat

-Scott
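The DUVD handling described in this reply amounts to setting one bit while the guest's debug context is installed and clearing it when the host's context is restored. A minimal sketch follows; the `TOY_EPCR_DUVD` bit position and both helper names are assumptions made for illustration and are not the kernel's SPRN_EPCR_DUVD definition or actual code paths.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define TOY_EPCR_DUVD (1u << 0)

/*
 * EPCR value to use when the guest debug context is being restored:
 * DUVD is set only if the guest owns the debug resources, so that debug
 * events are suppressed while running in hypervisor state.
 */
static uint32_t toy_epcr_on_guest_restore(uint32_t epcr, bool guest_owns_debug)
{
    if (guest_owns_debug)
        return epcr | TOY_EPCR_DUVD;
    return epcr & ~TOY_EPCR_DUVD;
}

/*
 * EPCR value to use when the host debug context is restored:
 * DUVD is cleared so the host can again take debug events.
 */
static uint32_t toy_epcr_on_host_restore(uint32_t epcr)
{
    return epcr & ~TOY_EPCR_DUVD;
}
```

Other EPCR bits pass through unchanged in both helpers, which matches the idea that only the debug-suppression bit follows ownership of the debug resources.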
RE: [PATCH 4/7] booke: Save and restore debug registers on guest entry and exit
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, March 07, 2013 6:56 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 4/7] booke: Save and restore debug registers on guest entry and exit

On 28.02.2013, at 05:13, Bharat Bhushan wrote:

On Guest entry: if the guest wants to use the debug registers then save the h/w debug registers in host_dbg_reg and load the debug registers with shadow_dbg_reg. Otherwise leave the h/w debug registers as is.

Why can't we switch the majority of registers on vcpu_put/get and only enable or disable debugging on guest entry/exit?

One of the reasons for not doing this is that KVM is a host kernel module and this lets it be debugged by the host (I do not know how useful this is :)). I am not able to recall the specific reason; maybe we just coded it like this and tried to keep overhead as low as possible by switching registers only when they are used.

My point is that the overhead is _higher_ this way, because we need to do checks and switches on every guest entry/exit, which happens a _lot_ more often than a host context switch.

As we discussed before, we can keep this option open for future.

What future? Just ignore debug events in the entry/exit code path and suddenly a lot of the code becomes a lot easier.

Just to summarize what we agreed upon:

- Save/restore will happen on vcpu_get()/vcpu_put(), and only if the guest is using debug registers, probably tracked with a flag indicating that the guest is using the debug APU.
- On debug register access from QEMU, always set the value in the h/w debug register.
- On guest access of a debug register, also save the xxx h/w register in vcpu->host_debug_reg.xxx and load the guest-provided value into the h/w debug register. Ensure this happens on first access only, probably for all debug registers once debug events are enabled in dbcr0.

Direct access from the guest was not part of this patchset; support for it will be done separately.

Thanks
-Bharat

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
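The first point of the summary above (switch debug registers on vcpu_get()/vcpu_put(), and only when the guest uses them) can be sketched roughly as follows. This is a userspace illustration only, not the patch's code: `hw_dbg` is a simulated stand-in for the hardware debug registers, and the names `dbg_regs`, `vcpu_sketch`, `guest_uses_debug`, `vcpu_load_debug()` and `vcpu_put_debug()` are hypothetical. Only `host_dbg_reg` and `shadow_dbg_reg` are taken from the patch description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the debug registers; a real vcpu has more. */
struct dbg_regs {
	uint32_t dbcr0;
	uint64_t iac1;
	uint64_t dac1;
};

/* Simulated hardware debug registers (stand-in for mtspr/mfspr). */
static struct dbg_regs hw_dbg;

struct vcpu_sketch {
	struct dbg_regs shadow_dbg_reg; /* guest's view of the registers */
	struct dbg_regs host_dbg_reg;   /* host values saved while guest runs */
	bool guest_uses_debug;          /* assumed flag: guest uses debug APU */
};

/* Called when the vcpu is scheduled in: swap only if the guest needs it. */
static void vcpu_load_debug(struct vcpu_sketch *v)
{
	if (!v->guest_uses_debug)
		return;                 /* leave h/w registers as they are */
	v->host_dbg_reg = hw_dbg;       /* save host context */
	hw_dbg = v->shadow_dbg_reg;     /* install guest context */
}

/* Called when the vcpu is scheduled out: undo the swap. */
static void vcpu_put_debug(struct vcpu_sketch *v)
{
	if (!v->guest_uses_debug)
		return;
	v->shadow_dbg_reg = hw_dbg;     /* keep whatever the guest left there */
	hw_dbg = v->host_dbg_reg;       /* restore host context */
}
```

The point of the design is that the swap runs once per host context switch of the vcpu thread rather than on every guest entry/exit, which is the overhead argument made in the thread.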
RE: [PATCH 7/7] KVM: PPC: Add userspace debug stub support
-----Original Message-----
From: Wood Scott-B07421
Sent: Thursday, March 14, 2013 9:36 PM
To: Bhushan Bharat-R65777
Cc: Alexander Graf; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 7/7] KVM: PPC: Add userspace debug stub support

On 03/14/2013 08:57:53 AM, Bhushan Bharat-R65777 wrote:

diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 1f89d26..f5fc6f5 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -182,8 +182,7 @@ int kvmppc_core_vcpu_setup(struct kvm_vcpu *vcpu)
 {
 	struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);

-	vcpu->arch.shadow_epcr = SPRN_EPCR_DSIGS | SPRN_EPCR_DGTMI | \
-				 SPRN_EPCR_DUVD;
+	vcpu->arch.shadow_epcr = SPRN_EPCR_DSIGS | SPRN_EPCR_DGTMI;

Doesn't this route all debug events through the host?

No; this controls whether debug events can occur in hypervisor state or not.
EPCR.DUVD = 0: Debug events can occur in the hypervisor state.
EPCR.DUVD = 1: Debug events cannot occur in the hypervisor state.
So we allow debug events to occur in hypervisor state.

Why do we care about debug events in our entry/exit code and didn't care about them before?

We care so that single stepping in the guest does not step into KVM code.

If anything, this is a completely separate patch, orthogonal to this patch series, and requires a good bit of explanation.

Not sure why you think this should be a separate patch; this patch adds support for single stepping and also takes care that debug events do not come in the host when single stepping.

How does *removing* DUVD ensure that?

By default we clear DUVD, so debug events can come in hypervisor state. But on lightweight exit, when restoring the guest debug context, we set DUVD so the debug interrupt will not come in hypervisor state, as the debug resources are taken by the guest. On guest exit, when restoring the host context, we clear DUVD so the debug resources now hold the host context.

With the proposed change of save and restore on vcpu_get/vcpu_put, this switching will be done in vcpu_get()/vcpu_put().

Thanks
-Bharat

-Scott
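The DUVD handling described in the reply above (clear by default, set while the guest owns the debug resources, clear again when the host context is restored) amounts to toggling one bit in the shadow EPCR. A minimal sketch, assuming placeholder bit positions: the `EPCR_*` values below are invented for illustration and are NOT the real EPCR layout, and `vcpu_sketch` plus the helper names are hypothetical.

```c
#include <stdint.h>

/* Placeholder bit positions for illustration only; not the real EPCR bits. */
#define EPCR_DSIGS (1u << 0)
#define EPCR_DGTMI (1u << 1)
#define EPCR_DUVD  (1u << 2)

struct vcpu_sketch {
	uint32_t shadow_epcr;
};

/* Default after vcpu setup: DUVD clear, debug events allowed in hypervisor state. */
static void vcpu_setup(struct vcpu_sketch *v)
{
	v->shadow_epcr = EPCR_DSIGS | EPCR_DGTMI;
}

/* Guest debug context installed: suppress debug events in hypervisor state. */
static void restore_guest_debug(struct vcpu_sketch *v)
{
	v->shadow_epcr |= EPCR_DUVD;
}

/* Host debug context restored: allow debug events in hypervisor state again. */
static void restore_host_debug(struct vcpu_sketch *v)
{
	v->shadow_epcr &= ~EPCR_DUVD;
}

/* Returns nonzero when debug events may occur in hypervisor state. */
static int debug_events_allowed_in_hv(const struct vcpu_sketch *v)
{
	return !(v->shadow_epcr & EPCR_DUVD);
}
```

With the vcpu_get/vcpu_put scheme discussed in the thread, the two restore helpers would presumably be called from those paths instead of from the lightweight entry/exit code.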
RE: [PATCH 2/7] Added ONE_REG interface for debug instruction
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, March 07, 2013 6:38 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/7] Added ONE_REG interface for debug instruction

On 28.02.2013, at 05:13, Bharat Bhushan wrote:

This patch adds the one_reg interface to get the special instruction to be used for setting software breakpoint from userspace.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 Documentation/virtual/kvm/api.txt     |    1 +
 arch/powerpc/include/asm/kvm_book3s.h |    1 +
 arch/powerpc/include/asm/kvm_booke.h  |    2 ++
 arch/powerpc/include/uapi/asm/kvm.h   |    4 ++++
 arch/powerpc/kvm/book3s.c             |    6 ++++++
 arch/powerpc/kvm/booke.c              |    6 ++++++
 6 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index cce500a..dbfcc04 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1766,6 +1766,7 @@ registers, find a list below:
   PPC   | KVM_REG_PPC_TSR         | 32
   PPC   | KVM_REG_PPC_OR_TSR      | 32
   PPC   | KVM_REG_PPC_CLEAR_TSR   | 32
+  PPC   | KVM_REG_PPC_DEBUG_INST  | 32

 4.69 KVM_GET_ONE_REG

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 5a56e1c..36164cc 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -458,6 +458,7 @@ static inline bool kvmppc_critical_section(struct kvm_vcpu *vcpu)
 #define OSI_SC_MAGIC_R4			0x77810F9B

 #define INS_DCBZ			0x7c0007ec
+#define INS_TW				0x7c000008

This one should be trap, so TO needs to be 31. The instruction as it's here is a nop if I read the spec correctly.

Yes I missed this. BTW rather than setting TO = 31, what if we set TO = 2 as RA and RB is same here.

-Bharat

Alex

 /* LPIDs we support with this build -- runtime limit may be lower */
 #define KVMPPC_NR_LPIDS			(LPID_RSVD + 1)
diff --git a/arch/powerpc/include/asm/kvm_booke.h b/arch/powerpc/include/asm/kvm_booke.h
index b7cd335..d3c1eb3 100644
--- a/arch/powerpc/include/asm/kvm_booke.h
+++ b/arch/powerpc/include/asm/kvm_booke.h
@@ -26,6 +26,8 @@
 /* LPIDs we support with this build -- runtime limit may be lower */
 #define KVMPPC_NR_LPIDS			64

+#define KVMPPC_INST_EHPRIV	0x7c00021c
+
 static inline void kvmppc_set_gpr(struct kvm_vcpu *vcpu, int num, ulong val)
 {
 	vcpu->arch.gpr[num] = val;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index ef072b1..c2ff99c 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -422,4 +422,8 @@ struct kvm_get_htab_header {
 #define KVM_REG_PPC_CLEAR_TSR	(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x88)
 #define KVM_REG_PPC_TCR		(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x89)
 #define KVM_REG_PPC_TSR		(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x8a)
+
+/* Debugging: Special instruction for software breakpoint */
+#define KVM_REG_PPC_DEBUG_INST (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x8b)
+
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index a4b6452..975a401 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -530,6 +530,12 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
 		val = get_reg_val(reg->id, vcpu->arch.vscr.u[3]);
 		break;
 #endif /* CONFIG_ALTIVEC */
+	case KVM_REG_PPC_DEBUG_INST: {
+		u32 opcode = INS_TW;
+		r = copy_to_user((u32 __user *)(long)reg->addr,
+				 &opcode, sizeof(u32));
+		break;
+	}
 	default:
 		r = -EINVAL;
 		break;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 8b553c0..a41cd6d 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1448,6 +1448,12 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
 	case KVM_REG_PPC_TSR:
 		r = put_user(vcpu->arch.tsr, (u32 __user *)(long)reg->addr);
 		break;
+	case KVM_REG_PPC_DEBUG_INST: {
+		u32 opcode = KVMPPC_INST_EHPRIV;
+		r = copy_to_user((u32 __user *)(long)reg->addr,
+				 &opcode, sizeof(u32));
+		break;
+	}
 	default:
 		break;
 	}
--
1.7.0.4
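The TO discussion in the review above can be checked with a small encoder. This is a sketch, assuming the usual X-form layout for `tw` (primary opcode 31, TO, RA, RB, extended opcode 4) and the usual TO condition bits; `encode_tw()`, `tw_traps()` and the `TO_*` names are hypothetical helpers, not kernel code. One reading of "TO = 2" is the numeric field value 2 (logically less than), which would never trap when RA and RB are the same register; another reading is bit 2 in IBM numbering (the "equal" condition, mask 0x04), which would always trap in that case. The reply does not say which is meant.

```c
#include <stdint.h>

/* TO condition masks, most significant bit first (assumed per the ISA). */
#define TO_LT  0x10 /* signed less than */
#define TO_GT  0x08 /* signed greater than */
#define TO_EQ  0x04 /* equal */
#define TO_LTU 0x02 /* unsigned (logical) less than */
#define TO_GTU 0x01 /* unsigned (logical) greater than */

/* Encode "tw TO,RA,RB": opcode 31, 5-bit fields, extended opcode 4. */
static uint32_t encode_tw(unsigned to, unsigned ra, unsigned rb)
{
	return (31u << 26) | ((to & 0x1f) << 21) | ((ra & 0x1f) << 16) |
	       ((rb & 0x1f) << 11) | (4u << 1);
}

/* Evaluate whether tw would trap for register values a (RA) and b (RB). */
static int tw_traps(unsigned to, uint32_t a, uint32_t b)
{
	int32_t sa = (int32_t)a, sb = (int32_t)b;

	return ((to & TO_LT) && sa < sb) || ((to & TO_GT) && sa > sb) ||
	       ((to & TO_EQ) && a == b) || ((to & TO_LTU) && a < b) ||
	       ((to & TO_GTU) && a > b);
}
```

With these helpers, `encode_tw(0, 0, 0)` gives the never-trapping encoding the reviewer objected to, and `encode_tw(31, 0, 0)` gives the unconditional `trap` form.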
RE: [PATCH 3/7] KVM: PPC: debug stub interface parameter defined
-----Original Message-----
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, March 07, 2013 6:51 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 3/7] KVM: PPC: debug stub interface parameter defined

On 28.02.2013, at 05:13, Bharat Bhushan wrote:

This patch defines the interface parameter for KVM_SET_GUEST_DEBUG ioctl support. Follow up patches will use this for setting up hardware breakpoints, watchpoints and software breakpoints.

Also kvm_arch_vcpu_ioctl_set_guest_debug() is brought one level below. This is because I am not sure what is required for book3s. So this ioctl behaviour will not change for book3s.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/include/uapi/asm/kvm.h |   23 +++++++++++++++++++++++
 arch/powerpc/kvm/book3s.c           |    6 ++++++
 arch/powerpc/kvm/booke.c            |    6 ++++++
 arch/powerpc/kvm/powerpc.c          |    6 ------
 4 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index c2ff99c..15f9a00 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -272,8 +272,31 @@ struct kvm_debug_exit_arch {

 /* for KVM_SET_GUEST_DEBUG */
 struct kvm_guest_debug_arch {
+	struct {
+		/* H/W breakpoint/watchpoint address */
+		__u64 addr;
+		/*
+		 * Type denotes h/w breakpoint, read watchpoint, write
+		 * watchpoint or watchpoint (both read and write).
+		 */
+#define KVMPPC_DEBUG_NOTYPE		0x0
+#define KVMPPC_DEBUG_BREAKPOINT		(1UL << 1)
+#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
+#define KVMPPC_DEBUG_WATCH_READ		(1UL << 3)
+		__u32 type;
+		__u32 reserved;
+	} bp[16];
 };

+/* Debug related defines */
+/*
+ * kvm_guest_debug->control is a 32 bit field. The lower 16 bits are generic
+ * and upper 16 bits are architecture specific. Architecture specific defines
+ * that ioctl is for setting hardware breakpoint or software breakpoint.
+ */
+#define KVM_GUESTDBG_USE_SW_BP		0x00010000
+#define KVM_GUESTDBG_USE_HW_BP		0x00020000

You only need

#define KVM_GUESTDBG_HW_BP 0x00010000

In absence of the flag, it's a SW breakpoint.

We kept this for 2 reasons: 1) the same logic is applied for i386, so we are trying to keep it consistent; 2) better clarity. If you want, then I can code this as you described.

-Bharat

Alex

+
 /* definition of registers in kvm_run */
 struct kvm_sync_regs {
 };
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 975a401..cb85d73 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -613,6 +613,12 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
 	return 0;
 }

+int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
+					struct kvm_guest_debug *dbg)
+{
+	return -EINVAL;
+}
+
 void kvmppc_decrementer_func(unsigned long data)
 {
 	struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index a41cd6d..1de93a8 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1527,6 +1527,12 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
 	return r;
 }

+int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
+					 struct kvm_guest_debug *dbg)
+{
+	return -EINVAL;
+}
+
 int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
 {
 	return -ENOTSUPP;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 934413c..4c94ca9 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -532,12 +532,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 #endif
 }

-int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
-					struct kvm_guest_debug *dbg)
-{
-	return -EINVAL;
-}
-
 static void kvmppc_complete_dcr_load(struct kvm_vcpu *vcpu,
 				     struct kvm_run *run)
 {
--
1.7.0.4
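For readers following the interface proposed in this patch, here is a rough userspace-side sketch of how the `bp[16]` array and the control word might be filled in. It uses local mirror types rather than the kernel headers so that it stands alone: `guest_debug_sketch`, `guest_debug_arch_sketch` and `add_hw_point()` are hypothetical names, the architecture-specific flag values are assumed to sit in the upper 16 bits as the patch comment describes, and no ioctl is issued.

```c
#include <stdint.h>

/* Generic flag (lower 16 bits) and arch-specific flags (upper 16 bits).
 * Values are local copies assumed for this sketch. */
#define KVM_GUESTDBG_ENABLE    0x00000001
#define KVM_GUESTDBG_USE_SW_BP 0x00010000
#define KVM_GUESTDBG_USE_HW_BP 0x00020000

#define KVMPPC_DEBUG_NOTYPE      0x0
#define KVMPPC_DEBUG_BREAKPOINT  (1UL << 1)
#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2)
#define KVMPPC_DEBUG_WATCH_READ  (1UL << 3)

#define NR_BP 16

/* Mirror of the proposed arch struct: one slot per h/w break/watchpoint. */
struct guest_debug_arch_sketch {
	struct {
		uint64_t addr;     /* breakpoint or watchpoint address */
		uint32_t type;     /* KVMPPC_DEBUG_* bits */
		uint32_t reserved;
	} bp[NR_BP];
};

struct guest_debug_sketch {
	uint32_t control;
	uint32_t pad;
	struct guest_debug_arch_sketch arch;
};

/* Put a h/w break/watchpoint in the first free slot; returns slot or -1. */
static int add_hw_point(struct guest_debug_sketch *dbg, uint64_t addr,
			uint32_t type)
{
	for (int i = 0; i < NR_BP; i++) {
		if (dbg->arch.bp[i].type == KVMPPC_DEBUG_NOTYPE) {
			dbg->arch.bp[i].addr = addr;
			dbg->arch.bp[i].type = type;
			dbg->control |= KVM_GUESTDBG_ENABLE |
					KVM_GUESTDBG_USE_HW_BP;
			return i;
		}
	}
	return -1; /* all slots in use */
}
```

A read/write watchpoint would combine both watch bits in `type`. Under the reviewer's alternative (a single HW flag, with its absence meaning a software breakpoint), only the flag handling in `control` would change.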