Re: KVM internal error. Suberror: 1, emulation failure

2013-07-18 Thread Paolo Bonzini
Il 17/07/2013 18:16, Dave Hansen ha scritto:
 I'm causing qemu to spew these emulation failure messages until I kill
 it.  The guest kernel being run has been hacked up pretty heavily and is
 probably either accessing bad physical addresses (above the address
 ranges in the e820 table) or trying to DMA to bad addresses.
 
 What I'd really like qemu to be doing is trapping back in to the guest
 kernel to have it handle this issue.  Then I'd have a better chance of
 dumping out some debugging information to see where I went wrong.

This is happening because the kernel is executing a PCMPEQB instruction
on an invalid memory address.  This instruction is not yet emulated by
KVM.  If you want QEMU to trap back to the guest kernel, you can add
emulation of the instruction to arch/x86/kvm/emulate.c.

If you do not really care about the guest doing something sane, you can
use a stub emulation function that just does "return emulate_ud(ctxt);".
That alone could be a good starting point for attaching a kernel debugger
to the guest.

Paolo

 host kernel: 3.10
 guest kernel: Linus commit d2b4a64 + patches
 qemu: v1.4.0-2835-g6453a3a
 
  KVM internal error. Suberror: 1
  emulation failure
  RAX= RBX=013c0410 RCX=0010 
  RDX=0010
  RSI=000a RDI=7f6d256a73c0 RBP= 
  RSP=7fffe2720ce8
  R8 = R9 = R10=0022 
  R11=0246
  R12=7fffe2720d58 R13=0400 R14=7f6d256a7000 
  R15=
  RIP=7f6d24c5a50e RFL=00010202 [---] CPL=3 II=0 A20=1 SMM=0 HLT=0
  ES =   00c0
  CS =0033   00a0fb00 DPL=3 CS64 [-RA]
  SS =002b   00c0f300 DPL=3 DS   [-WA]
  DS =   00c0
  FS = 7f6d2569d740  00c0
  GS =   00c0
  LDT=   00c0
  TR =0040 88007b190480 2087 8b00 DPL=0 TSS64-busy
  GDT= 88007b184000 007f
  IDT= ff57a000 0fff
  CR0=80050033 CR2=7f6d256a7000 CR3=6f13b000 CR4=06e0
  DR0= DR1= DR2= 
  DR3= 
  DR6=0ff0 DR7=0400
  EFER=0d01
  Code=d7 c3 85 c0 0f 85 bc 00 00 00 48 83 ea 10 0f 8e d2 00 00 00 66 0f 
  74 4f 30 66 0f d7 c1 85 c0 0f 85 b1 00 00 00 48 31 c0 c3 66 66 66 66 2e 0f 
  1f 84 00

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: MMU: avoid fast page fault fixing mmio page fault

2013-07-18 Thread Xiao Guangrong
On 07/18/2013 01:31 PM, Gleb Natapov wrote:
 On Thu, Jul 18, 2013 at 12:52:37PM +0800, Xiao Guangrong wrote:
 Currently, fast page fault tries to fix mmio page fault when the
 generation number is invalid (spte.gen != kvm.gen) and returns to
 guest to retry the fault since it sees the last spte is nonpresent
 which causes infinity loop

 It can be triggered only on AMD host since the mmio page fault is
 recognized as ept-misconfig

 We still call into the regular page fault handler from the ept-misconfig
 handler, but the fake zero error_code we provide makes page_fault_can_be_fast()
 return false.

Yes.

 
 Shouldn't shadow paging trigger this too? I haven't encountered this on
 Intel without ept.

Since currently fast page fault only works for direct mmu. :)



Re: [PATCH] KVM: MMU: avoid fast page fault fixing mmio page fault

2013-07-18 Thread Gleb Natapov
On Thu, Jul 18, 2013 at 02:01:47PM +0800, Xiao Guangrong wrote:
 On 07/18/2013 01:31 PM, Gleb Natapov wrote:
  On Thu, Jul 18, 2013 at 12:52:37PM +0800, Xiao Guangrong wrote:
  Currently, fast page fault tries to fix mmio page fault when the
  generation number is invalid (spte.gen != kvm.gen) and returns to
  guest to retry the fault since it sees the last spte is nonpresent
  which causes infinity loop
 
  It can be triggered only on AMD host since the mmio page fault is
  recognized as ept-misconfig
 
  We still call into regular page fault handler from ept-misconfig
  handler, but fake zero error_code we provide makes page_fault_can_be_fast()
  return false.
 
 Yes.
 
  
  Shouldn't shadow paging trigger this too? I haven't encountered this on
  Intel without ept.
 
 Since currently fast page fault only works for direct mmu. :)
Ah, yes. So with shadow paging and paging disabled in the guest it can
happen eventually, but we do not trigger it for some reason?

--
Gleb.


[PATCH 1/2] kvm: powerpc: Do not ignore E attribute in mas2

2013-07-18 Thread Bharat Bhushan
Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index c2e5e98..277cb18 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct 
kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
- (MAS2_X0 | MAS2_X1)
+ (MAS2_X0 | MAS2_X1 | MAS2_E)
 #define MAS3_ATTRIB_MASK \
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
-- 
1.7.0.4




[PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bharat Bhushan
If there is a struct page for the requested mapping, then it is normal
DDR and the mapping sets the M bit (coherent, cacheable); otherwise it is
treated as I/O and we set I + G (cache inhibited, guarded).

This helps set a proper TLB mapping for a direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
usermode)
return mas3;
 }
 
-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {
+   mas2_attr |= MAS2_I | MAS2_G;
+   } else {
 #ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   mas2_attr |= MAS2_M;
 #endif
+   }
+   return mas2_attr;
 }
 
 /*
@@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+ e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
-- 
1.7.0.4




Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's
normal DDR and the mapping sets M bit (coherent, cacheable)
else this is treated as I/O and we set  I + G  (cache inhibited, guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
  arch/powerpc/kvm/e500_mmu_host.c |   17 -
  1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
usermode)
return mas3;
  }

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
  {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?

Tiejun


+   mas2_attr |= MAS2_I | MAS2_G;
+   } else {
  #ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   mas2_attr |= MAS2_M;
  #endif
+   }
+   return mas2_attr;
  }

  /*
@@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+ e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);






Re: [PATCH] KVM: MMU: avoid fast page fault fixing mmio page fault

2013-07-18 Thread Xiao Guangrong
On 07/18/2013 02:06 PM, Gleb Natapov wrote:
 On Thu, Jul 18, 2013 at 02:01:47PM +0800, Xiao Guangrong wrote:
 On 07/18/2013 01:31 PM, Gleb Natapov wrote:
 On Thu, Jul 18, 2013 at 12:52:37PM +0800, Xiao Guangrong wrote:
 Currently, fast page fault tries to fix mmio page fault when the
 generation number is invalid (spte.gen != kvm.gen) and returns to
 guest to retry the fault since it sees the last spte is nonpresent
 which causes infinity loop

 It can be triggered only on AMD host since the mmio page fault is
 recognized as ept-misconfig

 We still call into regular page fault handler from ept-misconfig
 handler, but fake zero error_code we provide makes page_fault_can_be_fast()
 return false.

 Yes.


 Shouldn't shadow paging trigger this too? I haven't encountered this on
 Intel without ept.

 Since currently fast page fault only works for direct mmu. :)
 Ah, yes. So with shadow page and paging disabled in a guest is can
 happen eventually, but we do not trigger it for some reason?

Yes, I guess so; the paging-disabled window is short-lived and the sptes
become invalid after the memslot has changed 150 times, so it is hard to trigger.

I should add this to the changelog; thanks for the reminder, Gleb.

==
[PATCH] KVM: MMU: avoid fast page fault fixing mmio page fault

Currently, fast page fault tries to fix an mmio page fault when the
generation number is invalid (spte.gen != kvm.gen) and returns to the
guest to retry the fault, since it sees that the last spte is nonpresent.
This causes an infinite loop.

Since fast page fault only works for direct mmu, the issue exists when
1) tdp is enabled. It is triggered only on AMD hosts, since on Intel hosts
   the mmio page fault is recognized as an ept-misconfig, whose handler
   calls the page-fault path with error_code = 0

2) guest paging is disabled. In this case the issue is rarely seen, since
   the paging-disabled window is short-lived and the sptes become invalid
   after the memslot has changed 150 times

Fix it by filtering the mmio page fault out in page_fault_can_be_fast

Reported-by: Markus Trippelsdorf mar...@trippelsdorf.de
Tested-by: Markus Trippelsdorf mar...@trippelsdorf.de
Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 arch/x86/kvm/mmu.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index bf7af1e..3a9493a 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2811,6 +2811,13 @@ exit:
 static bool page_fault_can_be_fast(struct kvm_vcpu *vcpu, u32 error_code)
 {
/*
+* Do not fix the mmio spte with invalid generation number which
+* need to be updated by slow page fault path.
+*/
+   if (unlikely(error_code & PFERR_RSVD_MASK))
+   return false;
+
+   /*
 * #PF can be fast only if the shadow page table is present and it
 * is caused by write-protect, that means we just need change the
 * W bit of the spte which can be done out of mmu-lock.
-- 
1.8.1.4






Re: [PATCH] KVM: MMU: avoid fast page fault fixing mmio page fault

2013-07-18 Thread Gleb Natapov
On Thu, Jul 18, 2013 at 02:25:19PM +0800, Xiao Guangrong wrote:
 On 07/18/2013 02:06 PM, Gleb Natapov wrote:
  On Thu, Jul 18, 2013 at 02:01:47PM +0800, Xiao Guangrong wrote:
  On 07/18/2013 01:31 PM, Gleb Natapov wrote:
  On Thu, Jul 18, 2013 at 12:52:37PM +0800, Xiao Guangrong wrote:
  Currently, fast page fault tries to fix mmio page fault when the
  generation number is invalid (spte.gen != kvm.gen) and returns to
  guest to retry the fault since it sees the last spte is nonpresent
  which causes infinity loop
 
  It can be triggered only on AMD host since the mmio page fault is
  recognized as ept-misconfig
 
  We still call into regular page fault handler from ept-misconfig
  handler, but fake zero error_code we provide makes 
  page_fault_can_be_fast()
  return false.
 
  Yes.
 
 
  Shouldn't shadow paging trigger this too? I haven't encountered this on
  Intel without ept.
 
  Since currently fast page fault only works for direct mmu. :)
  Ah, yes. So with shadow page and paging disabled in a guest is can
  happen eventually, but we do not trigger it for some reason?
 
 Yes. I guess so, paging disable is short-lived and the sptes will be
 invalid after memslot changed for 150 times, so it is hard to be triggered.
 
 I should update this to the changelog, thanks for your reminder, Gleb.
 
 ==
 [PATCH] KVM: MMU: avoid fast page fault fixing mmio page fault
 
 Currently, fast page fault tries to fix mmio page fault when the
 generation number is invalid (spte.gen != kvm.gen) and returns to
 guest to retry the fault since it sees the last spte is nonpresent.
 It causes infinity loop
 
 Since fast page fault only works for direct mmu, the issue exists when
 1) tdp is enabled. It is only triggered only on AMD host since on Intel host
the mmio page fault is recognized as ept-misconfig whose handler call
fault-page path with error_code = 0
 
 2) guest paging is disabled. Under this case, the issue is hardly discovered
since paging disable is short-lived and the sptes will be invalid after
memslot changed for 150 times
 
 Fix it by filtering the mmio page fault out in page_fault_can_be_fast
 
 Reported-by: Markus Trippelsdorf mar...@trippelsdorf.de
 Tested-by: Markus Trippelsdorf mar...@trippelsdorf.de
 Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
Reviewed-by: Gleb Natapov g...@redhat.com


 ---
  arch/x86/kvm/mmu.c | 7 +++
  1 file changed, 7 insertions(+)
 
 diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
 index bf7af1e..3a9493a 100644
 --- a/arch/x86/kvm/mmu.c
 +++ b/arch/x86/kvm/mmu.c
 @@ -2811,6 +2811,13 @@ exit:
  static bool page_fault_can_be_fast(struct kvm_vcpu *vcpu, u32 error_code)
  {
   /*
 +  * Do not fix the mmio spte with invalid generation number which
 +  * need to be updated by slow page fault path.
 +  */
 + if (unlikely(error_code & PFERR_RSVD_MASK))
 + return false;
 +
 + /*
* #PF can be fast only if the shadow page table is present and it
* is caused by write-protect, that means we just need change the
* W bit of the spte which can be done out of mmu-lock.
 -- 
 1.8.1.4
 
 
 

--
Gleb.


Re: KVM internal error. Suberror: 1, emulation failure

2013-07-18 Thread Gleb Natapov
On Thu, Jul 18, 2013 at 07:58:31AM +0200, Paolo Bonzini wrote:
 Il 17/07/2013 18:16, Dave Hansen ha scritto:
  I'm causing qemu to spew these emulation failure messages until I kill
  it.  The guest kernel being run has been hacked up pretty heavily and is
  probably either accessing bad physical addresses (above the address
  ranges in the e820 table) or trying to DMA to bad addresses.
  
  What I'd really like qemu to be doing is trapping back in to the guest
  kernel to have it handle this issue.  Then I'd have a better chance of
  dumping out some debugging information to see where I went wrong.
 
 This is happening because the kernel is executing a PCMPEQB instruction
 on an invalid memory address.  This instruction is not yet emulated by
 KVM.  If you want QEMU to trap back to the guest kernel, you can add
 emulation of the instruction to arch/x86/kvm/emulate.c.
 
 If you do not really care about the guest doing something sane, you can
 use a stub emulation function that is just return emulate_ud(ctxt).
 That alone could be a good starting point to attach a kernel debugger to
 the guest.
 
This is the behaviour that he currently gets (assuming there is no bug
somewhere; run ftrace to check), see my other reply. Not sure what he does
in his #UD handler that makes the emulation errors reappear. Restart the
offending process?

--
Gleb.


Re: windows 2008 falling asleep under KVM

2013-07-18 Thread Nikola Ciprich
Hi,

just to let you know, with realtek instead of virtio-net the guest seems
to have stayed reachable for almost three days. I'd give it another day; then
we can be pretty sure it's virtio-related...

BR

nik

-- 
-
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:+420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-




Re: windows 2008 falling asleep under KVM

2013-07-18 Thread Gleb Natapov
On Thu, Jul 18, 2013 at 08:55:25AM +0200, Nikola Ciprich wrote:
 Hi,
 
 just to let You know, with realtek instead of virtio-net, it seems
 to be reachable for almost three days.. I'd give it another day, then we
 can be pretty sure it's virtio related...
 
Thanks. Interesting. Copying virtio-net expert.

--
Gleb.


RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 11:56 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then it's normal
  DDR and the mapping sets M bit (coherent, cacheable) else this is
  treated as I/O and we set  I + G  (cache inhibited, guarded)
 
  This helps setting proper TLB mapping for direct assigned device
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
arch/powerpc/kvm/e500_mmu_host.c |   17 -
1 files changed, 12 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..089c227 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
  return mas3;
}
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
{
  +   u32 mas2_attr;
  +
  +   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
  +
  +   if (!pfn_valid(pfn)) {
 
 Why not directly use kvm_is_mmio_pfn()?

What I understand from this function (someone can correct me) is that it 
returns false when the page is managed by the kernel and is not marked 
RESERVED (for some reason). For us it does not matter whether the page is 
reserved or not; if it is a kernel-visible page then it is DDR.

-Bharat

 
 Tiejun
 
  +   mas2_attr |= MAS2_I | MAS2_G;
  +   } else {
#ifdef CONFIG_SMP
  -   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
  -#else
  -   return mas2 & MAS2_ATTRIB_MASK;
  +   mas2_attr |= MAS2_M;
#endif
  +   }
  +   return mas2_attr;
}
 
/*
  @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
  /* Force IPROT=0 for all guest mappings. */
  stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
  stlbe->mas2 = (gvaddr & MAS2_EPN) |
  - e500_shadow_mas2_attrib(gtlbe->mas2, pr);
  + e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
  stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
  e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
 
 


Re: [RFC PATCH] kvm-unit-tests : Basic architecture of VMX nested test case

2013-07-18 Thread Gleb Natapov
On Thu, Jul 18, 2013 at 07:52:21AM +0200, Paolo Bonzini wrote:
 Il 17/07/2013 20:54, Arthur Chunqi Li ha scritto:
  +   ".globl entry_sysenter\n\t"
  +   "entry_sysenter:\n\t"
  +   SAVE_GPR
  +   "and $0xf, %rax\n\t"
  +   "push %rax\n\t"
 
 push should be wrong here, the first argument is in %rdi.
 
  +   "call syscall_handler\n\t"
  +   LOAD_GPR
  +   "vmresume\n\t"
  +);
  +
  +int exit_handler()
 
 This (and syscall_handler too) needs __attribute__((__used__)) because
 it's only used from assembly.
 
That was my question actually: why is it used from assembly? Calling it
from C should not be a problem.

 Please add static to all functions except main, it triggers better
 compiler optimization and warnings and it will avoid nasty surprises in
 the future if the compiler becomes smarter.
 
  +{
  +   int ret;
  +
  +   current-exits++;
  +   current-guest_regs = regs;
  +   if (is_hypercall())
  +   ret = handle_hypercall();
  +   else
  +   ret = current-exit_handler();
  +   regs = current-guest_regs;
  +   switch (ret) {
  +   case VMX_TEST_VMEXIT:
  +   case VMX_TEST_RESUME:
  +   return ret;
  +   case VMX_TEST_EXIT:
  +   break;
  +   default:
  +   printf("ERROR : Invalid exit_handler return val %d.\n"
  +   , ret);
  +   }
 
 Here have to propagate the exit codes through multiple levels, because
 we're not using setjmp/longjmp.  Not a big deal.  My worries with this
 version are below.
 
  +   print_vmexit_info();
  +   exit(-1);
  +   return 0;
  +}
  +
  +int vmx_run()
  +{
  +   bool ret;
  +   u32 eax;
  +   u64 rsp;
  +
  +   asm volatile("mov %%rsp, %0\n\t" : "=r"(rsp));
  +   vmcs_write(HOST_RSP, rsp);
  +   asm volatile("vmlaunch; seta %0\n\t" : "=m"(ret));
 
 setbe (this path is probably untested because it never happens in practice).
 
At least one of the tests should set up wrong vmcs state to verify that
nested vmx handles it correctly. In fact, one of the patches that Arthur
has sent for nested vmx fixes exactly that code path.

 Also, missing memory clobber.
 
  +   if (ret) {
  +   printf("%s : vmlaunch failed.\n", __func__);
  +   return 1;
  +   }
  +   while (1) {
  +   asm volatile(
  +   LOAD_GPR_C
  +   "vmresume; seta %0\n\t"
  +   : "=m"(ret));
 
 setbe and missing memory clobber.
 
  +   if (ret) {
  +   printf("%s : vmresume failed.\n", __func__);
  +   return 1;
  +   }
  +   asm volatile(
  +   ".global vmx_return\n\t"
 
 .global should not be needed.
 
  +   "vmx_return:\n\t"
  +   SAVE_GPR_C
  +   "call exit_handler\n\t"
  +   "mov %%eax, %0\n\t"
  +   : "=r"(eax)
  +   );
 
 I had written a long explanation here about why I don't trust the
 compiler to do the right thing, and ideas about how to fix that.  But in
 the end the only workable solution is a single assembly blob like vmx.c
 in KVM to do vmlaunch/vmresume, so I'll get right to the point:
 
*  the similarity with KVM code and as little assembly as  *
*  possible goals are mutually exclusive *
 
I never said that the code should be similar to KVM code or have as little
assembly as possible :) Reread the thread. The only thing I asked for
is to make the code flow linear; if that makes the code look more like KVM
or reduces the amount of assembly, that is just a bonus.

 and for a testsuite I'd prefer the latter---which means I'd still favor
 setjmp/longjmp.
 
 Now, here is the long explanation.
 
 I must admit that the code looks nice.  There are some nits I'd like to
 see done differently (such as putting vmx_return at the beginning of the
 while (1), and the vmresume asm at the end), but it is indeed nice.
 
Why do you prefer setjmp/longjmp then?

 However, it is also very deceiving, because the processor is not doing
 what the code says.  If I see:
 
vmlaunch();
exit(-1);
 
 it is clear that magic happens in vmlaunch().  If I see
 
asm volatile("vmlaunch; setbe %0\n\t" : "=m"(ret));
if (ret) {
...
}
asm("vmx_return:")
 
 it is absolutely not clear that the setbe and if are skipped on a
 successful vmlaunch.  So, at the very least you need a comment before
 those if (ret) to document the control flow, or you can use a jmp like
 this:
 
asm volatile goto ("vmlaunch; jmp %0\n\t"
   : : : "memory" : vmlaunch_error);
if (0) {
vmlaunch_error: ...
}
 
 The unconditional jump and asm goto make it clear that magic happens.
Agree, I dislike this magic too, but this is fixed by you suggestion
above about putting vmx_return at the beginning of while(). So the logic
will looks like that:

asm volatile("vmlaunch; setbe %0\n\t" : "=m"(ret));
while(ret) {
   vmx_return:
   SAVE_GPR_C
   eax = exit_handler();
   switch(eax) {
   }
  

Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 11:56 AM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-
B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
managed pages

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's normal
DDR and the mapping sets M bit (coherent, cacheable) else this is
treated as I/O and we set  I + G  (cache inhibited, guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
   arch/powerpc/kvm/e500_mmu_host.c |   17 -
   1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c
b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int

usermode)

return mas3;
   }

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
   {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?


What I understand from this function (someone can correct me) is that it returns 
false when the page is managed by the kernel and is not marked RESERVED (for 
some reason). For us it does not matter whether the page is reserved or not; if it is 
a kernel-visible page then it is DDR.



I think you intend to set I|G for all mmio pages, right? If so,

KVM: direct mmio pfn check

Userspace may specify memory slots that are backed by mmio pages rather than
normal RAM.  In some cases it is not enough to identify these mmio pages
by pfn_valid().  This patch adds checking the PageReserved as well.

Tiejun


-Bharat



Tiejun


+   mas2_attr |= MAS2_I | MAS2_G;
+   } else {
   #ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   mas2_attr |= MAS2_M;
   #endif
+   }
+   return mas2_attr;
   }

   /*
@@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+ e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);










RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of “tiejun.chen”
 Sent: Thursday, July 18, 2013 1:01 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 11:56 AM
  To: Bhushan Bharat-R65777
  Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
  Scott- B07421; Bhushan Bharat-R65777
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then it's normal
  DDR and the mapping sets M bit (coherent, cacheable) else this is
  treated as I/O and we set  I + G  (cache inhibited, guarded)
 
  This helps setting proper TLB mapping for direct assigned device
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..089c227 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32
  mas3, int
  usermode)
return mas3;
 }
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
  + u32 mas2_attr;
  +
  + mas2_attr = mas2 & MAS2_ATTRIB_MASK;
  +
  + if (!pfn_valid(pfn)) {
 
  Why not directly use kvm_is_mmio_pfn()?
 
  What I understand from this function (someone can correct me) is that it
 returns false when the page is managed by kernel and is not marked as 
 RESERVED
 (for some reason). For us it does not matter whether the page is reserved or
 not, if it is kernel visible page then it is DDR.
 
 
 I think you are setting I|G by addressing all mmio pages, right? If so,
 
  KVM: direct mmio pfn check
 
  Userspace may specify memory slots that are backed by mmio pages rather
 than
  normal RAM.  In some cases it is not enough to identify these mmio pages
  by pfn_valid().  This patch adds checking the PageReserved as well.

Do you know what those some cases are, and how checking PageReserved helps in 
them?

-Bharat

 
 Tiejun
 
  -Bharat
 
 
  Tiejun
 
  + mas2_attr |= MAS2_I | MAS2_G;
  + } else {
 #ifdef CONFIG_SMP
   - return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
   -#else
   - return mas2 & MAS2_ATTRIB_MASK;
  + mas2_attr |= MAS2_M;
 #endif
  + }
  + return mas2_attr;
 }
 
 /*
  @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
	stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | 
  MAS1_VALID;
	stlbe->mas2 = (gvaddr & MAS2_EPN) |
  -   e500_shadow_mas2_attrib(gtlbe->mas2, pr);
  +   e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
	stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
	e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
 
 
 
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body
 of a message to majord...@vger.kernel.org More majordomo info at
 http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
Behalf Of “tiejun.chen”
Sent: Thursday, July 18, 2013 1:01 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
managed pages

On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 11:56 AM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
Scott- B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's normal
DDR and the mapping sets M bit (coherent, cacheable) else this is
treated as I/O and we set  I + G  (cache inhibited, guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
arch/powerpc/kvm/e500_mmu_host.c |   17 -
1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c
b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32
mas3, int

usermode)

return mas3;
}

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
{
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?


What I understand from this function (someone can correct me) is that it

returns false when the page is managed by kernel and is not marked as RESERVED
(for some reason). For us it does not matter whether the page is reserved or
not, if it is kernel visible page then it is DDR.




I think you are setting I|G by addressing all mmio pages, right? If so,

  KVM: direct mmio pfn check

  Userspace may specify memory slots that are backed by mmio pages rather
than
  normal RAM.  In some cases it is not enough to identify these mmio pages
  by pfn_valid().  This patch adds checking the PageReserved as well.


Do you know what are those some cases and how checking PageReserved helps in 
those cases?


No, I didn't see these actual cases in qemu myself either. But this check should 
remain in place for the long term, as I understand it ;-)


Tiejun



-Bharat



Tiejun


-Bharat



Tiejun


+   mas2_attr |= MAS2_I | MAS2_G;
+   } else {
#ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   mas2_attr |= MAS2_M;
#endif
+   }
+   return mas2_attr;
}

/*
@@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+ e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);














RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 1:52 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
  Sent: Thursday, July 18, 2013 1:01 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 11:56 AM
  To: Bhushan Bharat-R65777
  Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
  Wood
  Scott- B07421; Bhushan Bharat-R65777
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then it's
  normal DDR and the mapping sets M bit (coherent, cacheable) else
  this is treated as I/O and we set  I + G  (cache inhibited,
  guarded)
 
  This helps setting proper TLB mapping for direct assigned device
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/kvm/e500_mmu_host.c |   17 -
  1 files changed, 12 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..089c227 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32
  mas3, int
  usermode)
  return mas3;
  }
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
  {
  +   u32 mas2_attr;
  +
   +   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
  +
  +   if (!pfn_valid(pfn)) {
 
  Why not directly use kvm_is_mmio_pfn()?
 
  What I understand from this function (someone can correct me) is
  that it
  returns false when the page is managed by kernel and is not marked
  as RESERVED (for some reason). For us it does not matter whether the
  page is reserved or not, if it is kernel visible page then it is DDR.
 
 
  I think you are setting I|G by addressing all mmio pages, right? If
  so,
 
KVM: direct mmio pfn check
 
Userspace may specify memory slots that are backed by mmio
  pages rather than
normal RAM.  In some cases it is not enough to identify these mmio
 pages
by pfn_valid().  This patch adds checking the PageReserved as well.
 
  Do you know what are those some cases and how checking PageReserved helps 
  in
 those cases?
 
 No, myself didn't see these actual cases in qemu,too. But this should be
 chronically persistent as I understand ;-)

Then I will wait till someone educate me :)

-Bharat

 
 Tiejun
 
 
  -Bharat
 
 
  Tiejun
 
  -Bharat
 
 
  Tiejun
 
  +   mas2_attr |= MAS2_I | MAS2_G;
  +   } else {
  #ifdef CONFIG_SMP
   -   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
   -#else
   -   return mas2 & MAS2_ATTRIB_MASK;
  +   mas2_attr |= MAS2_M;
  #endif
  +   }
  +   return mas2_attr;
  }
 
  /*
  @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
  /* Force IPROT=0 for all guest mappings. */
  stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | 
  MAS1_VALID;
  stlbe->mas2 = (gvaddr & MAS2_EPN) |
  - e500_shadow_mas2_attrib(gtlbe->mas2, pr);
  + e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
  stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
  e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
 
 
 
 
 
 



Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's
normal DDR and the mapping sets M bit (coherent, cacheable)
else this is treated as I/O and we set  I + G  (cache inhibited, guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
  arch/powerpc/kvm/e500_mmu_host.c |   17 -
  1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
usermode)
return mas3;
  }

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
  {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {
+   mas2_attr |= MAS2_I | MAS2_G;
+   } else {
  #ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   mas2_attr |= MAS2_M;
  #endif
+   }


Additionally, in the UP case this little chunk of code is equivalent to

if (1) {
mas2_attr |= MAS2_I | MAS2_G;
} else {
}

So you'd better wrap MAS2_M in advance, like:

#ifdef CONFIG_SMP
#define M_IF_SMP	MAS2_M
#else
#define M_IF_SMP	0
#endif

Then
if (1)
mas2_attr |= MAS2_I | MAS2_G;
else
mas2_attr |= M_IF_SMP;

Tiejun


+   return mas2_attr;
  }

  /*
@@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
	stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
	stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+ e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
	stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
	e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);






RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Thursday, July 18, 2013 1:53 PM
 To: '“tiejun.chen”'
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421
 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 1:52 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
  
  
   -Original Message-
   From: kvm-ppc-ow...@vger.kernel.org
   [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
   Sent: Thursday, July 18, 2013 1:01 PM
   To: Bhushan Bharat-R65777
   Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
   Wood
   Scott-
   B07421
   Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
   kernel managed pages
  
   On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
  
  
   -Original Message-
   From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
   Sent: Thursday, July 18, 2013 11:56 AM
   To: Bhushan Bharat-R65777
   Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
   Wood
   Scott- B07421; Bhushan Bharat-R65777
   Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
   for kernel managed pages
  
   On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
   If there is a struct page for the requested mapping then it's
   normal DDR and the mapping sets M bit (coherent, cacheable)
   else this is treated as I/O and we set  I + G  (cache
   inhibited,
   guarded)
  
   This helps setting proper TLB mapping for direct assigned device
  
   Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
   ---
   arch/powerpc/kvm/e500_mmu_host.c |   17 -
   1 files changed, 12 insertions(+), 5 deletions(-)
  
   diff --git a/arch/powerpc/kvm/e500_mmu_host.c
   b/arch/powerpc/kvm/e500_mmu_host.c
   index 1c6a9d7..089c227 100644
   --- a/arch/powerpc/kvm/e500_mmu_host.c
   +++ b/arch/powerpc/kvm/e500_mmu_host.c
   @@ -64,13 +64,20 @@ static inline u32
   e500_shadow_mas3_attrib(u32 mas3, int
   usermode)
 return mas3;
   }
  
   -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
   usermode)
   +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
   {
   + u32 mas2_attr;
   +
    + mas2_attr = mas2 & MAS2_ATTRIB_MASK;
   +
   + if (!pfn_valid(pfn)) {
  
   Why not directly use kvm_is_mmio_pfn()?
  
   What I understand from this function (someone can correct me) is
   that it
   returns false when the page is managed by kernel and is not
   marked as RESERVED (for some reason). For us it does not matter
   whether the page is reserved or not, if it is kernel visible page then it
 is DDR.
  
  
   I think you are setting I|G by addressing all mmio pages, right? If
   so,
  
 KVM: direct mmio pfn check
  
 Userspace may specify memory slots that are backed by mmio
   pages rather than
 normal RAM.  In some cases it is not enough to identify these
   mmio
  pages
 by pfn_valid().  This patch adds checking the PageReserved as well.
  
   Do you know what are those some cases and how checking
   PageReserved helps in
  those cases?
 
  No, myself didn't see these actual cases in qemu,too. But this should
  be chronically persistent as I understand ;-)
 
 Then I will wait till someone educate me :)

The reason is that the kvm_is_mmio_pfn() function looks pretty heavy, and I do 
not want to call it for every tlbwe operation unless it is necessary.

-Bharat

   + mas2_attr |= MAS2_I | MAS2_G;
   + } else {
   #ifdef CONFIG_SMP
    - return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
    -#else
    - return mas2 & MAS2_ATTRIB_MASK;
   + mas2_attr |= MAS2_M;
   #endif
   + }
   + return mas2_attr;
   }
  
   /*
   @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
 /* Force IPROT=0 for all guest mappings. */
  stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | 
    MAS1_VALID;
  stlbe->mas2 = (gvaddr & MAS2_EPN) |
    -   e500_shadow_mas2_attrib(gtlbe->mas2, pr);
    +   e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
  stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
  e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
  
  
  
  
  
  
 


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: Bhushan Bharat-R65777
Sent: Thursday, July 18, 2013 1:53 PM
To: '“tiejun.chen”'
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-
B07421
Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
managed pages




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 1:52 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: kvm-ppc-ow...@vger.kernel.org
[mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
Sent: Thursday, July 18, 2013 1:01 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 11:56 AM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
Wood
Scott- B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
for kernel managed pages

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's
normal DDR and the mapping sets M bit (coherent, cacheable)
else this is treated as I/O and we set  I + G  (cache
inhibited,
guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c
b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32
e500_shadow_mas3_attrib(u32 mas3, int

usermode)

return mas3;
 }

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?


What I understand from this function (someone can correct me) is
that it

returns false when the page is managed by kernel and is not
marked as RESERVED (for some reason). For us it does not matter
whether the page is reserved or not, if it is kernel visible page then it

is DDR.




I think you are setting I|G by addressing all mmio pages, right? If
so,

   KVM: direct mmio pfn check

   Userspace may specify memory slots that are backed by mmio
pages rather than
   normal RAM.  In some cases it is not enough to identify these
mmio

pages

   by pfn_valid().  This patch adds checking the PageReserved as well.


Do you know what are those some cases and how checking
PageReserved helps in

those cases?

No, myself didn't see these actual cases in qemu,too. But this should
be chronically persistent as I understand ;-)


Then I will wait till someone educate me :)


The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want 
to call this for all tlbwe operation unless it is necessary.


Furthermore, how do we distinguish that we're creating a TLB entry for a device 
assigned directly to the guest (GS)?


I think it's unnecessary to always check whether a pfn is MMIO, since most 
devices are not directly assigned.


So maybe we can introduce another helper to fix up that TLB entry instead of 
doing it in this path.


Tiejun



-Bharat


+   mas2_attr |= MAS2_I | MAS2_G;
+   } else {
 #ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   mas2_attr |= MAS2_M;
 #endif
+   }
+   return mas2_attr;
 }

 /*
@@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+ e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);

















Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 10:55, “tiejun.chen” wrote:

 On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Thursday, July 18, 2013 1:53 PM
 To: '“tiejun.chen”'
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421
 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 1:52 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org
 [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
 Sent: Thursday, July 18, 2013 1:01 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
 Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 11:56 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
 Wood
 Scott- B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
 for kernel managed pages
 
 On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
 If there is a struct page for the requested mapping then it's
 normal DDR and the mapping sets M bit (coherent, cacheable)
 else this is treated as I/O and we set  I + G  (cache
 inhibited,
 guarded)
 
 This helps setting proper TLB mapping for direct assigned device
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..089c227 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,20 @@ static inline u32
 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
   return mas3;
 }
 
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
 usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
 + u32 mas2_attr;
 +
 + mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 + if (!pfn_valid(pfn)) {
 
 Why not directly use kvm_is_mmio_pfn()?
 
 What I understand from this function (someone can correct me) is
 that it
 returns false when the page is managed by kernel and is not
 marked as RESERVED (for some reason). For us it does not matter
 whether the page is reserved or not, if it is kernel visible page then it
 is DDR.
 
 
 I think you are setting I|G by addressing all mmio pages, right? If
 so,
 
   KVM: direct mmio pfn check
 
   Userspace may specify memory slots that are backed by mmio
 pages rather than
   normal RAM.  In some cases it is not enough to identify these
 mmio
 pages
   by pfn_valid().  This patch adds checking the PageReserved as well.
 
 Do you know what are those some cases and how checking
 PageReserved helps in
 those cases?
 
 No, myself didn't see these actual cases in qemu,too. But this should
 be chronically persistent as I understand ;-)
 
 Then I will wait till someone educate me :)
 
 The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not 
 want to call this for all tlbwe operation unless it is necessary.
 
 Furthermore, how to distinguish we're creating TLB entry for the device 
 assigned directly to the GS?

Because other devices wouldn't be available to the guest through memory slots.

 I think its unnecessary to always check if that is mmio's pfn since we have 
 more non direct assigned devices.

I'm not sure I understand. The shadow TLB code only knows a host virtual 
address here. It needs to figure out whether the host physical address behind 
that is RAM (which can be accessed with caching enabled) or not (where caching 
has to be disabled).

 So maybe we can introduce another helper to fixup that TLB entry in instead 
 of this path.

This path does fix up the shadow (host) TLB entry :).


Alex



Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:

 
 
 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Thursday, July 18, 2013 1:53 PM
 To: '“tiejun.chen”'
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421
 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 1:52 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org
 [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
 Sent: Thursday, July 18, 2013 1:01 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
 Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 11:56 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
 Wood
 Scott- B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
 for kernel managed pages
 
 On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
 If there is a struct page for the requested mapping then it's
 normal DDR and the mapping sets M bit (coherent, cacheable)
 else this is treated as I/O and we set  I + G  (cache
 inhibited,
 guarded)
 
 This helps setting proper TLB mapping for direct assigned device
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
arch/powerpc/kvm/e500_mmu_host.c |   17 -
1 files changed, 12 insertions(+), 5 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..089c227 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,20 @@ static inline u32
 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
return mas3;
}
 
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
 usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
{
 +  u32 mas2_attr;
 +
 +  mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 +  if (!pfn_valid(pfn)) {
 
 Why not directly use kvm_is_mmio_pfn()?
 
 What I understand from this function (someone can correct me) is
 that it
 returns false when the page is managed by kernel and is not
 marked as RESERVED (for some reason). For us it does not matter
 whether the page is reserved or not, if it is kernel visible page then it
 is DDR.
 
 
 I think you are setting I|G by addressing all mmio pages, right? If
 so,
 
  KVM: direct mmio pfn check
 
  Userspace may specify memory slots that are backed by mmio
 pages rather than
  normal RAM.  In some cases it is not enough to identify these
 mmio
 pages
  by pfn_valid().  This patch adds checking the PageReserved as well.
 
 Do you know what are those some cases and how checking
 PageReserved helps in
 those cases?
 
 No, myself didn't see these actual cases in qemu,too. But this should
 be chronically persistent as I understand ;-)
 
 Then I will wait till someone educate me :)
 
 The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not 
 want to call this for all tlbwe operation unless it is necessary.

It certainly does more than we need and potentially slows down the fast path 
(RAM mapping). The only thing it does on top of if (pfn_valid()) is to check 
for pages that are declared reserved on the host. This happens in 2 cases:

  1) Non cache coherent DMA
  2) Memory hot remove

The non-coherent DMA case would be interesting: with the mechanism as it is in 
place in Linux today, we could potentially break normal guest operation if we 
don't take it into account. However, it's Kconfig-guarded by:

depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
default n if PPC_47x
default y

so we never hit it with any core we care about ;).

Memory hot remove does not exist on e500 FWIW, so we don't have to worry about 
that one either.

Which means I think it's fine to slim this whole thing down to only check for 
pfn_valid(), as the code does in this patch. It would however be very useful to 
have a comment there that explains why it's safe to do so.



Alex



RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Thursday, July 18, 2013 3:19 PM
 To: Bhushan Bharat-R65777
 Cc: “tiejun.chen”; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Bhushan Bharat-R65777
  Sent: Thursday, July 18, 2013 1:53 PM
  To: '“tiejun.chen”'
  Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
  Scott-
  B07421
  Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 1:52 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
  Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
  Sent: Thursday, July 18, 2013 1:01 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
  Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
  for kernel managed pages
 
  On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 11:56 AM
  To: Bhushan Bharat-R65777
  Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
  Wood
  Scott- B07421; Bhushan Bharat-R65777
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
  for kernel managed pages
 
  On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then it's
  normal DDR and the mapping sets M bit (coherent, cacheable)
  else this is treated as I/O and we set  I + G  (cache
  inhibited,
  guarded)
 
  This helps setting proper TLB mapping for direct assigned
  device
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..089c227 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,20 @@ static inline u32
  e500_shadow_mas3_attrib(u32 mas3, int
  usermode)
   return mas3;
 }
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
  usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
  +u32 mas2_attr;
  +
  +mas2_attr = mas2 & MAS2_ATTRIB_MASK;
  +
  +if (!pfn_valid(pfn)) {
 
  Why not directly use kvm_is_mmio_pfn()?
 
  What I understand from this function (someone can correct me) is
  that it
  returns false when the page is managed by kernel and is not
  marked as RESERVED (for some reason). For us it does not matter
  whether the page is reserved or not, if it is kernel visible page
  then it
  is DDR.
 
 
  I think you are setting I|G by addressing all mmio pages, right?
  If so,
 
   KVM: direct mmio pfn check
 
   Userspace may specify memory slots that are backed by mmio
  pages rather than
   normal RAM.  In some cases it is not enough to identify these
  mmio
  pages
   by pfn_valid().  This patch adds checking the PageReserved as well.
 
  Do you know what are those some cases and how checking
  PageReserved helps in
  those cases?
 
   No, I didn't see these actual cases in qemu myself either. But this
   should be chronically persistent as I understand ;-)
 
  Then I will wait till someone educate me :)
 
   The reason is, the kvm_is_mmio_pfn() function looks pretty heavy and I do not
  want to call it for every tlbwe operation unless it is necessary.
 
 It certainly does more than we need and potentially slows down the fast path
 (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check
 for pages that are declared reserved on the host. This happens in 2 cases:
 
   1) Non cache coherent DMA
   2) Memory hot remove
 
 The non coherent DMA case would be interesting, as with the mechanism as it is
 in place in Linux today, we could potentially break normal guest operation if 
 we
 don't take it into account. However, it's Kconfig guarded by:
 
 depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
 default n if PPC_47x
 default y
 
 so we never hit it with any core we care about ;).
 
 Memory hot remove does not exist on e500 FWIW, so we don't have to worry about
 that one either.
 
 Which means I think it's fine to slim 

Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 05:44 PM, Alexander Graf wrote:


On 18.07.2013, at 10:55, “tiejun.chen” wrote:


On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: Bhushan Bharat-R65777
Sent: Thursday, July 18, 2013 1:53 PM
To: '“tiejun.chen”'
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-
B07421
Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
managed pages




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 1:52 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: kvm-ppc-ow...@vger.kernel.org
[mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
Sent: Thursday, July 18, 2013 1:01 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 11:56 AM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
Wood
Scott- B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
for kernel managed pages

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's
normal DDR and the mapping sets M bit (coherent, cacheable)
else this is treated as I/O and we set  I + G  (cache
inhibited,
guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c
b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32
e500_shadow_mas3_attrib(u32 mas3, int

usermode)

return mas3;
 }

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?


What I understand from this function (someone can correct me) is
that it

returns false when the page is managed by kernel and is not
marked as RESERVED (for some reason). For us it does not matter
whether the page is reserved or not, if it is kernel visible page then it

is DDR.




I think you are setting I|G by addressing all mmio pages, right? If
so,

   KVM: direct mmio pfn check

   Userspace may specify memory slots that are backed by mmio
pages rather than
   normal RAM.  In some cases it is not enough to identify these
mmio

pages

   by pfn_valid().  This patch adds checking the PageReserved as well.


Do you know what are those some cases and how checking
PageReserved helps in

those cases?

No, myself didn't see these actual cases in qemu,too. But this should
be chronically persistent as I understand ;-)


Then I will wait till someone educate me :)


The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want 
to call this for all tlbwe operation unless it is necessary.


Furthermore, how do we distinguish that we're creating a TLB entry for the 
device assigned directly to the GS?


Because other devices wouldn't be available to the guest through memory slots.


Yes.




I think it's unnecessary to always check whether that is an mmio pfn since we 
have more non-direct-assigned devices.


I'm not sure I understand. The shadow TLB code only knows that there is a host 
virtual address. It needs to figure out whether the host physical address behind 
that is RAM (can access with cache enabled) or not (has to disable cache).



Sorry, looks I'm misleading you :-P


So maybe we can introduce another helper to fix up that TLB entry instead of 
going through this path.


This path does fix up the shadow (host) TLB entry :).



I just mean whether we can have a separate path dedicated to those direct 
assigned devices, not go this common path :)


Tiejun


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 11:56, “tiejun.chen” wrote:

 On 07/18/2013 05:44 PM, Alexander Graf wrote:
 
 On 18.07.2013, at 10:55, “tiejun.chen” wrote:
 
 On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Thursday, July 18, 2013 1:53 PM
 To: '“tiejun.chen”'
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood 
 Scott-
 B07421
 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 1:52 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org
 [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
 Sent: Thursday, July 18, 2013 1:01 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
 Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 11:56 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
 Wood
 Scott- B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
 for kernel managed pages
 
 On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
 If there is a struct page for the requested mapping then it's
 normal DDR and the mapping sets M bit (coherent, cacheable)
 else this is treated as I/O and we set  I + G  (cache
 inhibited,
 guarded)
 
 This helps setting proper TLB mapping for direct assigned device
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..089c227 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,20 @@ static inline u32
 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
 return mas3;
 }
 
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
 usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
 +   u32 mas2_attr;
 +
 +   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 +   if (!pfn_valid(pfn)) {
 
 Why not directly use kvm_is_mmio_pfn()?
 
 What I understand from this function (someone can correct me) is
 that it
 returns false when the page is managed by kernel and is not
 marked as RESERVED (for some reason). For us it does not matter
 whether the page is reserved or not, if it is kernel visible page then 
 it
 is DDR.
 
 
 I think you are setting I|G by addressing all mmio pages, right? If
 so,
 
   KVM: direct mmio pfn check
 
   Userspace may specify memory slots that are backed by mmio
 pages rather than
   normal RAM.  In some cases it is not enough to identify these
 mmio
 pages
   by pfn_valid().  This patch adds checking the PageReserved as 
 well.
 
 Do you know what are those some cases and how checking
 PageReserved helps in
 those cases?
 
 No, myself didn't see these actual cases in qemu,too. But this should
 be chronically persistent as I understand ;-)
 
 Then I will wait till someone educate me :)
 
 The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not 
 want to call this for all tlbwe operation unless it is necessary.
 
 Furthermore, how to distinguish we're creating TLB entry for the device 
 assigned directly to the GS?
 
 Because other devices wouldn't be available to the guest through memory 
 slots.
 
 Yes.
 
 
 I think its unnecessary to always check if that is mmio's pfn since we have 
 more non direct assigned devices.
 
 I'm not sure I understand. The shadow TLB code only knows here is a host 
 virtual address. It needs to figure out whether the host physical address 
 behind that is RAM (can access with cache enabled) or not (has to disable 
 cache)
 
 
 Sorry, looks I'm misleading you :-P
 
 So maybe we can introduce another helper to fixup that TLB entry in instead 
 of this path.
 
 This path does fix up the shadow (host) TLB entry :).
 
 
 I just mean whether we can have a separate path dedicated to those direct 
 assigned devices, not go this common path :)

I don't think it's possible to have a separate path without a certain level of 
trust. In the current flow we don't trust anyone. We just check every 
translated page whether we should enable caching or not.

We could take that information from 2 other sides though:

  1) Memory Slot
 

Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 05:48 PM, Alexander Graf wrote:


On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:





-Original Message-
From: Bhushan Bharat-R65777
Sent: Thursday, July 18, 2013 1:53 PM
To: '“tiejun.chen”'
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-
B07421
Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
managed pages




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 1:52 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: kvm-ppc-ow...@vger.kernel.org
[mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
Sent: Thursday, July 18, 2013 1:01 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 11:56 AM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
Wood
Scott- B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
for kernel managed pages

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's
normal DDR and the mapping sets M bit (coherent, cacheable)
else this is treated as I/O and we set  I + G  (cache
inhibited,
guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
arch/powerpc/kvm/e500_mmu_host.c |   17 -
1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c
b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32
e500_shadow_mas3_attrib(u32 mas3, int

usermode)

return mas3;
}

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
{
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?


What I understand from this function (someone can correct me) is
that it

returns false when the page is managed by kernel and is not
marked as RESERVED (for some reason). For us it does not matter
whether the page is reserved or not, if it is kernel visible page then it

is DDR.




I think you are setting I|G by addressing all mmio pages, right? If
so,

  KVM: direct mmio pfn check

  Userspace may specify memory slots that are backed by mmio
pages rather than
  normal RAM.  In some cases it is not enough to identify these
mmio

pages

  by pfn_valid().  This patch adds checking the PageReserved as well.


Do you know what are those some cases and how checking
PageReserved helps in

those cases?

No, myself didn't see these actual cases in qemu,too. But this should
be chronically persistent as I understand ;-)


Then I will wait till someone educate me :)


The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want 
to call this for all tlbwe operation unless it is necessary.


It certainly does more than we need and potentially slows down the fast path (RAM 
mapping). The only thing it does on top of if (pfn_valid()) is to check for 
pages that are declared reserved on the host. This happens in 2 cases:

   1) Non cache coherent DMA
   2) Memory hot remove

The non coherent DMA case would be interesting, as with the mechanism as it is 
in place in Linux today, we could potentially break normal guest operation if 
we don't take it into account. However, it's Kconfig guarded by:

 depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
 default n if PPC_47x
 default y

so we never hit it with any core we care about ;).

Memory hot remove does not exist on e500 FWIW, so we don't have to worry about 
that one either.


Thanks for this good information :)

So why not guard that code with CONFIG_MEMORY_HOTPLUG inside kvm_is_mmio_pfn() 
to make sure the check is only performed when it is really 
needed? This would avoid the unnecessary performance loss.


If I'm wrong please correct me :)

Tiejun


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 12:08, “tiejun.chen” wrote:

 On 07/18/2013 05:48 PM, Alexander Graf wrote:
 
 On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:
 
 
 
 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Thursday, July 18, 2013 1:53 PM
 To: '“tiejun.chen”'
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood 
 Scott-
 B07421
 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 1:52 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org
 [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
 Sent: Thursday, July 18, 2013 1:01 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
 Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 11:56 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
 Wood
 Scott- B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
 for kernel managed pages
 
 On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
 If there is a struct page for the requested mapping then it's
 normal DDR and the mapping sets M bit (coherent, cacheable)
 else this is treated as I/O and we set  I + G  (cache
 inhibited,
 guarded)
 
 This helps setting proper TLB mapping for direct assigned device
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
arch/powerpc/kvm/e500_mmu_host.c |   17 -
1 files changed, 12 insertions(+), 5 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..089c227 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,20 @@ static inline u32
 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
  return mas3;
}
 
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
 usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
{
 +u32 mas2_attr;
 +
 +mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 +if (!pfn_valid(pfn)) {
 
 Why not directly use kvm_is_mmio_pfn()?
 
 What I understand from this function (someone can correct me) is
 that it
 returns false when the page is managed by kernel and is not
 marked as RESERVED (for some reason). For us it does not matter
 whether the page is reserved or not, if it is kernel visible page then 
 it
 is DDR.
 
 
 I think you are setting I|G by addressing all mmio pages, right? If
 so,
 
  KVM: direct mmio pfn check
 
  Userspace may specify memory slots that are backed by mmio
 pages rather than
  normal RAM.  In some cases it is not enough to identify these
 mmio
 pages
  by pfn_valid().  This patch adds checking the PageReserved as well.
 
 Do you know what are those some cases and how checking
 PageReserved helps in
 those cases?
 
 No, myself didn't see these actual cases in qemu,too. But this should
 be chronically persistent as I understand ;-)
 
 Then I will wait till someone educate me :)
 
 The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not 
 want to call this for all tlbwe operation unless it is necessary.
 
 It certainly does more than we need and potentially slows down the fast path 
 (RAM mapping). The only thing it does on top of if (pfn_valid()) is to 
 check for pages that are declared reserved on the host. This happens in 2 
 cases:
 
   1) Non cache coherent DMA
   2) Memory hot remove
 
 The non coherent DMA case would be interesting, as with the mechanism as it 
 is in place in Linux today, we could potentially break normal guest 
 operation if we don't take it into account. However, it's Kconfig guarded by:
 
 depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
 default n if PPC_47x
 default y
 
 so we never hit it with any core we care about ;).
 
 Memory hot remove does not exist on e500 FWIW, so we don't have to worry 
 about that one either.
 
 Thanks for this good information :)
 
 So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside 
 kvm_is_mmio_pfn() to make sure that check is only valid when that is really 
 needed? This can decrease those unnecessary performance loss.
 
 If I'm wrong please correct me :)

You're perfectly right, but this is generic KVM code. So it gets run across all 
architectures. What if someone has the great idea 

Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 06:00 PM, Alexander Graf wrote:


On 18.07.2013, at 11:56, “tiejun.chen” wrote:


On 07/18/2013 05:44 PM, Alexander Graf wrote:


On 18.07.2013, at 10:55, “tiejun.chen” wrote:


On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: Bhushan Bharat-R65777
Sent: Thursday, July 18, 2013 1:53 PM
To: '“tiejun.chen”'
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-
B07421
Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
managed pages




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 1:52 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: kvm-ppc-ow...@vger.kernel.org
[mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
Sent: Thursday, July 18, 2013 1:01 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 11:56 AM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
Wood
Scott- B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
for kernel managed pages

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's
normal DDR and the mapping sets M bit (coherent, cacheable)
else this is treated as I/O and we set  I + G  (cache
inhibited,
guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c
b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32
e500_shadow_mas3_attrib(u32 mas3, int

usermode)

return mas3;
 }

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?


What I understand from this function (someone can correct me) is
that it

returns false when the page is managed by kernel and is not
marked as RESERVED (for some reason). For us it does not matter
whether the page is reserved or not, if it is kernel visible page then it

is DDR.




I think you are setting I|G by addressing all mmio pages, right? If
so,

   KVM: direct mmio pfn check

   Userspace may specify memory slots that are backed by mmio
pages rather than
   normal RAM.  In some cases it is not enough to identify these
mmio

pages

   by pfn_valid().  This patch adds checking the PageReserved as well.


Do you know what are those some cases and how checking
PageReserved helps in

those cases?

No, myself didn't see these actual cases in qemu,too. But this should
be chronically persistent as I understand ;-)


Then I will wait till someone educate me :)


The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want 
to call this for all tlbwe operation unless it is necessary.


Furthermore, how to distinguish we're creating TLB entry for the device 
assigned directly to the GS?


Because other devices wouldn't be available to the guest through memory slots.


Yes.




I think its unnecessary to always check if that is mmio's pfn since we have 
more non direct assigned devices.


I'm not sure I understand. The shadow TLB code only knows here is a host virtual 
address. It needs to figure out whether the host physical address behind that is 
RAM (can access with cache enabled) or not (has to disable cache)



Sorry, looks I'm misleading you :-P


So maybe we can introduce another helper to fixup that TLB entry in instead of 
this path.


This path does fix up the shadow (host) TLB entry :).



I just mean whether we can have a separate path dedicated to those direct 
assigned devices, not go this common path :)


I don't think it's possible to have a separate path without a certain level of 
trust. In the current flow we don't trust anyone. We just check every 
translated page whether we should enable caching or not.

We could take that information from 2 other side though:

   1) Memory Slot
   2) Guest TLB Flags

If we take it from the memory slot we 

Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 06:12 PM, Alexander Graf wrote:


On 18.07.2013, at 12:08, “tiejun.chen” wrote:


On 07/18/2013 05:48 PM, Alexander Graf wrote:


On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:





-Original Message-
From: Bhushan Bharat-R65777
Sent: Thursday, July 18, 2013 1:53 PM
To: '“tiejun.chen”'
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott-
B07421
Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
managed pages




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 1:52 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: kvm-ppc-ow...@vger.kernel.org
[mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
Sent: Thursday, July 18, 2013 1:01 PM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 11:56 AM
To: Bhushan Bharat-R65777
Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
Wood
Scott- B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
for kernel managed pages

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's
normal DDR and the mapping sets M bit (coherent, cacheable)
else this is treated as I/O and we set  I + G  (cache
inhibited,
guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
arch/powerpc/kvm/e500_mmu_host.c |   17 -
1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c
b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32
e500_shadow_mas3_attrib(u32 mas3, int

usermode)

return mas3;
}

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
{
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?


What I understand from this function (someone can correct me) is
that it

returns false when the page is managed by kernel and is not
marked as RESERVED (for some reason). For us it does not matter
whether the page is reserved or not, if it is kernel visible page then it

is DDR.




I think you are setting I|G by addressing all mmio pages, right? If
so,

  KVM: direct mmio pfn check

  Userspace may specify memory slots that are backed by mmio
pages rather than
  normal RAM.  In some cases it is not enough to identify these
mmio

pages

  by pfn_valid().  This patch adds checking the PageReserved as well.


Do you know what are those some cases and how checking
PageReserved helps in

those cases?

No, myself didn't see these actual cases in qemu,too. But this should
be chronically persistent as I understand ;-)


Then I will wait till someone educate me :)


The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want 
to call this for all tlbwe operation unless it is necessary.


It certainly does more than we need and potentially slows down the fast path (RAM 
mapping). The only thing it does on top of if (pfn_valid()) is to check for 
pages that are declared reserved on the host. This happens in 2 cases:

   1) Non cache coherent DMA
   2) Memory hot remove

The non coherent DMA case would be interesting, as with the mechanism as it is 
in place in Linux today, we could potentially break normal guest operation if 
we don't take it into account. However, it's Kconfig guarded by:

 depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
 default n if PPC_47x
 default y

so we never hit it with any core we care about ;).

Memory hot remove does not exist on e500 FWIW, so we don't have to worry about 
that one either.


Thanks for this good information :)

So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside 
kvm_is_mmio_pfn() to make sure that check is only valid when that is really 
needed? This can decrease those unnecessary performance loss.

If I'm wrong please correct me :)


You're perfectly right, but this is generic KVM code. So it gets run across all 
architectures. What if someone has the great idea to add a new case here for 
x86, but doesn't tell us? In 

Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 12:19, “tiejun.chen” wrote:

 On 07/18/2013 06:12 PM, Alexander Graf wrote:
 
 On 18.07.2013, at 12:08, “tiejun.chen” wrote:
 
 On 07/18/2013 05:48 PM, Alexander Graf wrote:
 
 On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:
 
 
 
 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Thursday, July 18, 2013 1:53 PM
  To: 'tiejun.chen'
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood 
 Scott-
 B07421
 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for 
 kernel
 managed pages
 
 
 
 -Original Message-
  From: tiejun.chen [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 1:52 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of tiejun.chen
 Sent: Thursday, July 18, 2013 1:01 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
 Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
  From: tiejun.chen [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 11:56 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
 Wood
 Scott- B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
 for kernel managed pages
 
 On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
 If there is a struct page for the requested mapping then it's
 normal DDR and the mapping sets M bit (coherent, cacheable)
 else this is treated as I/O and we set  I + G  (cache
 inhibited,
 guarded)
 
 This helps setting proper TLB mapping for direct assigned device
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
arch/powerpc/kvm/e500_mmu_host.c |   17 -
1 files changed, 12 insertions(+), 5 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..089c227 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,20 @@ static inline u32
 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
return mas3;
}
 
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
 usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
{
 +  u32 mas2_attr;
 +
  +  mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 +  if (!pfn_valid(pfn)) {
 
 Why not directly use kvm_is_mmio_pfn()?
 
 What I understand from this function (someone can correct me) is
 that it
 returns false when the page is managed by kernel and is not
 marked as RESERVED (for some reason). For us it does not matter
 whether the page is reserved or not, if it is kernel visible page 
 then it
 is DDR.
 
 
 I think you are setting I|G by addressing all mmio pages, right? If
 so,
 
  KVM: direct mmio pfn check
 
  Userspace may specify memory slots that are backed by mmio
 pages rather than
  normal RAM.  In some cases it is not enough to identify these
 mmio
 pages
  by pfn_valid().  This patch adds checking the PageReserved as 
 well.
 
 Do you know what are those some cases and how checking
 PageReserved helps in
 those cases?
 
 No, I didn't see these actual cases in qemu either. But this should
 be chronically persistent as I understand ;-)
 
 Then I will wait till someone educate me :)
 
 The reason is, the kvm_is_mmio_pfn() function looks pretty heavy and I do 
 not want to call it for every tlbwe operation unless it is necessary.
 
 It certainly does more than we need and potentially slows down the fast 
 path (RAM mapping). The only thing it does on top of if (pfn_valid()) is 
 to check for pages that are declared reserved on the host. This happens in 
 2 cases:
 
   1) Non cache coherent DMA
   2) Memory hot remove
 
 The non coherent DMA case would be interesting, as with the mechanism as 
 it is in place in Linux today, we could potentially break normal guest 
 operation if we don't take it into account. However, it's Kconfig guarded 
 by:
 
 depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
 default n if PPC_47x
 default y
 
 so we never hit it with any core we care about ;).
 
 Memory hot remove does not exist on e500 FWIW, so we don't have to worry 
 about that one either.
 
 Thanks for this good information :)
 
 So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside 
 kvm_is_mmio_pfn() to make sure that check is only valid when that is really 
 needed? This can decrease those unnecessary performance loss.
 
 If I'm wrong please correct me :)
 
 You're perfectly right, but this is generic KVM code. So it gets run across 
 all architectures. What if someone has the great idea to add a new case here 
 for x86, but doesn't tell us?

Re: [RFC PATCH] kvm-unit-tests : Basic architecture of VMX nested test case

2013-07-18 Thread Paolo Bonzini
On 18/07/2013 09:26, Gleb Natapov wrote:
  I had written a long explanation here about why I don't trust the
  compiler to do the right thing, and ideas about how to fix that.  But in
  the end the only workable solution is a single assembly blob like vmx.c
  in KVM to do vmlaunch/vmresume, so I'll get right to the point:
  
 *  the "similarity with KVM code" and "as little assembly as  *
 *  possible" goals are mutually exclusive                     *
 
 I never said that code should be similar to KVM code or have as little
 assembly as possible :) Reread the thread. The only thing I asked for
 is to make code flow linear, if it makes code looks more like KVM or
 reduce amount of assembly this is just a bonus.

Point taken.

  and for a testsuite I'd prefer the latter---which means I'd still favor
  setjmp/longjmp.
  
  Now, here is the long explanation.
  
  I must admit that the code looks nice.  There are some nits I'd like to
  see done differently (such as putting vmx_return at the beginning of the
  while (1), and the vmresume asm at the end), but it is indeed nice.

 Why do you prefer setjmp/longjmp then?

Because it is still deceiving, and I dislike the deceit more than I like
the linear code flow.

 Agree, I dislike this magic too, but this is fixed by you suggestion
 above about putting vmx_return at the beginning of while(). So the logic
 will looks like that:
 
  asm volatile("vmlaunch; setbe %0\n\t" : "=m"(ret));
 while(ret) {

while(!ret) if you use setbe, of course.

vmx_return:
SAVE_GPR_C
eax = exit_handler();
switch(eax) {
}
LOAD_GPR_C
 asm volatile("vmresume; seta %0\n\t" : "=m"(ret));
 }

It is still somewhat magic: the "while (ret)" is only there to please
the compiler, and you rely on the compiler not changing %rsp between the
vmlaunch and the vmx_return label.  Minor nit, you cannot anymore print
different error messages for vmlaunch vs. vmresume failure.

In the end the choice is between all in asm and small asm trampoline
(which also happens to use setjmp/longjmp, but perhaps Arthur can
propagate return codes without using setjmp/longjmp, too).

 Rewriting the whole guest entry exit path in asm like you suggest bellow
 will eliminate the magic too.

 I much prefer one big asm statement than many small asm statement
 scattered among two or three C lines.

It's not many asm statements, it's just a dozen lines:


.align  4, 0x90
entry_vmx:
SAVE_GPR
call default_exit_handler
/* Should not reach here*/
mov $1, %eax
call exit
.align  4, 0x90
entry_sysenter:
SAVE_GPR
and $0xf, %eax
mov %eax, %edi
call    default_syscall_handler
/* Arthur, is something missing here? :)) */

Paolo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] kvm-unit-tests : Basic architecture of VMX nested test case

2013-07-18 Thread Gleb Natapov
On Thu, Jul 18, 2013 at 12:47:46PM +0200, Paolo Bonzini wrote:
   and for a testsuite I'd prefer the latter---which means I'd still favor
   setjmp/longjmp.
   
   Now, here is the long explanation.
   
   I must admit that the code looks nice.  There are some nits I'd like to
   see done differently (such as putting vmx_return at the beginning of the
   while (1), and the vmresume asm at the end), but it is indeed nice.
 
  Why do you prefer setjmp/longjmp then?
 
 Because it is still deceiving, and I dislike the deceit more than I like
 the linear code flow.
 
What is deceiving about it? Of course for someone who has no idea how
vmx works the code will not be obvious, but why should we care. For
someone who knows, what is deceiving about returning into the same
function the guest was launched from by using the VMX mechanism instead
of longjmp()?

  Agree, I dislike this magic too, but this is fixed by you suggestion
  above about putting vmx_return at the beginning of while(). So the logic
  will looks like that:
  
   asm volatile("vmlaunch; setbe %0\n\t" : "=m"(ret));
  while(ret) {
 
 while(!ret) if you use setbe, of course.
 
Yeah, this not meant to be compilable code :)

 vmx_return:
 SAVE_GPR_C
 eax = exit_handler();
 switch(eax) {
 }
 LOAD_GPR_C
 asm volatile("vmresume; seta %0\n\t" : "=m"(ret));
  }
 
 It is still somewhat magic: the while (ret) is only there to please
No, it is there to catch vmlaunch/vmresume errors.

 the compiler, and you rely on the compiler not changing %rsp between the
 vmlaunch and the vmx_return label.  Minor nit, you cannot anymore print
HOST_RSP should be loaded on each guest entry.

 different error messages for vmlaunch vs. vmresume failure.
Just because the same variable is used to store error values :)
Make vmlaunch use err1 and vmresume err2 and do while (!err1 && !err2)

 
 In the end the choice is between all in asm and small asm trampoline
 (which also happens to use setjmp/longjmp, but perhaps Arthur can
 propagate return codes without using setjmp/longjmp, too).
 
  Rewriting the whole guest entry exit path in asm like you suggest bellow
  will eliminate the magic too.
 
  I much prefer one big asm statement than many small asm statement
  scattered among two or three C lines.
 
 It's not many asm statements, it's just a dozen lines:
 
I am not talking about overall file, but the how vmx_run() is written:
asm()
C code
asm()
C code
...

I much prefer:
C code

big asm() blob

C code.


--
Gleb.


Re: [RFC PATCH] kvm-unit-tests : Basic architecture of VMX nested test case

2013-07-18 Thread Paolo Bonzini
On 18/07/2013 13:06, Gleb Natapov wrote:
 On Thu, Jul 18, 2013 at 12:47:46PM +0200, Paolo Bonzini wrote:
 and for a testsuite I'd prefer the latter---which means I'd still favor
 setjmp/longjmp.

 Now, here is the long explanation.

 I must admit that the code looks nice.  There are some nits I'd like to
 see done differently (such as putting vmx_return at the beginning of the
 while (1), and the vmresume asm at the end), but it is indeed nice.

 Why do you prefer setjmp/longjmp then?

 Because it is still deceiving, and I dislike the deceit more than I like
 the linear code flow.

 What is deceiving about it? Of course for someone who has no idea how
 vmx works the code will not be obvious, but why should we care. For
 someone who knows, what is deceiving about returning into the same
 function the guest was launched from by using the VMX mechanism

The way the code is written is deceiving.  If I see

  asm("vmlaunch; seta %0")
  while (ret)

I expect HOST_RIP to point at the seta or somewhere near, not at a
completely different label somewhere else.

 instead of longjmp()?

Look again at the setjmp/longjmp version.  longjmp is not used to handle
vmexit.  It is used to jump back from the vmexit handler to main, which
is exactly what setjmp/longjmp is meant for.

 It is still somewhat magic: the "while (ret)" is only there to please
 the compiler

 No, it is there to catch vmlaunch/vmresume errors.

You could change it to "while (0)" and it would still work.  That's what
I mean by "only there to please the compiler".

 the compiler, and you rely on the compiler not changing %rsp between the
 vmlaunch and the vmx_return label.  Minor nit, you cannot anymore print
 HOST_RSP should be loaded on each guest entry.

Right, but my point is: without a big asm blob, you don't know the right
value to load.  It depends on the generated code.  And the big asm blob
limits a lot the "code looks nice" value of this approach.

 different error messages for vmlaunch vs. vmresume failure.
 Just because the same variable is used to store error values :)
 Make vmlaunch use err1 and vmresume err2 and do while (!err1 && !err2)

Yeah. :)

 In the end the choice is between all in asm and small asm trampoline
 (which also happens to use setjmp/longjmp, but perhaps Arthur can
 propagate return codes without using setjmp/longjmp, too).

 Rewriting the whole guest entry exit path in asm like you suggest bellow
 will eliminate the magic too.

 I much prefer one big asm statement than many small asm statement
 scattered among two or three C lines.

 It's not many asm statements, it's just a dozen lines:

 I am not talking about overall file, but the how vmx_run() is written:
 asm()
 C code
 asm()
 C code
 ...
 
 I much prefer:
 C code
 
 big asm() blob
 
 C code.

Me too.  But the trampoline version is neither, it is

static inline void vmlaunch() { asm("vmlaunch") }
static inline void vmresume() { asm("vmresume") }
small asm() blob
C code

Paolo



Re: [Qemu-devel] [ [PATCH]] nVMX: Initialize IA32_FEATURE_CONTROL MSR in reset and migration

2013-07-18 Thread Eduardo Habkost
On Tue, Jul 16, 2013 at 03:01:58PM +0300, Gleb Natapov wrote:
 On Tue, Jul 16, 2013 at 07:56:25PM +0800, Arthur Chunqi Li wrote:
  On Tue, Jul 16, 2013 at 7:42 PM, Gleb Natapov g...@redhat.com wrote:
   On Sun, Jul 07, 2013 at 11:13:37PM +0800, Arthur Chunqi Li wrote:
   The recent KVM patch adds IA32_FEATURE_CONTROL support. QEMU needs
   to clear this MSR when resetting the vCPU and keep its value across
   migration. This patch adds this feature.
  
   So what happens if we migrate from qemu that does not have this patch
   to qemu that does? Since msr_ia32_feature_control will not be migrated
   it will not be set on the destination so destination will not be able to
   use nested vmx. Since nested vmx is experimental it may be to early for
   us to care about it though, and nested vmx does not work with migration
   anyway.
  In my test, if migration doesn't care about msr_ia32_feature_control,
  the value will be set to 0 in the destination VM and this may cause
  some logical confusion, but the VMX running on it may not be aware of
  this (if migrating nested vmx is supported in the future) because once
  VMX is initialized, it will not check this msr any more in normal cases.
  
 With vmm_exclusive=0 kvm does vmxon/vmxoff while running. But let's not
 worry about nested kvm migration for now. There are much harder problems
 to overcome before it will work.
 
  This is also a complex problem since we don't know how many states
  like this msr need to be migrated for nested virt. If there are a lot
  of states that need migrating, it would be better to restructure the
  relevant code. But for now this patch is enough.
  
  Besides, though migration is not supported in nested vmx, we should
  keep the machine state consistent during migration. So this patch is
  also meaningful.

I'm assuming that even "qemu-1.6 -machine pc-1.5" is not expected to
allow migration to a qemu-1.5 binary. Is that OK for everybody, or
should we support backwards migration?

Other than that, the patch looks good to me. If migrating from a version
that doesn't have the patch, we are just going to get the same behavior
we had before.

  
  Arthur
  
   Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
   ---
target-i386/cpu.h |2 ++
target-i386/kvm.c |4 
target-i386/machine.c |   22 ++
3 files changed, 28 insertions(+)
  
   diff --git a/target-i386/cpu.h b/target-i386/cpu.h
   index 62e3547..a418e17 100644
   --- a/target-i386/cpu.h
   +++ b/target-i386/cpu.h
   @@ -301,6 +301,7 @@
 #define MSR_IA32_APICBASE_BSP   (1 << 8)
 #define MSR_IA32_APICBASE_ENABLE    (1 << 11)
 #define MSR_IA32_APICBASE_BASE  (0xfffff << 12)
    +#define MSR_IA32_FEATURE_CONTROL 0x003a
 #define MSR_TSC_ADJUST  0x003b
 #define MSR_IA32_TSCDEADLINE 0x6e0
  
   @@ -813,6 +814,7 @@ typedef struct CPUX86State {
  
uint64_t mcg_status;
uint64_t msr_ia32_misc_enable;
   +uint64_t msr_ia32_feature_control;
  
/* exception/interrupt handling */
int error_code;
   diff --git a/target-i386/kvm.c b/target-i386/kvm.c
   index 39f4fbb..3cb2161 100644
   --- a/target-i386/kvm.c
   +++ b/target-i386/kvm.c
   @@ -1122,6 +1122,7 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
if (hyperv_vapic_recommended()) {
kvm_msr_entry_set(msrs[n++], HV_X64_MSR_APIC_ASSIST_PAGE, 
   0);
}
   +kvm_msr_entry_set(msrs[n++], MSR_IA32_FEATURE_CONTROL, 
    env->msr_ia32_feature_control);
}
if (env-mcg_cap) {
int i;
   @@ -1346,6 +1347,7 @@ static int kvm_get_msrs(X86CPU *cpu)
if (has_msr_misc_enable) {
msrs[n++].index = MSR_IA32_MISC_ENABLE;
}
   +msrs[n++].index = MSR_IA32_FEATURE_CONTROL;
  
 if (!env->tsc_valid) {
msrs[n++].index = MSR_IA32_TSC;
   @@ -1444,6 +1446,8 @@ static int kvm_get_msrs(X86CPU *cpu)
case MSR_IA32_MISC_ENABLE:
 env->msr_ia32_misc_enable = msrs[i].data;
break;
   +case MSR_IA32_FEATURE_CONTROL:
    +env->msr_ia32_feature_control = msrs[i].data;
default:
 if (msrs[i].index >= MSR_MC0_CTL &&
 msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 
   4) {
   diff --git a/target-i386/machine.c b/target-i386/machine.c
   index 3659db9..94ca914 100644
   --- a/target-i386/machine.c
   +++ b/target-i386/machine.c
   @@ -399,6 +399,14 @@ static bool misc_enable_needed(void *opaque)
 return env->msr_ia32_misc_enable != MSR_IA32_MISC_ENABLE_DEFAULT;
}
  
   +static bool feature_control_needed(void *opaque)
   +{
   +X86CPU *cpu = opaque;
    +CPUX86State *env = cpu->env;
   +
    +return env->msr_ia32_feature_control != 0;
   +}
   +
static const VMStateDescription vmstate_msr_ia32_misc_enable = {
 .name = "cpu/msr_ia32_misc_enable",
.version_id = 1,
   @@ -410,6 +418,17 @@ static 

[PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2

2013-07-18 Thread Bharat Bhushan
Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v2:
 - No change 
 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index c2e5e98..277cb18 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct 
kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
- (MAS2_X0 | MAS2_X1)
+ (MAS2_X0 | MAS2_X1 | MAS2_E)
 #define MAS3_ATTRIB_MASK \
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
-- 
1.7.0.4




[PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bharat Bhushan
If there is a struct page for the requested mapping then it's
normal RAM and the mapping is set to M bit (coherent, cacheable)
otherwise this is treated as I/O and we set "I + G" (cache inhibited, guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v2: some cleanup and added comment
 - 
 arch/powerpc/kvm/e500_mmu_host.c |   23 ++-
 1 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..02eb973 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
usermode)
return mas3;
 }
 
-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   /*
+* RAM is always mappable on e500 systems, so this is identical
+* to kvm_is_mmio_pfn(), just without its overhead.
+*/
+   if (!pfn_valid(pfn)) {
+   /* Pages not managed by Kernel are treated as I/O, set I + G */
+   mas2_attr |= MAS2_I | MAS2_G;
 #ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   } else {
+   /* Kernel managed pages are actually RAM so set  M */
+   mas2_attr |= MAS2_M;
 #endif
+   }
+   return mas2_attr;
 }
 
 /*
@@ -313,7 +326,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
 stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
 stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+ e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
 stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
 e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
-- 
1.7.0.4




Re: [PATCH 4/4] PF: Async page fault support on s390

2013-07-18 Thread Paolo Bonzini
On 11/07/2013 12:41, Christian Borntraeger wrote:
 On 11/07/13 11:04, Gleb Natapov wrote:
 On Wed, Jul 10, 2013 at 02:59:55PM +0200, Dominik Dingel wrote:
 This patch enables async page faults for s390 kvm guests.
 It provides the userspace API to enable, disable or get the status of this
 feature. Also it includes the diagnose code, called by the guest to enable
 async page faults.

 The async page faults will use an already existing guest interface for this
 purpose, as described in CP Programming Services (SC24-6084).

 Signed-off-by: Dominik Dingel din...@linux.vnet.ibm.com
 Christian, looks good now?
 
 Looks good, but I just had a  discussion with Dominik about several other 
 cases 
 (guest driven reboot, qemu driven reboot, live migration). This patch should 
 allow all these cases (independent from this patch we need an ioctl to flush 
 the
 list of pending interrupts to do so, but reboot is currently broken in that
 regard anyway - patch is currently being looked at)
 
 We are currently discussing whether we should get rid of the APF_STATUS and let 
 the kernel wait for outstanding page faults before returning from KVM_RUN
 or if we go with this patch and let userspace wait for completion. 
 
 Will discuss this with Dominik, Conny and Alex. So lets defer that till next
 week, ok?

Let us know if we should wait for a v5. :)

Paolo



Re: [RFC PATCH] kvm-unit-tests : Basic architecture of VMX nested test case

2013-07-18 Thread Arthur Chunqi Li
On Thu, Jul 18, 2013 at 8:08 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 On 18/07/2013 13:06, Gleb Natapov wrote:
 On Thu, Jul 18, 2013 at 12:47:46PM +0200, Paolo Bonzini wrote:
 and for a testsuite I'd prefer the latter---which means I'd still favor
 setjmp/longjmp.

 Now, here is the long explanation.

 I must admit that the code looks nice.  There are some nits I'd like to
 see done differently (such as putting vmx_return at the beginning of the
 while (1), and the vmresume asm at the end), but it is indeed nice.

 Why do you prefer setjmp/longjmp then?

 Because it is still deceiving, and I dislike the deceit more than I like
 the linear code flow.

 What is deceiving about it? Of course for someone who has no idea how
 vmx works the code will not be obvious, but why should we care. For
 someone who knows, what is deceiving about returning into the same
 function the guest was launched from by using the VMX mechanism

 The way the code is written is deceiving.  If I see

   asm("vmlaunch; seta %0")
   while (ret)

 I expect HOST_RIP to point at the seta or somewhere near, not at a
 completely different label somewhere else.
Here for me, I prefer a separate asm blob for entry_vmx instead of one
hidden in a C function, since the latter may be much harder to trace for
someone not familiar with vmx in KVM.

 instead of longjmp()?

 Look again at the setjmp/longjmp version.  longjmp is not used to handle
 vmexit.  It is used to jump back from the vmexit handler to main, which
 is exactly what setjmp/longjmp is meant for.

 It is still somewhat magic: the "while (ret)" is only there to please
 the compiler

 No, it is there to catch vmlaunch/vmresume errors.

 You could change it to "while (0)" and it would still work.  That's what
 I mean by "only there to please the compiler".

 the compiler, and you rely on the compiler not changing %rsp between the
 vmlaunch and the vmx_return label.  Minor nit, you cannot anymore print
 HOST_RSP should be loaded on each guest entry.

 Right, but my point is: without a big asm blob, you don't know the right
 value to load.  It depends on the generated code.  And the big asm blob
 limits a lot the "code looks nice" value of this approach.

 different error messages for vmlaunch vs. vmresume failure.
 Just because the same variable is used to store error values :)
 Make vmlaunch use err1 and vmresume err2 and do while (!err1 && !err2)

 Yeah. :)

 In the end the choice is between all in asm and small asm trampoline
 (which also happens to use setjmp/longjmp, but perhaps Arthur can
 propagate return codes without using setjmp/longjmp, too).

 Rewriting the whole guest entry exit path in asm like you suggest bellow
 will eliminate the magic too.

 I much prefer one big asm statement than many small asm statement
 scattered among two or three C lines.

 It's not many asm statements, it's just a dozen lines:

 I am not talking about overall file, but the how vmx_run() is written:
 asm()
 C code
 asm()
 C code
 ...

 I much prefer:
 C code

 big asm() blob

 C code.

 Me too.  But the trampoline version is neither, it is

 static inline void vmlaunch() { asm("vmlaunch") }
 static inline void vmresume() { asm("vmresume") }
 small asm() blob
 C code
So this is really a style problem, subject to getting correct code
generation. When I wrote this version, the compiler optimized away some
of my code and things went wrong. If we use the C style, we need
setjmp/longjmp; otherwise we need a big asm blob to prevent compiler
optimization.

Having written the two versions, I actually side with Paolo. Writing
asm in C is sometimes uncomfortable (e.g. %rax vs. %%rax, and having to
reason about the C code before and after the asm blob). Actually, we
can hide the setjmp in vmx_on and the longjmp in the asm code that runs
after exit_handler; then we just need to call vmx_on to enter the VM
and return a designated value (e.g. -1) from exit_handler to exit the
VM. Any following code then doesn't need to care about setjmp/longjmp.

Arthur

 Paolo



Re: [PATCH 4/4] PF: Async page fault support on s390

2013-07-18 Thread Christian Borntraeger
On 18/07/13 15:57, Paolo Bonzini wrote:
 On 11/07/2013 12:41, Christian Borntraeger wrote:
 On 11/07/13 11:04, Gleb Natapov wrote:
 On Wed, Jul 10, 2013 at 02:59:55PM +0200, Dominik Dingel wrote:
 This patch enables async page faults for s390 kvm guests.
 It provides the userspace API to enable, disable or get the status of this
 feature. Also it includes the diagnose code, called by the guest to enable
 async page faults.

 The async page faults will use an already existing guest interface for this
 purpose, as described in CP Programming Services (SC24-6084).

 Signed-off-by: Dominik Dingel din...@linux.vnet.ibm.com
 Christian, looks good now?

 Looks good, but I just had a  discussion with Dominik about several other 
 cases 
  (guest driven reboot, qemu driven reboot, live migration). This patch should 
 allow all these cases (independent from this patch we need an ioctl to flush 
 the
 list of pending interrupts to do so, but reboot is currently broken in that
 regard anyway - patch is currently being looked at)

  We are currently discussing whether we should get rid of the APF_STATUS and let 
 the kernel wait for outstanding page faults before returning from KVM_RUN
 or if we go with this patch and let userspace wait for completion. 

 Will discuss this with Dominik, Conny and Alex. So lets defer that till next
 week, ok?
 
 Let us know if we should wait for a v5. :)

Yes, there will be a v5




Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2

2013-07-18 Thread Alexander Graf

This needs a description. Why shouldn't we ignore E?


Alex

On 18.07.2013, at 15:19, Bharat Bhushan wrote:

 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 v2:
 - No change 
 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
 index c2e5e98..277cb18 100644
 --- a/arch/powerpc/kvm/e500.h
 +++ b/arch/powerpc/kvm/e500.h
 @@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct 
 kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
 -   (MAS2_X0 | MAS2_X1)
 +   (MAS2_X0 | MAS2_X1 | MAS2_E)
 #define MAS3_ATTRIB_MASK \
 (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
  | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
 -- 
 1.7.0.4
 
 



Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 15:19, Bharat Bhushan wrote:

 If there is a struct page for the requested mapping then it's
 normal RAM and the mapping has the M bit set (coherent, cacheable);
 otherwise this is treated as I/O and we set I + G (cache inhibited,
 guarded)
 
 This helps setting proper TLB mapping for direct assigned device
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 v2: some cleanup and added comment
 - 
 arch/powerpc/kvm/e500_mmu_host.c |   23 ++-
 1 files changed, 18 insertions(+), 5 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c 
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..02eb973 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
 usermode)
   return mas3;
 }
 
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
 + u32 mas2_attr;
 +
 + mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 + /*
 +  * RAM is always mappable on e500 systems, so this is identical
 +  * to kvm_is_mmio_pfn(), just without its overhead.
 +  */
 + if (!pfn_valid(pfn)) {
 + /* Pages not managed by Kernel are treated as I/O, set I + G */

Please also document the intermediate thought that I/O should be mapped 
non-cached.

 + mas2_attr |= MAS2_I | MAS2_G;
 #ifdef CONFIG_SMP

Please separate the SMP case out of the branch.

 - return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
 -#else
 - return mas2 & MAS2_ATTRIB_MASK;
 + } else {
 + /* Kernel managed pages are actually RAM so set  M */

This comment doesn't tell me why M can be set ;).


Alex

 + mas2_attr |= MAS2_M;
 #endif
 + }
 + return mas2_attr;
 }
 
 /*
 @@ -313,7 +326,7 @@ static void kvmppc_e500_setup_stlbe(
   /* Force IPROT=0 for all guest mappings. */
   stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
   stlbe->mas2 = (gvaddr & MAS2_EPN) |
 -   e500_shadow_mas2_attrib(gtlbe->mas2, pr);
 +   e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
   stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
   e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
 -- 
 1.7.0.4
 
 


RE: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Thursday, July 18, 2013 8:18 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
 
 
 This needs a description. Why shouldn't we ignore E?

What I understood is that there is no reason to stop the guest from setting E, so 
allow it.

-Bharat

 
 
 Alex
 
 On 18.07.2013, at 15:19, Bharat Bhushan wrote:
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  v2:
  - No change
  arch/powerpc/kvm/e500.h |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index
  c2e5e98..277cb18 100644
  --- a/arch/powerpc/kvm/e500.h
  +++ b/arch/powerpc/kvm/e500.h
  @@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500
  *to_e500(struct kvm_vcpu *vcpu) #define E500_TLB_USER_PERM_MASK
  (MAS3_UX|MAS3_UR|MAS3_UW) #define E500_TLB_SUPER_PERM_MASK
  (MAS3_SX|MAS3_SR|MAS3_SW) #define MAS2_ATTRIB_MASK \
  - (MAS2_X0 | MAS2_X1)
  + (MAS2_X0 | MAS2_X1 | MAS2_E)
  #define MAS3_ATTRIB_MASK \
(MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
 | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
  --
  1.7.0.4
 
 
 


RE: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Thursday, July 18, 2013 8:23 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 On 18.07.2013, at 15:19, Bharat Bhushan wrote:
 
  If there is a struct page for the requested mapping then it's normal
  RAM and the mapping is set to M bit (coherent, cacheable) otherwise
  this is treated as I/O and we set  I + G  (cache inhibited, guarded)
 
  This helps setting proper TLB mapping for direct assigned device
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  v2: some cleanup and added comment
  -
  arch/powerpc/kvm/e500_mmu_host.c |   23 ++-
  1 files changed, 18 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..02eb973 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
  return mas3;
  }
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
  {
  +   u32 mas2_attr;
  +
   +   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
  +
  +   /*
  +* RAM is always mappable on e500 systems, so this is identical
  +* to kvm_is_mmio_pfn(), just without its overhead.
  +*/
  +   if (!pfn_valid(pfn)) {
  +   /* Pages not managed by Kernel are treated as I/O, set I + G */
 
 Please also document the intermediate thought that I/O should be mapped non-
 cached.

I did not get what you mean to document?

 
  +   mas2_attr |= MAS2_I | MAS2_G;
  #ifdef CONFIG_SMP
 
 Please separate the SMP case out of the branch.

Really :) this was looking simple to me.

 
   -   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
   -#else
   -   return mas2 & MAS2_ATTRIB_MASK;
  +   } else {
  +   /* Kernel managed pages are actually RAM so set  M */
 
 This comment doesn't tell me why M can be set ;).

RAM in SMP, so setting coherent, is not that obvious?

-Bharat

 
 
 Alex
 
  +   mas2_attr |= MAS2_M;
  #endif
  +   }
  +   return mas2_attr;
  }
 
  /*
  @@ -313,7 +326,7 @@ static void kvmppc_e500_setup_stlbe(
  /* Force IPROT=0 for all guest mappings. */
   stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
   stlbe->mas2 = (gvaddr & MAS2_EPN) |
   - e500_shadow_mas2_attrib(gtlbe->mas2, pr);
   + e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
   stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
   e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
  --
  1.7.0.4
 
 


Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 17:15, Bhushan Bharat-R65777 wrote:

 
 
 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Thursday, July 18, 2013 8:23 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 On 18.07.2013, at 15:19, Bharat Bhushan wrote:
 
 If there is a struct page for the requested mapping then it's normal
 RAM and the mapping is set to M bit (coherent, cacheable) otherwise
 this is treated as I/O and we set  I + G  (cache inhibited, guarded)
 
 This helps setting proper TLB mapping for direct assigned device
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 v2: some cleanup and added comment
 -
 arch/powerpc/kvm/e500_mmu_host.c |   23 ++-
 1 files changed, 18 insertions(+), 5 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..02eb973 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
 return mas3;
 }
 
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
 +   u32 mas2_attr;
 +
  +   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 +   /*
 +* RAM is always mappable on e500 systems, so this is identical
 +* to kvm_is_mmio_pfn(), just without its overhead.
 +*/
 +   if (!pfn_valid(pfn)) {
 +   /* Pages not managed by Kernel are treated as I/O, set I + G */
 
 Please also document the intermediate thought that I/O should be mapped non-
 cached.
 
 I did not get what you mean to document?

Pages not managed by the kernel are treated as I/O. Map them
with caching disabled.

 
 
 +   mas2_attr |= MAS2_I | MAS2_G;
 #ifdef CONFIG_SMP
 
 Please separate the SMP case out of the branch.
 
 Really :) this was looking simple to me.

Two branches intertwined never look simple :).

 
 
 -   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
 -#else
 -   return mas2 & MAS2_ATTRIB_MASK;
 +   } else {
 +   /* Kernel managed pages are actually RAM so set  M */
 
 This comment doesn't tell me why M can be set ;).
 
 RAM in SMP, so setting coherent, is not that obvious?

No.


Alex



Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 17:12, Bhushan Bharat-R65777 wrote:

 
 
 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Thursday, July 18, 2013 8:18 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
 
 
 This needs a description. Why shouldn't we ignore E?
 
What I understood is that there is no reason to stop the guest from setting E, so 
allow it.

Please add that to the patch description. Also explain what the bit means.


Alex



RE: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, July 18, 2013 8:50 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
 
 
 On 18.07.2013, at 17:12, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
  Sent: Thursday, July 18, 2013 8:18 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421;
  Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute
  in mas2
 
 
  This needs a description. Why shouldn't we ignore E?
 
  What I understood is that there is no reason to stop guest setting E, so
 allow him.
 
 Please add that to the patch description. Also explain what the bit means.

Ok :)

-Bharat
 
 
 Alex
 




Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Scott Wood

On 07/18/2013 05:00:42 AM, Alexander Graf wrote:
Now why is setting invalid flags a problem? If I understand Scott  
correctly, it can break the host if you access certain host devices  
with caching enabled. But to be sure I'd say we ask him directly :).


The architecture makes it illegal to mix cacheable and cache-inhibited  
mappings to the same physical page.  Mixing W or M bits is generally  
bad as well.  I've seen it cause machine checks, error interrupts, etc.  
-- not just corrupting the page in question.


-Scott


Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2

2013-07-18 Thread Scott Wood

On 07/18/2013 10:12:30 AM, Bhushan Bharat-R65777 wrote:



 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org  
[mailto:kvm-ppc-ow...@vger.kernel.org] On

 Behalf Of Alexander Graf
 Sent: Thursday, July 18, 2013 8:18 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood  
Scott-B07421; Bhushan

 Bharat-R65777
 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E  
attribute in mas2



 This needs a description. Why shouldn't we ignore E?

What I understood is that there is no reason to stop the guest from setting  
E, so allow it.


FWIW, it'd probably be safe to let the guest control the G bit as well.

-Scott


Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Scott Wood

On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's
normal RAM and the mapping is set to M bit (coherent, cacheable)
otherwise this is treated as I/O and we set  I + G  (cache  
inhibited, guarded)


This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v2: some cleanup and added comment
 -
 arch/powerpc/kvm/e500_mmu_host.c |   23 ++-
 1 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c  
b/arch/powerpc/kvm/e500_mmu_host.c

index 1c6a9d7..02eb973 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32  
mas3, int usermode)

return mas3;
 }

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   /*
+* RAM is always mappable on e500 systems, so this is identical
+* to kvm_is_mmio_pfn(), just without its overhead.
+*/
+   if (!pfn_valid(pfn)) {


Please use page_is_ram(), which is what gets used when setting the WIMG  
for the host userspace mapping.  We want to make sure the two are  
consistent.


+		/* Pages not managed by Kernel are treated as I/O, set I + G */
+   mas2_attr |= MAS2_I | MAS2_G;
 #ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   } else {
+   /* Kernel managed pages are actually RAM so set  M */
+   mas2_attr |= MAS2_M;
 #endif


Likewise, we want to make sure this matches the host entry.   
Unfortunately, this is a bit of a mess already.  64-bit booke appears  
to always set MAS2_M for TLB0 mappings.  The initial KERNELBASE mapping  
on boot uses M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses  
_PAGE_COHERENT.  32-bit always uses _PAGE_COHERENT, except that initial  
KERNELBASE mapping.  _PAGE_COHERENT appears to be set based on  
CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config clears  
_PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case).


As for what we actually want to happen, there are cases when we want M  
to be set for non-SMP.  One such case is AMP, where CPUs may be sharing  
memory even if the Linux instance only runs on one CPU (this is not  
hypothetical, BTW).  It's also possible that we encounter a hardware  
bug that requires MAS2_M, similar to what some of our non-booke chips  
require.


-Scott


Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 19:17, Scott Wood wrote:

 On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote:
 If there is a struct page for the requested mapping then it's
 normal RAM and the mapping is set to M bit (coherent, cacheable)
 otherwise this is treated as I/O and we set  I + G  (cache inhibited, 
 guarded)
 This helps setting proper TLB mapping for direct assigned device
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 v2: some cleanup and added comment
 -
 arch/powerpc/kvm/e500_mmu_host.c |   23 ++-
 1 files changed, 18 insertions(+), 5 deletions(-)
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c 
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..02eb973 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
 usermode)
  return mas3;
 }
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
 +u32 mas2_attr;
 +
 +mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 +/*
 + * RAM is always mappable on e500 systems, so this is identical
 + * to kvm_is_mmio_pfn(), just without its overhead.
 + */
 +if (!pfn_valid(pfn)) {
 
 Please use page_is_ram(), which is what gets used when setting the WIMG for 
 the host userspace mapping.  We want to make sure the two are consistent.
 
 +/* Pages not managed by Kernel are treated as I/O, set I + G */
 +mas2_attr |= MAS2_I | MAS2_G;
 #ifdef CONFIG_SMP
 -return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
 -#else
 -return mas2 & MAS2_ATTRIB_MASK;
 +} else {
 +/* Kernel managed pages are actually RAM so set  M */
 +mas2_attr |= MAS2_M;
 #endif
 
 Likewise, we want to make sure this matches the host entry.  Unfortunately, 
 this is a bit of a mess already.  64-bit booke appears to always set MAS2_M 
 for TLB0 mappings.  The initial KERNELBASE mapping on boot uses M_IF_SMP, and 
 the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT.  32-bit always 
 uses _PAGE_COHERENT, except that initial KERNELBASE mapping.  _PAGE_COHERENT 
 appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter 
 config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case).
 
 As for what we actually want to happen, there are cases when we want M to be 
 set for non-SMP.  One such case is AMP, where CPUs may be sharing memory even 
 if the Linux instance only runs on one CPU (this is not hypothetical, BTW).  
 It's also possible that we encounter a hardware bug that requires MAS2_M, 
 similar to what some of our non-booke chips require.

How about we always set M then for RAM?


Alex



Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Scott Wood

On 07/18/2013 12:32:18 PM, Alexander Graf wrote:


On 18.07.2013, at 19:17, Scott Wood wrote:

 On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote:
 Likewise, we want to make sure this matches the host entry.   
Unfortunately, this is a bit of a mess already.  64-bit booke appears  
to always set MAS2_M for TLB0 mappings.  The initial KERNELBASE  
mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC)  
replaces it uses _PAGE_COHERENT.  32-bit always uses _PAGE_COHERENT,  
except that initial KERNELBASE mapping.  _PAGE_COHERENT appears to be  
set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config  
clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case).


 As for what we actually want to happen, there are cases when we  
want M to be set for non-SMP.  One such case is AMP, where CPUs may  
be sharing memory even if the Linux instance only runs on one CPU  
(this is not hypothetical, BTW).  It's also possible that we  
encounter a hardware bug that requires MAS2_M, similar to what some  
of our non-booke chips require.


How about we always set M then for RAM?


M is like I in that bad things happen if you mix them.  So we really  
want to match exactly what the rest of the kernel is doing.


Plus, the performance penalty on some single-core chips can be pretty  
bad.


-Scott


Re: [RFC PATCH] kvm-unit-tests : Basic architecture of VMX nested test case

2013-07-18 Thread Gleb Natapov
On Thu, Jul 18, 2013 at 02:08:51PM +0200, Paolo Bonzini wrote:
 Il 18/07/2013 13:06, Gleb Natapov ha scritto:
  On Thu, Jul 18, 2013 at 12:47:46PM +0200, Paolo Bonzini wrote:
  and for a testsuite I'd prefer the latter---which means I'd still favor
  setjmp/longjmp.
 
  Now, here is the long explanation.
 
  I must admit that the code looks nice.  There are some nits I'd like to
  see done differently (such as putting vmx_return at the beginning of the
  while (1), and the vmresume asm at the end), but it is indeed nice.
 
  Why do you prefer setjmp/longjmp then?
 
  Because it is still deceiving, and I dislike the deceit more than I like
  the linear code flow.
 
   What is deceiving about it? Of course for someone who has no idea how
   vmx works the code will not be obvious, but why should we care. For
   someone who knows, what is deceiving about returning into the same
   function the guest was launched from by using the VMX mechanism
 
 The way the code is written is deceiving.  If I see
 
  asm("vmlaunch; seta %0")
  while (ret)
 
 I expect HOST_RIP to point at the seta or somewhere near, not at a
 completely different label somewhere else.
 
Why would you expect that, assuming you know what vmlaunch is? So it is OK
when HOST_RIP points to somewhere outside a function, but deceiving if
it points 3 lines down in the same function? I have a hard time following
this logic.

  instead of longjmp()?
 
 Look again at the setjmp/longjmp version.  longjmp is not used to handle
 vmexit.  It is used to jump back from the vmexit handler to main, which
 is exactly what setjmp/longjmp is meant for.
 
That's because a simple return will not work in that version; this is an
artifact of how vmexit was done.

  It is still somewhat magic: the "while (ret)" is only there to please
  the compiler
 
  No, it it there to catch vmlaunch/vmresume errors.
 
 You could change it to while (0) and it would still work.  That's what
 I mean by "only there to please the compiler".
But while (1) will not, so the code is executed; it is not there just for
the compiler, but it is executed only if vmlaunch/vmresume fails.

 
  the compiler, and you rely on the compiler not changing %rsp between the
 vmlaunch and the vmx_return label.  Minor nit, you cannot anymore print
 "HOST_RSP should be loaded on each guest entry".
 
 Right, but my point is: without a big asm blob, you don't know the right
 value to load.  It depends on the generated code.  And the big asm blob
 limits a lot the code looks nice value of this approach.
 
I said it a number of times already: this is not about the code looking nice,
looking like KVM's, or using as little assembler as possible; this is about
linear data flow. It is not fun chasing code execution paths. Yes, you
can argue that vmlaunch/vmresume is inherently non-linear, but there is a
difference between skipping one instruction while remaining in the same
function and on the same stack, and jumping completely to a different
context.

Speaking about KVM: the guest enter/exit assembly blob is around ~50 lines
of assembly code, and more than half of that is saving/restoring context.
And the code plays some tricks there for optimization that we do not
need to do here, so I expect the code to be even smaller, not much more
than 10 lines of assembly excluding state save/restore.

  different error messages for vmlaunch vs. vmresume failure.
  Just because the same variable is used to store error values :)
   Make vmlaunch use err1 and vmresume err2 and do while (!err1 && !err2)
 
 Yeah. :)
 
  In the end the choice is between all in asm and small asm trampoline
  (which also happens to use setjmp/longjmp, but perhaps Arthur can
  propagate return codes without using setjmp/longjmp, too).
 
  Rewriting the whole guest entry exit path in asm like you suggest bellow
  will eliminate the magic too.
 
  I much prefer one big asm statement than many small asm statement
  scattered among two or three C lines.
 
  It's not many asm statements, it's just a dozen lines:
 
  I am not talking about overall file, but the how vmx_run() is written:
  asm()
  C code
  asm()
  C code
  ...
  
  I much prefer:
  C code
  
  big asm() blob
  
  C code.
 
 Me too.  But the trampoline version is neither, it is
 
 static inline void vmlaunch() { asm("vmlaunch") }
 static inline void vmresume() { asm("vmresume") }
 small asm() blob
It is small only because context save/restore is hidden behind a
macro and 4 asm instructions (vmlaunch/seta/vmresume/seta) are hidden
somewhere else. The actual difference in asm instructions between both
versions will not be bigger than a couple of lines (if we do not take the
setjmp/longjmp implementation into account :)), but I do not even see
why we discuss this argument, since minimizing asm instructions here is
not the point. We should not use more than needed to achieve the goal, of
course, but the design should not be built around the number of assembly lines.

 C code
 
 Paolo

--
Gleb.

[PATCH] perf, kvm: Support the in_tx/in_tx_cp modifiers in KVM arch perfmon emulation v5

2013-07-18 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com

[KVM maintainers:
The underlying support for this is in perf/core now. So please merge
this patch into the KVM tree.]

This is not arch perfmon, but older CPUs will just ignore it. This makes
it possible to do at least some TSX measurements from a KVM guest

v2: Various fixes to address review feedback
v3: Ignore the bits when no CPUID. No #GP. Force raw events with TSX bits.
v4: Use reserved bits for #GP
v5: Remove obsolete argument
Acked-by: Gleb Natapov g...@redhat.com
Signed-off-by: Andi Kleen a...@linux.intel.com
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/pmu.c  | 25 -
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index af9c552..01493a1 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -315,6 +315,7 @@ struct kvm_pmu {
u64 global_ovf_ctrl;
u64 counter_bitmask[2];
u64 global_ctrl_mask;
+   u64 reserved_bits;
u8 version;
struct kvm_pmc gp_counters[INTEL_PMC_MAX_GENERIC];
struct kvm_pmc fixed_counters[INTEL_PMC_MAX_FIXED];
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c53e797..5c4f631 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -160,7 +160,7 @@ static void stop_counter(struct kvm_pmc *pmc)
 
 static void reprogram_counter(struct kvm_pmc *pmc, u32 type,
unsigned config, bool exclude_user, bool exclude_kernel,
-   bool intr)
+   bool intr, bool in_tx, bool in_tx_cp)
 {
struct perf_event *event;
struct perf_event_attr attr = {
@@ -173,6 +173,10 @@ static void reprogram_counter(struct kvm_pmc *pmc, u32 
type,
.exclude_kernel = exclude_kernel,
.config = config,
};
+   if (in_tx)
+   attr.config |= HSW_IN_TX;
+   if (in_tx_cp)
+   attr.config |= HSW_IN_TX_CHECKPOINTED;
 
 attr.sample_period = (-pmc->counter) & pmc_bitmask(pmc);
 
@@ -226,7 +230,9 @@ static void reprogram_gp_counter(struct kvm_pmc *pmc, u64 
eventsel)
 
 if (!(eventsel & (ARCH_PERFMON_EVENTSEL_EDGE |
ARCH_PERFMON_EVENTSEL_INV |
-   ARCH_PERFMON_EVENTSEL_CMASK))) {
+   ARCH_PERFMON_EVENTSEL_CMASK |
+   HSW_IN_TX |
+   HSW_IN_TX_CHECKPOINTED))) {
config = find_arch_event(pmc-vcpu-arch.pmu, event_select,
unit_mask);
if (config != PERF_COUNT_HW_MAX)
@@ -239,7 +245,9 @@ static void reprogram_gp_counter(struct kvm_pmc *pmc, u64 
eventsel)
reprogram_counter(pmc, type, config,
 !(eventsel & ARCH_PERFMON_EVENTSEL_USR),
 !(eventsel & ARCH_PERFMON_EVENTSEL_OS),
-   eventsel & ARCH_PERFMON_EVENTSEL_INT,
+   eventsel & ARCH_PERFMON_EVENTSEL_INT,
+   (eventsel & HSW_IN_TX),
+   (eventsel & HSW_IN_TX_CHECKPOINTED));
 }
 
 static void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 en_pmi, int idx)
@@ -256,7 +264,7 @@ static void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 
en_pmi, int idx)
arch_events[fixed_pmc_events[idx]].event_type,
 !(en & 0x2), /* exclude user */
 !(en & 0x1), /* exclude kernel */
-   pmi);
+   pmi, false, false);
 }
 
 static inline u8 fixed_en_pmi(u64 ctrl, int idx)
@@ -408,7 +416,7 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data 
*msr_info)
} else if ((pmc = get_gp_pmc(pmu, index, MSR_P6_EVNTSEL0))) {
 if (data == pmc->eventsel)
return 0;
-   if (!(data & 0x0020ull)) {
+   if (!(data & pmu->reserved_bits)) {
reprogram_gp_counter(pmc, data);
return 0;
}
@@ -450,6 +458,7 @@ void kvm_pmu_cpuid_update(struct kvm_vcpu *vcpu)
 pmu->counter_bitmask[KVM_PMC_GP] = 0;
 pmu->counter_bitmask[KVM_PMC_FIXED] = 0;
 pmu->version = 0;
+   pmu->reserved_bits = 0x0020ull;
 
entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
if (!entry)
@@ -478,6 +487,12 @@ void kvm_pmu_cpuid_update(struct kvm_vcpu *vcpu)
 pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
 (((1ull << pmu->nr_arch_fixed_counters) - 1) << 
 INTEL_PMC_IDX_FIXED);
 pmu->global_ctrl_mask = ~pmu->global_ctrl;
+
+   entry = kvm_find_cpuid_entry(vcpu, 7, 0);
+   if (entry &&
+   (boot_cpu_has(X86_FEATURE_HLE) || boot_cpu_has(X86_FEATURE_RTM)) &&
+   (entry->ebx & (X86_FEATURE_HLE|X86_FEATURE_RTM)))
+   

Re: [PATCH 00/10 v6] KVM: PPC: IOMMU in-kernel handling

2013-07-18 Thread Alexey Kardashevskiy
On 07/16/2013 10:53 AM, Alexey Kardashevskiy wrote:
 The changes are:
 1. rebased on v3.11-rc1 so the capability numbers changed again
 2. fixed multiple comments from maintainers
 3. KVM: PPC: Add support for IOMMU in-kernel handling is split into
 2 patches, the new one is powerpc/iommu: rework to support realmode.
 4. IOMMU_API is now always enabled for KVM_BOOK3S_64.
 
 More details in the individual patch comments.
 
 Depends on hashtable: add hash_for_each_possible_rcu_notrace(),
 posted a while ago.
 
 
 Alexey Kardashevskiy (10):
   KVM: PPC: reserve a capability number for multitce support
   KVM: PPC: reserve a capability and ioctl numbers for realmode VFIO



Alex, could you please pull these 2 patches or tell me what is wrong with
them? Having them in the kernel sooner would let me ask for a headers
update for QEMU, and then I would try pushing multiple TCE and VFIO support
in QEMU. Thanks.




-- 
Alexey
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] kvm: powerpc: Do not ignore E attribute in mas2

2013-07-18 Thread Bharat Bhushan
Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index c2e5e98..277cb18 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct 
kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
- (MAS2_X0 | MAS2_X1)
+ (MAS2_X0 | MAS2_X1 | MAS2_E)
 #define MAS3_ATTRIB_MASK \
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
-- 
1.7.0.4


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bharat Bhushan
If there is a struct page for the requested mapping then it is normal
DDR and the mapping sets the M bit (coherent, cacheable); otherwise it
is treated as I/O and we set I + G (cache inhibited, guarded).

This helps set a proper TLB mapping for a directly assigned device.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
usermode)
return mas3;
 }
 
-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {
+   mas2_attr |= MAS2_I | MAS2_G;
+   } else {
 #ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   mas2_attr |= MAS2_M;
 #endif
+   }
+   return mas2_attr;
 }
 
 /*
@@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+ e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
-- 
1.7.0.4




Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's
normal DDR and the mapping sets M bit (coherent, cacheable)
else this is treated as I/O and we set  I + G  (cache inhibited, guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
  arch/powerpc/kvm/e500_mmu_host.c |   17 -
  1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
usermode)
return mas3;
  }

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
  {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?

Tiejun


+   mas2_attr |= MAS2_I | MAS2_G;
+   } else {
  #ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   mas2_attr |= MAS2_M;
  #endif
+   }
+   return mas2_attr;
  }

  /*
@@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+ e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);






RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 11:56 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then it's normal
  DDR and the mapping sets M bit (coherent, cacheable) else this is
  treated as I/O and we set  I + G  (cache inhibited, guarded)
 
  This helps setting proper TLB mapping for direct assigned device
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
arch/powerpc/kvm/e500_mmu_host.c |   17 -
1 files changed, 12 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..089c227 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
  return mas3;
}
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
{
  +   u32 mas2_attr;
  +
  +   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
  +
  +   if (!pfn_valid(pfn)) {
 
 Why not directly use kvm_is_mmio_pfn()?

What I understand from this function (someone can correct me) is that it 
returns false when the page is managed by the kernel and is not marked 
RESERVED (for some reason). For us it does not matter whether the page is 
reserved or not; if it is a kernel-visible page then it is DDR.
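[Editor's note: a self-contained model of the two checks being compared here — pfn_valid() alone, as in the patch, versus a kvm_is_mmio_pfn()-style check that also consults PageReserved(). The stubbed pfn ranges are invented for illustration; only the decision structure is the point.]

```c
#include <stdbool.h>
#include <stdint.h>

/* Stub: pretend pfns below 0x1000 have a struct page (kernel-managed RAM). */
static bool stub_pfn_valid(uint64_t pfn)     { return pfn < 0x1000; }
/* Stub: pretend pfn 0x800 is a valid page that is marked PageReserved. */
static bool stub_page_reserved(uint64_t pfn) { return pfn == 0x800; }

/* The patch's test: anything with a struct page is treated as DDR. */
static bool is_io_pfn_valid_only(uint64_t pfn)
{
	return !stub_pfn_valid(pfn);
}

/* kvm_is_mmio_pfn()-style test: a valid but reserved page counts as mmio. */
static bool is_io_mmio_check(uint64_t pfn)
{
	if (stub_pfn_valid(pfn))
		return stub_page_reserved(pfn);
	return true;
}
```

The two checks differ exactly on valid-but-reserved pages (pfn 0x800 in this sketch), which is the case the rest of the thread debates.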

-Bharat

 
 Tiejun
 
  +   mas2_attr |= MAS2_I | MAS2_G;
  +   } else {
#ifdef CONFIG_SMP
  -   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
  -#else
  -   return mas2 & MAS2_ATTRIB_MASK;
  +   mas2_attr |= MAS2_M;
#endif
  +   }
  +   return mas2_attr;
}
 
/*
  @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
  /* Force IPROT=0 for all guest mappings. */
  stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
  stlbe->mas2 = (gvaddr & MAS2_EPN) |
  -   e500_shadow_mas2_attrib(gtlbe->mas2, pr);
  +   e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
  stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
  e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
 
 


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 11:56 AM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
managed pages

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's normal
DDR and the mapping sets M bit (coherent, cacheable) else this is
treated as I/O and we set  I + G  (cache inhibited, guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
   arch/powerpc/kvm/e500_mmu_host.c |   17 -
   1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c
b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int

usermode)

return mas3;
   }

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
   {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?


What I understand from this function (someone can correct me) is that it returns 
false when the page is managed by kernel and is not marked as RESERVED (for 
some reason). For us it does not matter whether the page is reserved or not, if it is 
kernel visible page then it is DDR.



I think you are setting I|G by addressing all mmio pages, right? If so,

KVM: direct mmio pfn check

Userspace may specify memory slots that are backed by mmio pages rather than
normal RAM.  In some cases it is not enough to identify these mmio pages
by pfn_valid().  This patch adds checking the PageReserved as well.

Tiejun


-Bharat



Tiejun


+   mas2_attr |= MAS2_I | MAS2_G;
+   } else {
   #ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   mas2_attr |= MAS2_M;
   #endif
+   }
+   return mas2_attr;
   }

   /*
@@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+ e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);










RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of “tiejun.chen”
 Sent: Thursday, July 18, 2013 1:01 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 11:56 AM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
  Scott- B07421; Bhushan Bharat-R65777
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then it's normal
  DDR and the mapping sets M bit (coherent, cacheable) else this is
  treated as I/O and we set  I + G  (cache inhibited, guarded)
 
  This helps setting proper TLB mapping for direct assigned device
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..089c227 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32
  mas3, int
  usermode)
return mas3;
 }
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
  + u32 mas2_attr;
  +
  + mas2_attr = mas2 & MAS2_ATTRIB_MASK;
  +
  + if (!pfn_valid(pfn)) {
 
  Why not directly use kvm_is_mmio_pfn()?
 
  What I understand from this function (someone can correct me) is that it
 returns false when the page is managed by kernel and is not marked as 
 RESERVED
 (for some reason). For us it does not matter whether the page is reserved or
 not, if it is kernel visible page then it is DDR.
 
 
 I think you are setting I|G by addressing all mmio pages, right? If so,
 
  KVM: direct mmio pfn check
 
  Userspace may specify memory slots that are backed by mmio pages rather
 than
  normal RAM.  In some cases it is not enough to identify these mmio pages
  by pfn_valid().  This patch adds checking the PageReserved as well.

Do you know what those "some cases" are, and how checking PageReserved helps in 
those cases?

-Bharat

 
 Tiejun
 
  -Bharat
 
 
  Tiejun
 
  + mas2_attr |= MAS2_I | MAS2_G;
  + } else {
 #ifdef CONFIG_SMP
  - return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
  -#else
  - return mas2 & MAS2_ATTRIB_MASK;
  + mas2_attr |= MAS2_M;
 #endif
  + }
  + return mas2_attr;
 }
 
 /*
  @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
  stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
  stlbe->mas2 = (gvaddr & MAS2_EPN) |
  -   e500_shadow_mas2_attrib(gtlbe->mas2, pr);
  +   e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
  stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
  e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
 
 
 
 


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
Behalf Of “tiejun.chen”
Sent: Thursday, July 18, 2013 1:01 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
managed pages

On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 11:56 AM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
Scott- B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's normal
DDR and the mapping sets M bit (coherent, cacheable) else this is
treated as I/O and we set  I + G  (cache inhibited, guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
arch/powerpc/kvm/e500_mmu_host.c |   17 -
1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c
b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32
mas3, int

usermode)

return mas3;
}

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
{
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?


What I understand from this function (someone can correct me) is that it

returns false when the page is managed by kernel and is not marked as RESERVED
(for some reason). For us it does not matter whether the page is reserved or
not, if it is kernel visible page then it is DDR.




I think you are setting I|G by addressing all mmio pages, right? If so,

  KVM: direct mmio pfn check

  Userspace may specify memory slots that are backed by mmio pages rather
than
  normal RAM.  In some cases it is not enough to identify these mmio pages
  by pfn_valid().  This patch adds checking the PageReserved as well.


Do you know what are those some cases and how checking PageReserved helps in 
those cases?


No, I didn't see these actual cases in qemu myself, either. But this should be 
guarded against all the same, as I understand it ;-)


Tiejun



-Bharat



Tiejun


-Bharat



Tiejun


+   mas2_attr |= MAS2_I | MAS2_G;
+   } else {
#ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   mas2_attr |= MAS2_M;
#endif
+   }
+   return mas2_attr;
}

/*
@@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+ e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);














RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 1:52 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
  Sent: Thursday, July 18, 2013 1:01 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 11:56 AM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
  Wood
  Scott- B07421; Bhushan Bharat-R65777
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then it's
  normal DDR and the mapping sets M bit (coherent, cacheable) else
  this is treated as I/O and we set  I + G  (cache inhibited,
  guarded)
 
  This helps setting proper TLB mapping for direct assigned device
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/kvm/e500_mmu_host.c |   17 -
  1 files changed, 12 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..089c227 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32
  mas3, int
  usermode)
  return mas3;
  }
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
  {
  +   u32 mas2_attr;
  +
  + mas2_attr = mas2 & MAS2_ATTRIB_MASK;
  +
  +   if (!pfn_valid(pfn)) {
 
  Why not directly use kvm_is_mmio_pfn()?
 
  What I understand from this function (someone can correct me) is
  that it
  returns false when the page is managed by kernel and is not marked
  as RESERVED (for some reason). For us it does not matter whether the
  page is reserved or not, if it is kernel visible page then it is DDR.
 
 
  I think you are setting I|G by addressing all mmio pages, right? If
  so,
 
KVM: direct mmio pfn check
 
Userspace may specify memory slots that are backed by mmio
  pages rather than
normal RAM.  In some cases it is not enough to identify these mmio
 pages
by pfn_valid().  This patch adds checking the PageReserved as well.
 
  Do you know what are those some cases and how checking PageReserved helps 
  in
 those cases?
 
 No, myself didn't see these actual cases in qemu,too. But this should be
 chronically persistent as I understand ;-)

Then I will wait till someone educates me :)

-Bharat

 
 Tiejun
 
 
  -Bharat
 
 
  Tiejun
 
  -Bharat
 
 
  Tiejun
 
  +   mas2_attr |= MAS2_I | MAS2_G;
  +   } else {
  #ifdef CONFIG_SMP
  - return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
  -#else
  - return mas2 & MAS2_ATTRIB_MASK;
  +   mas2_attr |= MAS2_M;
  #endif
  +   }
  +   return mas2_attr;
  }
 
  /*
  @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
  /* Force IPROT=0 for all guest mappings. */
  stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
  stlbe->mas2 = (gvaddr & MAS2_EPN) |
  - e500_shadow_mas2_attrib(gtlbe->mas2, pr);
  + e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
  stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
  e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
 
 
 
 
 
 



Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's
normal DDR and the mapping sets M bit (coherent, cacheable)
else this is treated as I/O and we set  I + G  (cache inhibited, guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
  arch/powerpc/kvm/e500_mmu_host.c |   17 -
  1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
usermode)
return mas3;
  }

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
  {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {
+   mas2_attr |= MAS2_I | MAS2_G;
+   } else {
  #ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   mas2_attr |= MAS2_M;
  #endif
+   }


Additionally, in the UP case this little chunk of code is equivalent to

if (1) {
mas2_attr |= MAS2_I | MAS2_G;
} else {
}

So you'd better wrap MAS2_M in advance, like:

#ifdef CONFIG_SMP
#define M_IF_SMP	MAS2_M
#else
#define M_IF_SMP	0
#endif

Then
if (1)
mas2_attr |= MAS2_I | MAS2_G;
else
mas2_attr |= M_IF_SMP;
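[Editor's note: put together, the suggested refactoring would look roughly like the sketch below. The MAS2_* bit values and the pfn_valid() stub are illustrative stand-ins so the snippet compiles on its own; they are not the real e500 definitions.]

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative MAS2 attribute bits (stand-ins for the e500 definitions). */
#define MAS2_X0 0x40u
#define MAS2_X1 0x20u
#define MAS2_I  0x08u	/* cache inhibited */
#define MAS2_M  0x04u	/* memory coherence */
#define MAS2_G  0x02u	/* guarded */
#define MAS2_E  0x01u	/* little-endian */
#define MAS2_ATTRIB_MASK (MAS2_X0 | MAS2_X1 | MAS2_E)

/* Tiejun's suggested wrapper: M bit only matters on SMP builds. */
#ifdef CONFIG_SMP
#define M_IF_SMP MAS2_M
#else
#define M_IF_SMP 0
#endif

/* Stub: pretend pfns below 0x1000 are backed by struct page (RAM). */
static bool stub_pfn_valid(uint64_t pfn) { return pfn < 0x1000; }

static uint32_t shadow_mas2_attrib(uint32_t mas2, uint64_t pfn)
{
	uint32_t mas2_attr = mas2 & MAS2_ATTRIB_MASK;

	if (!stub_pfn_valid(pfn))
		mas2_attr |= MAS2_I | MAS2_G;	/* I/O: inhibited + guarded */
	else
		mas2_attr |= M_IF_SMP;		/* RAM: coherent on SMP only */
	return mas2_attr;
}
```

On a UP build (CONFIG_SMP undefined) the else arm degenerates to a no-op, which is exactly the equivalence pointed out above.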

Tiejun


+   return mas2_attr;
  }

  /*
@@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+ e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);






RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Thursday, July 18, 2013 1:53 PM
 To: '“tiejun.chen”'
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421
 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 1:52 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
  
  
   -Original Message-
   From: kvm-ppc-ow...@vger.kernel.org
   [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
   Sent: Thursday, July 18, 2013 1:01 PM
   To: Bhushan Bharat-R65777
   Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
   Wood
   Scott-
   B07421
   Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
   kernel managed pages
  
   On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
  
  
   -Original Message-
   From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
   Sent: Thursday, July 18, 2013 11:56 AM
   To: Bhushan Bharat-R65777
   Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
   Wood
   Scott- B07421; Bhushan Bharat-R65777
   Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
   for kernel managed pages
  
   On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
   If there is a struct page for the requested mapping then it's
   normal DDR and the mapping sets M bit (coherent, cacheable)
   else this is treated as I/O and we set  I + G  (cache
   inhibited,
   guarded)
  
   This helps setting proper TLB mapping for direct assigned device
  
   Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
   ---
   arch/powerpc/kvm/e500_mmu_host.c |   17 -
   1 files changed, 12 insertions(+), 5 deletions(-)
  
   diff --git a/arch/powerpc/kvm/e500_mmu_host.c
   b/arch/powerpc/kvm/e500_mmu_host.c
   index 1c6a9d7..089c227 100644
   --- a/arch/powerpc/kvm/e500_mmu_host.c
   +++ b/arch/powerpc/kvm/e500_mmu_host.c
   @@ -64,13 +64,20 @@ static inline u32
   e500_shadow_mas3_attrib(u32 mas3, int
   usermode)
 return mas3;
   }
  
   -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
   usermode)
   +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
   {
   + u32 mas2_attr;
   +
    + mas2_attr = mas2 & MAS2_ATTRIB_MASK;
   +
   + if (!pfn_valid(pfn)) {
  
   Why not directly use kvm_is_mmio_pfn()?
  
   What I understand from this function (someone can correct me) is
   that it
   returns false when the page is managed by kernel and is not
   marked as RESERVED (for some reason). For us it does not matter
   whether the page is reserved or not, if it is kernel visible page then it
 is DDR.
  
  
   I think you are setting I|G by addressing all mmio pages, right? If
   so,
  
 KVM: direct mmio pfn check
  
 Userspace may specify memory slots that are backed by mmio
   pages rather than
 normal RAM.  In some cases it is not enough to identify these
   mmio
  pages
 by pfn_valid().  This patch adds checking the PageReserved as well.
  
   Do you know what are those some cases and how checking
   PageReserved helps in
  those cases?
 
  No, myself didn't see these actual cases in qemu,too. But this should
  be chronically persistent as I understand ;-)
 
 Then I will wait till someone educate me :)

The reason is, the kvm_is_mmio_pfn() function looks pretty heavy and I do not 
want to call it for every tlbwe operation unless it is necessary.

-Bharat

   + mas2_attr |= MAS2_I | MAS2_G;
   + } else {
   #ifdef CONFIG_SMP
    - return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
    -#else
    - return mas2 & MAS2_ATTRIB_MASK;
   + mas2_attr |= MAS2_M;
   #endif
   + }
   + return mas2_attr;
   }
  
   /*
   @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
 /* Force IPROT=0 for all guest mappings. */
  stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
  stlbe->mas2 = (gvaddr & MAS2_EPN) |
    -   e500_shadow_mas2_attrib(gtlbe->mas2, pr);
    +   e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
  stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
  e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
  
  
  
  
  
  
 


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: Bhushan Bharat-R65777
Sent: Thursday, July 18, 2013 1:53 PM
To: '“tiejun.chen”'
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
B07421
Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
managed pages




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 1:52 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: kvm-ppc-ow...@vger.kernel.org
[mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
Sent: Thursday, July 18, 2013 1:01 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 11:56 AM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
Wood
Scott- B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
for kernel managed pages

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's
normal DDR and the mapping sets M bit (coherent, cacheable)
else this is treated as I/O and we set  I + G  (cache
inhibited,
guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c
b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32
e500_shadow_mas3_attrib(u32 mas3, int

usermode)

return mas3;
 }

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?


What I understand from this function (someone can correct me) is
that it returns false when the page is managed by the kernel and is
not marked as RESERVED (for some reason). For us it does not matter
whether the page is reserved or not; if it is a kernel-visible page
then it is DDR.




I think you are setting I|G by addressing all mmio pages, right? If so,

   KVM: direct mmio pfn check

   Userspace may specify memory slots that are backed by mmio pages
   rather than normal RAM.  In some cases it is not enough to identify
   these mmio pages by pfn_valid().  This patch adds checking the
   PageReserved as well.


Do you know what those some cases are, and how checking PageReserved
helps in those cases?

No, I didn't see these actual cases in qemu myself either. But as I
understand it, this check is there to stay ;-)


Then I will wait till someone educates me :)


The reason is that the kvm_is_mmio_pfn() function looks pretty heavy, and I do
not want to call it for every tlbwe operation unless it is necessary.


Furthermore, how do we distinguish that we're creating a TLB entry for a device
assigned directly to the guest (GS)?


I think it's unnecessary to always check whether a pfn is MMIO, since we have
more non-direct-assigned devices than direct-assigned ones.


So maybe we can introduce another helper to fix up that TLB entry instead of
going through this path.


Tiejun



-Bharat


+		mas2_attr |= MAS2_I | MAS2_G;
+	} else {
 #ifdef CONFIG_SMP
-	return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-	return mas2 & MAS2_ATTRIB_MASK;
+		mas2_attr |= MAS2_M;
 #endif
+	}
+	return mas2_attr;
 }

 /*
@@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
 	/* Force IPROT=0 for all guest mappings. */
 	stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
 	stlbe->mas2 = (gvaddr & MAS2_EPN) |
-		      e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+		      e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
 	stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
 		        e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);








--
To unsubscribe from this list: send the line unsubscribe kvm-ppc
in the body of a message to majord...@vger.kernel.org More
majordomo info at http://vger.kernel.org/majordomo-info.html









Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 10:55, “tiejun.chen” wrote:

 On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Thursday, July 18, 2013 1:53 PM
 To: '“tiejun.chen”'
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood 
 Scott-
 B07421
 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 1:52 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org
 [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
 Sent: Thursday, July 18, 2013 1:01 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
 Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 11:56 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
 Wood
 Scott- B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
 for kernel managed pages
 
 On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
 If there is a struct page for the requested mapping then it's
 normal DDR and the mapping sets M bit (coherent, cacheable)
 else this is treated as I/O and we set  I + G  (cache
 inhibited,
 guarded)
 
 This helps setting proper TLB mapping for direct assigned device
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..089c227 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,20 @@ static inline u32
 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
   return mas3;
 }
 
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
 usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
 + u32 mas2_attr;
 +
 + mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 + if (!pfn_valid(pfn)) {
 
 Why not directly use kvm_is_mmio_pfn()?
 
 What I understand from this function (someone can correct me) is
 that it
 returns false when the page is managed by kernel and is not
 marked as RESERVED (for some reason). For us it does not matter
 whether the page is reserved or not, if it is kernel visible page then it
 is DDR.
 
 
 I think you are setting I|G by addressing all mmio pages, right? If
 so,
 
   KVM: direct mmio pfn check
 
   Userspace may specify memory slots that are backed by mmio
 pages rather than
   normal RAM.  In some cases it is not enough to identify these
 mmio
 pages
   by pfn_valid().  This patch adds checking the PageReserved as well.
 
 Do you know what are those some cases and how checking
 PageReserved helps in
 those cases?
 
 No, myself didn't see these actual cases in qemu,too. But this should
 be chronically persistent as I understand ;-)
 
 Then I will wait till someone educate me :)
 
 The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not 
 want to call this for all tlbwe operation unless it is necessary.
 
 Furthermore, how to distinguish we're creating TLB entry for the device 
 assigned directly to the GS?

Because other devices wouldn't be available to the guest through memory slots.

 I think its unnecessary to always check if that is mmio's pfn since we have 
 more non direct assigned devices.

I'm not sure I understand. The shadow TLB code only knows a host virtual 
address here. It needs to figure out whether the host physical address behind 
that is RAM (can be accessed with the cache enabled) or not (has to disable the cache).

 So maybe we can introduce another helper to fixup that TLB entry in instead 
 of this path.

This path does fix up the shadow (host) TLB entry :).


Alex



Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:

 
 
 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Thursday, July 18, 2013 1:53 PM
 To: '“tiejun.chen”'
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421
 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 1:52 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org
 [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
 Sent: Thursday, July 18, 2013 1:01 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
 Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 11:56 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
 Wood
 Scott- B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
 for kernel managed pages
 
 On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
 If there is a struct page for the requested mapping then it's
 normal DDR and the mapping sets M bit (coherent, cacheable)
 else this is treated as I/O and we set  I + G  (cache
 inhibited,
 guarded)
 
 This helps setting proper TLB mapping for direct assigned device
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
arch/powerpc/kvm/e500_mmu_host.c |   17 -
1 files changed, 12 insertions(+), 5 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..089c227 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,20 @@ static inline u32
 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
return mas3;
}
 
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
 usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
{
 +  u32 mas2_attr;
 +
 +  mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 +  if (!pfn_valid(pfn)) {
 
 Why not directly use kvm_is_mmio_pfn()?
 
 What I understand from this function (someone can correct me) is
 that it
 returns false when the page is managed by kernel and is not
 marked as RESERVED (for some reason). For us it does not matter
 whether the page is reserved or not, if it is kernel visible page then it
 is DDR.
 
 
 I think you are setting I|G by addressing all mmio pages, right? If
 so,
 
  KVM: direct mmio pfn check
 
  Userspace may specify memory slots that are backed by mmio
 pages rather than
  normal RAM.  In some cases it is not enough to identify these
 mmio
 pages
  by pfn_valid().  This patch adds checking the PageReserved as well.
 
 Do you know what are those some cases and how checking
 PageReserved helps in
 those cases?
 
 No, myself didn't see these actual cases in qemu,too. But this should
 be chronically persistent as I understand ;-)
 
 Then I will wait till someone educate me :)
 
 The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not 
 want to call this for all tlbwe operation unless it is necessary.

It certainly does more than we need and potentially slows down the fast path 
(RAM mapping). The only thing it does on top of pfn_valid() is to check 
for pages that are declared reserved on the host. This happens in 2 cases:

  1) Non cache coherent DMA
  2) Memory hot remove

The non coherent DMA case would be interesting, as with the mechanism as it 
stands in Linux today, we could potentially break normal guest operation if 
we don't take it into account. However, it's Kconfig-guarded by:

depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
default n if PPC_47x
default y

so we never hit it with any core we care about ;).

Memory hot remove does not exist on e500 FWIW, so we don't have to worry about 
that one either.

Which means I think it's fine to slim this whole thing down to only check for 
pfn_valid(), as the code does in this patch. It would however be very useful to 
have a comment there that explains why it's safe to do so.



Alex



RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Thursday, July 18, 2013 3:19 PM
 To: Bhushan Bharat-R65777
 Cc: “tiejun.chen”; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood 
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Bhushan Bharat-R65777
  Sent: Thursday, July 18, 2013 1:53 PM
  To: '“tiejun.chen”'
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
  Scott-
  B07421
  Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 1:52 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
  Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
  Sent: Thursday, July 18, 2013 1:01 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
  Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
  for kernel managed pages
 
  On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 11:56 AM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
  Wood
  Scott- B07421; Bhushan Bharat-R65777
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
  for kernel managed pages
 
  On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then it's
  normal DDR and the mapping sets M bit (coherent, cacheable)
  else this is treated as I/O and we set  I + G  (cache
  inhibited,
  guarded)
 
  This helps setting proper TLB mapping for direct assigned
  device
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..089c227 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,20 @@ static inline u32
  e500_shadow_mas3_attrib(u32 mas3, int
  usermode)
   return mas3;
 }
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
  usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
  +u32 mas2_attr;
  +
  +mas2_attr = mas2 & MAS2_ATTRIB_MASK;
  +
  +if (!pfn_valid(pfn)) {
 
  Why not directly use kvm_is_mmio_pfn()?
 
  What I understand from this function (someone can correct me) is
  that it
  returns false when the page is managed by kernel and is not
  marked as RESERVED (for some reason). For us it does not matter
  whether the page is reserved or not, if it is kernel visible page
  then it
  is DDR.
 
 
  I think you are setting I|G by addressing all mmio pages, right?
  If so,
 
   KVM: direct mmio pfn check
 
   Userspace may specify memory slots that are backed by mmio
  pages rather than
   normal RAM.  In some cases it is not enough to identify these
  mmio
  pages
   by pfn_valid().  This patch adds checking the PageReserved as well.
 
  Do you know what are those some cases and how checking
  PageReserved helps in
  those cases?
 
  No, myself didn't see these actual cases in qemu,too. But this
  should be chronically persistent as I understand ;-)
 
  Then I will wait till someone educate me :)
 
  The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not
 want to call this for all tlbwe operation unless it is necessary.
 
 It certainly does more than we need and potentially slows down the fast path
 (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check
 for pages that are declared reserved on the host. This happens in 2 cases:
 
   1) Non cache coherent DMA
   2) Memory hot remove
 
 The non coherent DMA case would be interesting, as with the mechanism as it is
 in place in Linux today, we could potentially break normal guest operation if 
 we
 don't take it into account. However, it's Kconfig guarded by:
 
 depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
 default n if PPC_47x
 default y
 
 so we never hit it with any core we care about ;).
 
 Memory hot remove does not exist on e500 FWIW, so we don't have to worry about
 that one either.
 
Which means I think it's fine to slim this whole thing down to only check for
pfn_valid(), as the code does in this patch. It would however be very useful to
have a comment there that explains why it's safe to do so.

Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 05:44 PM, Alexander Graf wrote:


On 18.07.2013, at 10:55, “tiejun.chen” wrote:


On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: Bhushan Bharat-R65777
Sent: Thursday, July 18, 2013 1:53 PM
To: '“tiejun.chen”'
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
B07421
Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
managed pages




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 1:52 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: kvm-ppc-ow...@vger.kernel.org
[mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
Sent: Thursday, July 18, 2013 1:01 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 11:56 AM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
Wood
Scott- B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
for kernel managed pages

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's
normal DDR and the mapping sets M bit (coherent, cacheable)
else this is treated as I/O and we set  I + G  (cache
inhibited,
guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c
b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32
e500_shadow_mas3_attrib(u32 mas3, int

usermode)

return mas3;
 }

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?


What I understand from this function (someone can correct me) is
that it

returns false when the page is managed by kernel and is not
marked as RESERVED (for some reason). For us it does not matter
whether the page is reserved or not, if it is kernel visible page then it

is DDR.




I think you are setting I|G by addressing all mmio pages, right? If
so,

   KVM: direct mmio pfn check

   Userspace may specify memory slots that are backed by mmio
pages rather than
   normal RAM.  In some cases it is not enough to identify these
mmio

pages

   by pfn_valid().  This patch adds checking the PageReserved as well.


Do you know what are those some cases and how checking
PageReserved helps in

those cases?

No, myself didn't see these actual cases in qemu,too. But this should
be chronically persistent as I understand ;-)


Then I will wait till someone educate me :)


The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want 
to call this for all tlbwe operation unless it is necessary.


Furthermore, how to distinguish we're creating TLB entry for the device 
assigned directly to the GS?


Because other devices wouldn't be available to the guest through memory slots.


Yes.




I think its unnecessary to always check if that is mmio's pfn since we have 
more non direct assigned devices.


I'm not sure I understand. The shadow TLB code only knows here is a host virtual 
address. It needs to figure out whether the host physical address behind that is 
RAM (can access with cache enabled) or not (has to disable cache)



Sorry, it looks like I'm misleading you :-P


So maybe we can introduce another helper to fixup that TLB entry in instead of 
this path.


This path does fix up the shadow (host) TLB entry :).



I just mean we could have a separate path dedicated to those direct-assigned 
devices, instead of going down this common path :)


Tiejun


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 11:56, “tiejun.chen” wrote:

 On 07/18/2013 05:44 PM, Alexander Graf wrote:
 
 On 18.07.2013, at 10:55, “tiejun.chen” wrote:
 
 On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Thursday, July 18, 2013 1:53 PM
 To: '“tiejun.chen”'
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood 
 Scott-
 B07421
 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 1:52 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org
 [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
 Sent: Thursday, July 18, 2013 1:01 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
 Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 11:56 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
 Wood
 Scott- B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
 for kernel managed pages
 
 On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
 If there is a struct page for the requested mapping then it's
 normal DDR and the mapping sets M bit (coherent, cacheable)
 else this is treated as I/O and we set  I + G  (cache
 inhibited,
 guarded)
 
 This helps setting proper TLB mapping for direct assigned device
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..089c227 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,20 @@ static inline u32
 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
 return mas3;
 }
 
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
 usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
 +   u32 mas2_attr;
 +
 +   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 +   if (!pfn_valid(pfn)) {
 
 Why not directly use kvm_is_mmio_pfn()?
 
 What I understand from this function (someone can correct me) is
 that it
 returns false when the page is managed by kernel and is not
 marked as RESERVED (for some reason). For us it does not matter
 whether the page is reserved or not, if it is kernel visible page then 
 it
 is DDR.
 
 
 I think you are setting I|G by addressing all mmio pages, right? If
 so,
 
   KVM: direct mmio pfn check
 
   Userspace may specify memory slots that are backed by mmio
 pages rather than
   normal RAM.  In some cases it is not enough to identify these
 mmio
 pages
   by pfn_valid().  This patch adds checking the PageReserved as 
 well.
 
 Do you know what are those some cases and how checking
 PageReserved helps in
 those cases?
 
 No, myself didn't see these actual cases in qemu,too. But this should
 be chronically persistent as I understand ;-)
 
 Then I will wait till someone educate me :)
 
 The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not 
 want to call this for all tlbwe operation unless it is necessary.
 
 Furthermore, how to distinguish we're creating TLB entry for the device 
 assigned directly to the GS?
 
 Because other devices wouldn't be available to the guest through memory 
 slots.
 
 Yes.
 
 
 I think its unnecessary to always check if that is mmio's pfn since we have 
 more non direct assigned devices.
 
 I'm not sure I understand. The shadow TLB code only knows here is a host 
 virtual address. It needs to figure out whether the host physical address 
 behind that is RAM (can access with cache enabled) or not (has to disable 
 cache)
 
 
 Sorry, looks I'm misleading you :-P
 
 So maybe we can introduce another helper to fixup that TLB entry in instead 
 of this path.
 
 This path does fix up the shadow (host) TLB entry :).
 
 
 I just mean whether we can have a separate path dedicated to those direct 
 assigned devices, not go this common path :)

I don't think it's possible to have a separate path without a certain level of 
trust. In the current flow we don't trust anyone. We just check, for every 
translated page, whether we should enable caching or not.

We could take that information from 2 other sides though:

  1) Memory 

Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 12:08, “tiejun.chen” wrote:

 On 07/18/2013 05:48 PM, Alexander Graf wrote:
 
 On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:
 
 
 
 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Thursday, July 18, 2013 1:53 PM
 To: '“tiejun.chen”'
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood 
 Scott-
 B07421
 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 1:52 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org
 [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
 Sent: Thursday, July 18, 2013 1:01 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
 Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
 kernel managed pages
 
 On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 11:56 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
 Wood
 Scott- B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
 for kernel managed pages
 
 On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
 If there is a struct page for the requested mapping then it's
 normal DDR and the mapping sets M bit (coherent, cacheable)
 else this is treated as I/O and we set  I + G  (cache
 inhibited,
 guarded)
 
 This helps setting proper TLB mapping for direct assigned device
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
arch/powerpc/kvm/e500_mmu_host.c |   17 -
1 files changed, 12 insertions(+), 5 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..089c227 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,20 @@ static inline u32
 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
  return mas3;
}
 
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
 usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
{
 +u32 mas2_attr;
 +
 +mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 +if (!pfn_valid(pfn)) {
 
 Why not directly use kvm_is_mmio_pfn()?
 
 What I understand from this function (someone can correct me) is
 that it
 returns false when the page is managed by kernel and is not
 marked as RESERVED (for some reason). For us it does not matter
 whether the page is reserved or not, if it is kernel visible page then 
 it
 is DDR.
 
 
 I think you are setting I|G by addressing all mmio pages, right? If
 so,
 
  KVM: direct mmio pfn check
 
  Userspace may specify memory slots that are backed by mmio
 pages rather than
  normal RAM.  In some cases it is not enough to identify these
 mmio
 pages
  by pfn_valid().  This patch adds checking the PageReserved as well.
 
 Do you know what are those some cases and how checking
 PageReserved helps in
 those cases?
 
 No, myself didn't see these actual cases in qemu,too. But this should
 be chronically persistent as I understand ;-)
 
 Then I will wait till someone educate me :)
 
 The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not 
 want to call this for all tlbwe operation unless it is necessary.
 
 It certainly does more than we need and potentially slows down the fast path 
 (RAM mapping). The only thing it does on top of if (pfn_valid()) is to 
 check for pages that are declared reserved on the host. This happens in 2 
 cases:
 
   1) Non cache coherent DMA
   2) Memory hot remove
 
 The non coherent DMA case would be interesting, as with the mechanism as it 
 is in place in Linux today, we could potentially break normal guest 
 operation if we don't take it into account. However, it's Kconfig guarded by:
 
 depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
 default n if PPC_47x
 default y
 
 so we never hit it with any core we care about ;).
 
 Memory hot remove does not exist on e500 FWIW, so we don't have to worry 
 about that one either.
 
 Thanks for this good information :)
 
 So why not guard that code with CONFIG_MEMORY_HOTPLUG inside 
 kvm_is_mmio_pfn(), to make sure the check is only performed when it is really 
 needed? That would avoid the unnecessary performance loss.
 
 If I'm wrong please correct me :)

You're perfectly right, but this is generic KVM code. So it gets run across all 
architectures. What if someone has the great 

Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 06:00 PM, Alexander Graf wrote:


On 18.07.2013, at 11:56, “tiejun.chen” wrote:


On 07/18/2013 05:44 PM, Alexander Graf wrote:


On 18.07.2013, at 10:55, “tiejun.chen” wrote:


On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: Bhushan Bharat-R65777
Sent: Thursday, July 18, 2013 1:53 PM
To: '“tiejun.chen”'
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
B07421
Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
managed pages




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 1:52 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: kvm-ppc-ow...@vger.kernel.org
[mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
Sent: Thursday, July 18, 2013 1:01 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 11:56 AM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
Wood
Scott- B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
for kernel managed pages

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's
normal DDR and the mapping sets M bit (coherent, cacheable)
else this is treated as I/O and we set  I + G  (cache
inhibited,
guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500_mmu_host.c |   17 ++++++++++++-----
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c
b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32
e500_shadow_mas3_attrib(u32 mas3, int

usermode)

return mas3;
 }

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?


What I understand from this function (someone can correct me) is
that it

returns false when the page is managed by kernel and is not
marked as RESERVED (for some reason). For us it does not matter
whether the page is reserved or not, if it is kernel visible page then it

is DDR.




I think you are setting I|G by addressing all mmio pages, right? If
so,

   KVM: direct mmio pfn check

   Userspace may specify memory slots that are backed by mmio
pages rather than
   normal RAM.  In some cases it is not enough to identify these
mmio

pages

   by pfn_valid().  This patch adds checking the PageReserved as well.


Do you know what are those some cases and how checking
PageReserved helps in

those cases?

No, I didn't see these actual cases in qemu myself, either. But this should
be chronically persistent as I understand ;-)


Then I will wait till someone educate me :)


The reason is, kvm_is_mmio_pfn() looks pretty heavy and I do not want 
to call it for every tlbwe operation unless it is necessary.


Furthermore, how do we distinguish that we're creating a TLB entry for the 
device assigned directly to the GS?


Because other devices wouldn't be available to the guest through memory slots.


Yes.




I think it's unnecessary to always check whether that is an mmio pfn, since we 
have more non-direct-assigned devices.


I'm not sure I understand. The shadow TLB code only knows that there is a host 
virtual address. It needs to figure out whether the host physical address behind 
it is RAM (can be accessed with caching enabled) or not (has to disable caching).



Sorry, looks I'm misleading you :-P


So maybe we can introduce another helper to fixup that TLB entry in instead of 
this path.


This path does fix up the shadow (host) TLB entry :).



I just mean whether we can have a separate path dedicated to those direct 
assigned devices, not go this common path :)


I don't think it's possible to have a separate path without a certain level of 
trust. In the current flow we don't trust anyone. We just check every 
translated page whether we should enable caching or not.

We could take that information from 2 other side though:

   1) Memory Slot
   2) Guest TLB Flags

If we take it from the memory 

Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread “tiejun.chen”

On 07/18/2013 06:12 PM, Alexander Graf wrote:


On 18.07.2013, at 12:08, “tiejun.chen” wrote:


On 07/18/2013 05:48 PM, Alexander Graf wrote:


On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:





-Original Message-
From: Bhushan Bharat-R65777
Sent: Thursday, July 18, 2013 1:53 PM
To: '“tiejun.chen”'
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
B07421
Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
managed pages




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 1:52 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: kvm-ppc-ow...@vger.kernel.org
[mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
Sent: Thursday, July 18, 2013 1:01 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
Wood
Scott-
B07421
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
kernel managed pages

On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:




-Original Message-
From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
Sent: Thursday, July 18, 2013 11:56 AM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
Wood
Scott- B07421; Bhushan Bharat-R65777
Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
for kernel managed pages

On 07/18/2013 02:04 PM, Bharat Bhushan wrote:

If there is a struct page for the requested mapping then it's
normal DDR and the mapping sets M bit (coherent, cacheable)
else this is treated as I/O and we set  I + G  (cache
inhibited,
guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
arch/powerpc/kvm/e500_mmu_host.c |   17 ++++++++++++-----
1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c
b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..089c227 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,20 @@ static inline u32
e500_shadow_mas3_attrib(u32 mas3, int

usermode)

return mas3;
}

-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
{
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (!pfn_valid(pfn)) {


Why not directly use kvm_is_mmio_pfn()?


What I understand from this function (someone can correct me) is
that it

returns false when the page is managed by kernel and is not
marked as RESERVED (for some reason). For us it does not matter
whether the page is reserved or not, if it is kernel visible page then it

is DDR.




I think you are setting I|G by addressing all mmio pages, right? If
so,

  KVM: direct mmio pfn check

  Userspace may specify memory slots that are backed by mmio
pages rather than
  normal RAM.  In some cases it is not enough to identify these
mmio

pages

  by pfn_valid().  This patch adds checking the PageReserved as well.


Do you know what are those some cases and how checking
PageReserved helps in

those cases?

No, I didn't see these actual cases in qemu myself, either. But this should
be chronically persistent as I understand ;-)


Then I will wait till someone educate me :)


The reason is, kvm_is_mmio_pfn() looks pretty heavy and I do not want 
to call it for every tlbwe operation unless it is necessary.


It certainly does more than we need and potentially slows down the fast path (RAM 
mapping). The only thing it does on top of if (pfn_valid()) is to check for 
pages that are declared reserved on the host. This happens in 2 cases:

   1) Non cache coherent DMA
   2) Memory hot remove

The non coherent DMA case would be interesting, as with the mechanism as it is 
in place in Linux today, we could potentially break normal guest operation if 
we don't take it into account. However, it's Kconfig guarded by:

 depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
 default n if PPC_47x
 default y

so we never hit it with any core we care about ;).

Memory hot remove does not exist on e500 FWIW, so we don't have to worry about 
that one either.


Thanks for this good information :)

So why not guard that code with CONFIG_MEMORY_HOTPLUG inside 
kvm_is_mmio_pfn() to make sure the check is only performed when it is really 
needed? That would avoid the unnecessary performance loss.

If I'm wrong please correct me :)


You're perfectly right, but this is generic KVM code. So it gets run across all 
architectures. What if someone has the great idea to add a new case here for 
x86, but doesn't tell 

[PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2

2013-07-18 Thread Bharat Bhushan
Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v2:
 - No change 
 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index c2e5e98..277cb18 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct 
kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
- (MAS2_X0 | MAS2_X1)
+ (MAS2_X0 | MAS2_X1 | MAS2_E)
 #define MAS3_ATTRIB_MASK \
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
-- 
1.7.0.4




[PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bharat Bhushan
If there is a struct page for the requested mapping then it's
normal RAM and the mapping is set to M bit (coherent, cacheable)
otherwise this is treated as I/O and we set  I + G  (cache inhibited, guarded)

This helps setting proper TLB mapping for direct assigned device

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v2: some cleanup and added comment
 - 
 arch/powerpc/kvm/e500_mmu_host.c |   23 ++++++++++++++++++-----
 1 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..02eb973 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
usermode)
return mas3;
 }
 
-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   /*
+* RAM is always mappable on e500 systems, so this is identical
+* to kvm_is_mmio_pfn(), just without its overhead.
+*/
+   if (!pfn_valid(pfn)) {
+   /* Pages not managed by Kernel are treated as I/O, set I + G */
+   mas2_attr |= MAS2_I | MAS2_G;
 #ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   } else {
+   /* Kernel managed pages are actually RAM so set  M */
+   mas2_attr |= MAS2_M;
 #endif
+   }
+   return mas2_attr;
 }
 
 /*
@@ -313,7 +326,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+ e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
-- 
1.7.0.4




Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2

2013-07-18 Thread Alexander Graf

This needs a description. Why shouldn't we ignore E?


Alex

On 18.07.2013, at 15:19, Bharat Bhushan wrote:

 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 v2:
 - No change 
 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
 index c2e5e98..277cb18 100644
 --- a/arch/powerpc/kvm/e500.h
 +++ b/arch/powerpc/kvm/e500.h
 @@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct 
 kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
 -   (MAS2_X0 | MAS2_X1)
 +   (MAS2_X0 | MAS2_X1 | MAS2_E)
 #define MAS3_ATTRIB_MASK \
 (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
  | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
 -- 
 1.7.0.4
 
 



Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 15:19, Bharat Bhushan wrote:

 If there is a struct page for the requested mapping then it's
 normal RAM and the mapping is set to M bit (coherent, cacheable)
 otherwise this is treated as I/O and we set  I + G  (cache inhibited, 
 guarded)
 
 This helps setting proper TLB mapping for direct assigned device
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 v2: some cleanup and added comment
 - 
 arch/powerpc/kvm/e500_mmu_host.c |   23 ++-
 1 files changed, 18 insertions(+), 5 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c 
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..02eb973 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
 usermode)
   return mas3;
 }
 
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
 + u32 mas2_attr;
 +
 + mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 + /*
 +  * RAM is always mappable on e500 systems, so this is identical
 +  * to kvm_is_mmio_pfn(), just without its overhead.
 +  */
 + if (!pfn_valid(pfn)) {
 + /* Pages not managed by Kernel are treated as I/O, set I + G */

Please also document the intermediate thought that I/O should be mapped 
non-cached.

 + mas2_attr |= MAS2_I | MAS2_G;
 #ifdef CONFIG_SMP

Please separate the SMP case out of the branch.

 - return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
 -#else
 - return mas2 & MAS2_ATTRIB_MASK;
 + } else {
 + /* Kernel managed pages are actually RAM so set  M */

This comment doesn't tell me why M can be set ;).


Alex

 + mas2_attr |= MAS2_M;
 #endif
 + }
 + return mas2_attr;
 }
 
 /*
 @@ -313,7 +326,7 @@ static void kvmppc_e500_setup_stlbe(
   /* Force IPROT=0 for all guest mappings. */
  stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
  stlbe->mas2 = (gvaddr & MAS2_EPN) |
-   e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+   e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
  stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
  e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
 -- 
 1.7.0.4
 
 


Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 17:15, Bhushan Bharat-R65777 wrote:

 
 
 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Thursday, July 18, 2013 8:23 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 On 18.07.2013, at 15:19, Bharat Bhushan wrote:
 
 If there is a struct page for the requested mapping then it's normal
 RAM and the mapping is set to M bit (coherent, cacheable) otherwise
 this is treated as I/O and we set  I + G  (cache inhibited, guarded)
 
 This helps setting proper TLB mapping for direct assigned device
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 v2: some cleanup and added comment
 -
 arch/powerpc/kvm/e500_mmu_host.c |   23 ++-
 1 files changed, 18 insertions(+), 5 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..02eb973 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
 return mas3;
 }
 
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
 +   u32 mas2_attr;
 +
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 +   /*
 +* RAM is always mappable on e500 systems, so this is identical
 +* to kvm_is_mmio_pfn(), just without its overhead.
 +*/
 +   if (!pfn_valid(pfn)) {
 +   /* Pages not managed by Kernel are treated as I/O, set I + G */
 
 Please also document the intermediate thought that I/O should be mapped non-
 cached.
 
 I did not get what you mean to document?

Pages not managed by the kernel are treated as I/O. Map them
with caching disabled.

 
 
 +   mas2_attr |= MAS2_I | MAS2_G;
 #ifdef CONFIG_SMP
 
 Please separate the SMP case out of the branch.
 
 Really :) this was looking simple to me.

Two branches intertwined never look simple :).

 
 
 -   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
 -#else
 -   return mas2 & MAS2_ATTRIB_MASK;
 +   } else {
 +   /* Kernel managed pages are actually RAM so set  M */
 
 This comment doesn't tell me why M can be set ;).
 
 RAM in SMP, so setting coherent, is not that obvious?

No.


Alex



Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 17:12, Bhushan Bharat-R65777 wrote:

 
 
 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Thursday, July 18, 2013 8:18 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
 
 
 This needs a description. Why shouldn't we ignore E?
 
 What I understood is that there is no reason to stop guest setting E, so 
 allow him.

Please add that to the patch description. Also explain what the bit means.


Alex



RE: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, July 18, 2013 8:50 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
 
 
 On 18.07.2013, at 17:12, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
  Sent: Thursday, July 18, 2013 8:18 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421;
  Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute
  in mas2
 
 
  This needs a description. Why shouldn't we ignore E?
 
  What I understood is that there is no reason to stop guest setting E, so
 allow him.
 
 Please add that to the patch description. Also explain what the bit means.

Ok :)

-Bharat
 
 
 Alex
 




Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Scott Wood

On 07/18/2013 05:00:42 AM, Alexander Graf wrote:
Now why is setting invalid flags a problem? If I understand Scott  
correctly, it can break the host if you access certain host devices  
with caching enabled. But to be sure I'd say we ask him directly :).


The architecture makes it illegal to mix cacheable and cache-inhibited  
mappings to the same physical page.  Mixing W or M bits is generally  
bad as well.  I've seen it cause machine checks, error interrupts, etc.  
-- not just corrupting the page in question.


-Scott


Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2

2013-07-18 Thread Scott Wood

On 07/18/2013 10:12:30 AM, Bhushan Bharat-R65777 wrote:



 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org  
[mailto:kvm-ppc-ow...@vger.kernel.org] On

 Behalf Of Alexander Graf
 Sent: Thursday, July 18, 2013 8:18 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood  
Scott-B07421; Bhushan

 Bharat-R65777
 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E  
attribute in mas2



 This needs a description. Why shouldn't we ignore E?

What I understood is that there is no reason to stop guest setting  
E, so allow him.


FWIW, it'd probably be safe to let the guest control the G bit as well.

-Scott


Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Alexander Graf

On 18.07.2013, at 19:17, Scott Wood wrote:

 On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote:
 If there is a struct page for the requested mapping then it's
 normal RAM and the mapping is set to M bit (coherent, cacheable)
 otherwise this is treated as I/O and we set  I + G  (cache inhibited, 
 guarded)
 This helps setting proper TLB mapping for direct assigned device
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 v2: some cleanup and added comment
 -
 arch/powerpc/kvm/e500_mmu_host.c |   23 ++-
 1 files changed, 18 insertions(+), 5 deletions(-)
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c 
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..02eb973 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
 usermode)
  return mas3;
 }
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
 +u32 mas2_attr;
 +
+mas2_attr = mas2 & MAS2_ATTRIB_MASK;
 +
 +/*
 + * RAM is always mappable on e500 systems, so this is identical
 + * to kvm_is_mmio_pfn(), just without its overhead.
 + */
 +if (!pfn_valid(pfn)) {
 
 Please use page_is_ram(), which is what gets used when setting the WIMG for 
 the host userspace mapping.  We want to make sure the two are consistent.
 
 +/* Pages not managed by Kernel are treated as I/O, set I + G */
 +mas2_attr |= MAS2_I | MAS2_G;
 #ifdef CONFIG_SMP
 -return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
 -#else
 -return mas2 & MAS2_ATTRIB_MASK;
 +} else {
 +/* Kernel managed pages are actually RAM so set  M */
 +mas2_attr |= MAS2_M;
 #endif
 
 Likewise, we want to make sure this matches the host entry.  Unfortunately, 
 this is a bit of a mess already.  64-bit booke appears to always set MAS2_M 
 for TLB0 mappings.  The initial KERNELBASE mapping on boot uses M_IF_SMP, and 
 the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT.  32-bit always 
 uses _PAGE_COHERENT, except that initial KERNELBASE mapping.  _PAGE_COHERENT 
 appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter 
 config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case).
 
 As for what we actually want to happen, there are cases when we want M to be 
 set for non-SMP.  One such case is AMP, where CPUs may be sharing memory even 
 if the Linux instance only runs on one CPU (this is not hypothetical, BTW).  
 It's also possible that we encounter a hardware bug that requires MAS2_M, 
 similar to what some of our non-booke chips require.

How about we always set M then for RAM?


Alex



Re: [PATCH 00/10 v6] KVM: PPC: IOMMU in-kernel handling

2013-07-18 Thread Alexey Kardashevskiy
On 07/16/2013 10:53 AM, Alexey Kardashevskiy wrote:
 The changes are:
 1. rebased on v3.11-rc1 so the capability numbers changed again
 2. fixed multiple comments from maintainers
 3. KVM: PPC: Add support for IOMMU in-kernel handling is split into
 2 patches, the new one is powerpc/iommu: rework to support realmode.
 4. IOMMU_API is now always enabled for KVM_BOOK3S_64.
 
 More details in the individual patch comments.
 
 Depends on hashtable: add hash_for_each_possible_rcu_notrace(),
 posted a while ago.
 
 
 Alexey Kardashevskiy (10):
   KVM: PPC: reserve a capability number for multitce support
   KVM: PPC: reserve a capability and ioctl numbers for realmode VFIO



Alex, could you please pull these 2 patches or tell me what is wrong with
them? Having them in the kernel sooner would let me ask for a headers
update for QEMU, and then I would try pushing multiple TCE and VFIO support
in QEMU. Thanks.




-- 
Alexey