RE: [PATCH 2/2] KVM : powerpc/booke: Allow debug interrupt injection to guest

2014-07-02 Thread bharat.bhus...@freescale.com

 -Original Message-
 From: Wood Scott-B07421
 Sent: Tuesday, July 01, 2014 10:11 PM
 To: Alexander Graf
 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: Re: [PATCH 2/2] KVM : powerpc/booke: Allow debug interrupt injection 
 to
 guest
 
 On Tue, 2014-07-01 at 18:22 +0200, Alexander Graf wrote:
  On 01.07.14 17:35, Scott Wood wrote:
   On Tue, 2014-07-01 at 17:04 +0200, Alexander Graf wrote:
   On 01.07.14 16:58, Scott Wood wrote:
   On Tue, 2014-07-01 at 08:23 +0200, Alexander Graf wrote:
   I don't think QEMU should be aware of these limitations.
   OK, but we should at least have some idea of how the whole thing
   is supposed to work, in order to determine if this is the correct
   behavior for QEMU.  I thought the model was that debug resources
   are either owned by QEMU or by the guest, and in the latter case,
   QEMU would never see the debug exception to begin with.
   That's bad for a number of reasons. For starters it's different
   from how
   x86 handles debug registers - and I hate to be different just for
   the sake of being different.
   How does it work on x86?
 
  It overwrites more-or-less random breakpoints with its own ones, but
  leaves the others intact ;).
 
 Are you talking about software breakpoints or management of hardware debug
 registers?
 
   So if we do want to declare that debug registers are owned by
   either QEMU or the guest, we should change the semantics for all
   architectures.
   If we want to say that ownership of the registers is shared, we need
   a plan for how that would actually work.
 
  I think you're overengineering here :). When do people actually use
  gdbstub? Usually when they want to debug a broken guest. We can either
 
 * overengineer heavily and reduce the number of registers available
  to the guest to always have spares
 * overengineer a bit and turn off guest debugging completely when
  we use gdbstub
 * just leave as much alive as we can, hoping that it helps with the
  debugging
 
  Option 3 is what x86 does - and I think it's a reasonable approach.
  This is not an interface that needs to be 100% consistent and bullet
  proof, it's a best effort to enable you to debug as much as possible.
 
 I'm not insisting on 100% -- just hoping for some explanation/discussion about
 how it's intended to work for the cases where it can.
 
 How will MSR[DE] and MSRP[DEP] be handled?
 
 How would I go about telling QEMU/KVM that I don't want this shared mode,
 because I don't want guest to interfere with the debugging I'm trying to do 
 from
 QEMU?
 
 Will guest accesses to debug registers cause a userspace exit when guest_debug
 is enabled?
 
   I think we're in a path that is slow enough already to not worry
   about performance.
   It's not just about performance, but simplicity of use, and
   consistency of API.
  
   Oh, and it looks like there already exist one reg definitions and
   implementations for most of the debug registers.
 
  For BookE? Where?
 
 arch/powerpc/kvm/booke.c: KVM_REG_PPC_IACn, KVM_REG_PPC_DACn

I tried to quickly prototype what I think we want to do (this is not tested)

---
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index e8b3982..746b5c6 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -179,6 +179,9 @@ extern int kvmppc_xics_get_xive(struct kvm *kvm, u32 irq, 
u32 *server,
 extern int kvmppc_xics_int_on(struct kvm *kvm, u32 irq);
 extern int kvmppc_xics_int_off(struct kvm *kvm, u32 irq);
 
+void kvmppc_core_dequeue_debug(struct kvm_vcpu *vcpu);
+void kvmppc_core_queue_debug(struct kvm_vcpu *vcpu);
+
 /*
  * Cuts out inst bits with ordering according to spec.
  * That means the leftmost bit is zero. All given bits are included.
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 9f13056..0b7e4e4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -235,6 +235,16 @@ void kvmppc_core_dequeue_dec(struct kvm_vcpu *vcpu)
clear_bit(BOOKE_IRQPRIO_DECREMENTER, vcpu-arch.pending_exceptions);
 }
 
+void kvmppc_core_queue_debug(struct kvm_vcpu *vcpu)
+{
+   kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_DEBUG);
+}
+ 
+void kvmppc_core_dequeue_debug(struct kvm_vcpu *vcpu)
+{
+   clear_bit(BOOKE_IRQPRIO_DEBUG, vcpu-arch.pending_exceptions);
+}
+
 void kvmppc_core_queue_external(struct kvm_vcpu *vcpu,
 struct kvm_interrupt *irq)
 {
@@ -841,6 +851,20 @@ static int kvmppc_handle_debug(struct kvm_run *run, struct 
kvm_vcpu *vcpu)
struct debug_reg *dbg_reg = (vcpu-arch.shadow_dbg_reg);
u32 dbsr = vcpu-arch.dbsr;
 
+   /* Userspace (QEMU) is not using debug resource, so inject debug 
interrupt
+* directly to guest debug.
+*/
+   if (vcpu-guest_debug == 0) {
+   if (dbsr  (vcpu-arch.shared-msr  MSR_DE))
+   

Re: [PATCH 4/6] KVM: PPC: BOOK3S: HV: Use new functions for mapping/unmapping hpte in host

2014-07-02 Thread Aneesh Kumar K.V
Paul Mackerras pau...@samba.org writes:

 On Sun, Jun 29, 2014 at 04:47:33PM +0530, Aneesh Kumar K.V wrote:
 We want to use virtual page class key protection mechanism for
 indicating a MMIO mapped hpte entry or a guest hpte entry that is swapped out
 in the host. Those hptes will be marked valid, but have virtual page
 class key set to 30 or 31. These virtual page class numbers are
 configured in AMR to deny read/write. To accomodate such a change, add
 new functions that map, unmap and check whether a hpte is mapped in the
 host. This patch still use HPTE_V_VALID and HPTE_V_ABSENT and don't use
 virtual page class keys. But we want to differentiate in the code
 where we explicitly check for HPTE_V_VALID with places where we want to
 check whether the hpte is host mapped. This patch enables a closer
 review for such a change.

 [...]

  /* Check for pending invalidations under the rmap chain lock */
  if (kvm-arch.using_mmu_notifiers 
  mmu_notifier_retry(kvm, mmu_seq)) {
 -/* inval in progress, write a non-present HPTE */
 -pteh |= HPTE_V_ABSENT;
 -pteh = ~HPTE_V_VALID;
 +/*
 + * inval in progress in host, write host unmapped pte.
 + */
 +host_unmapped_hpte = 1;

 This isn't right.  We already have HPTE_V_VALID set here, and you now
 don't clear it here, and it doesn't get cleared by the
 __kvmppc_unmap_host_hpte() call below either.


Ok missed that. Will fix that in the next update. In the earlier version
I had kvmppc_unmap_host_hpte always clearing V_VALID. 

-aneesh

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/6] KVM: PPC: BOOK3S: Use hpte_update_in_progress to track invalid hpte during an hpte update

2014-07-02 Thread Aneesh Kumar K.V
Paul Mackerras pau...@samba.org writes:

 On Sun, Jun 29, 2014 at 04:47:34PM +0530, Aneesh Kumar K.V wrote:
 As per ISA, we first need to mark hpte invalid (V=0) before we update
 the hpte lower half bits. With virtual page class key protection mechanism 
 we want
 to send any fault other than key fault to guest directly without
 searching the hash page table. But then we can get NO_HPTE fault while
 we are updating the hpte. To track that add a vm specific atomic
 variable that we check in the fault path to always send the fault
 to host.

 [...]

 @@ -750,13 +751,15 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, 
 struct kvm_vcpu *vcpu,
  r = rcbits | ~(HPTE_R_R | HPTE_R_C);
  
  if (be64_to_cpu(hptep[0])  HPTE_V_VALID) {
 -/* HPTE was previously valid, so we need to invalidate it */
 +/*
 + * If we had mapped this hpte before, we now need to
 + * invalidate that.
 + */
  unlock_rmap(rmap);
 -/* Always mark HPTE_V_ABSENT before invalidating */
 -kvmppc_unmap_host_hpte(kvm, hptep);
  kvmppc_invalidate_hpte(kvm, hptep, index);
  /* don't lose previous R and C bits */
  r |= be64_to_cpu(hptep[1])  (HPTE_R_R | HPTE_R_C);
 +hpte_invalidated = true;

 So now we're not setting the ABSENT bit before invalidating the HPTE.
 That means that another guest vcpu could do an H_ENTER which could
 think that this HPTE is free and use it for another unrelated guest
 HPTE, which would be bad...

But henter looks at HPTE_V_HVLOCK, and we keep that set through out. But
I will double the code again to make sure it is safe in the above
scenario.


 @@ -1144,8 +1149,8 @@ static int kvm_test_clear_dirty_npages(struct kvm 
 *kvm, unsigned long *rmapp)
  npages_dirty = n;
  eieio();
  }
 -kvmppc_map_host_hpte(kvm, v, r);
 -hptep[0] = cpu_to_be64(v  ~HPTE_V_HVLOCK);
 +hptep[0] = cpu_to_be64(v  ~HPTE_V_LOCK);
 +atomic_dec(kvm-arch.hpte_update_in_progress);

 Why are we using LOCK rather than HVLOCK now?  (And why didn't you
 mention this change and its rationale in the patch description?)

Sorry, that is a typo. I intend to use HPTE_V_HVLOCK.

-aneesh

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/6] KVM: PPC: BOOK3S: HV: Deny virtual page class key update via h_protect

2014-07-02 Thread Aneesh Kumar K.V
Paul Mackerras pau...@samba.org writes:

 On Sun, Jun 29, 2014 at 04:47:31PM +0530, Aneesh Kumar K.V wrote:
 This makes it consistent with h_enter where we clear the key
 bits. We also want to use virtual page class key protection mechanism
 for indicating host page fault. For that we will be using key class
 index 30 and 31. So prevent the guest from updating key bits until
 we add proper support for virtual page class protection mechanism for
 the guest. This will not have any impact for PAPR linux guest because
 Linux guest currently don't use virtual page class key protection model

 As things stand, without this patch series, we do actually have
 everything we need in the kernel for guests to use virtual page class
 keys.  Arguably we should have a capability to tell userspace how many
 storage keys the guest can use, but that's the only missing piece as
 far as I can see.

yes.


 If we add such a capability, I can't see any reason why we should need
 to disable guest use of storage keys in this patchset.

With this patchset, we would need additonal changes to find out whether the key
fault happened because of the guest's usage of the key. I was planning to do
that as an add-on series to keep the changes in this minimal. Also since
linux didn't use keys i was not sure whether guest support of keys is an
important item.

-aneesh

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/2] KVM : powerpc/booke: Allow debug interrupt injection to guest

2014-07-02 Thread bharat.bhus...@freescale.com


 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Wednesday, July 02, 2014 5:07 PM
 To: Wood Scott-B07421; Alexander Graf
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: RE: [PATCH 2/2] KVM : powerpc/booke: Allow debug interrupt injection 
 to
 guest
 
 
  -Original Message-
  From: Wood Scott-B07421
  Sent: Tuesday, July 01, 2014 10:11 PM
  To: Alexander Graf
  Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org;
  k...@vger.kernel.org
  Subject: Re: [PATCH 2/2] KVM : powerpc/booke: Allow debug interrupt
  injection to guest
 
  On Tue, 2014-07-01 at 18:22 +0200, Alexander Graf wrote:
   On 01.07.14 17:35, Scott Wood wrote:
On Tue, 2014-07-01 at 17:04 +0200, Alexander Graf wrote:
On 01.07.14 16:58, Scott Wood wrote:
On Tue, 2014-07-01 at 08:23 +0200, Alexander Graf wrote:
I don't think QEMU should be aware of these limitations.
OK, but we should at least have some idea of how the whole thing
is supposed to work, in order to determine if this is the
correct behavior for QEMU.  I thought the model was that debug
resources are either owned by QEMU or by the guest, and in the
latter case, QEMU would never see the debug exception to begin with.
That's bad for a number of reasons. For starters it's different
from how
x86 handles debug registers - and I hate to be different just for
the sake of being different.
How does it work on x86?
  
   It overwrites more-or-less random breakpoints with its own ones, but
   leaves the others intact ;).
 
  Are you talking about software breakpoints or management of hardware
  debug registers?
 
So if we do want to declare that debug registers are owned by
either QEMU or the guest, we should change the semantics for all
architectures.
If we want to say that ownership of the registers is shared, we
need a plan for how that would actually work.
  
   I think you're overengineering here :). When do people actually use
   gdbstub? Usually when they want to debug a broken guest. We can
   either
  
  * overengineer heavily and reduce the number of registers
   available to the guest to always have spares
  * overengineer a bit and turn off guest debugging completely when
   we use gdbstub
  * just leave as much alive as we can, hoping that it helps with
   the debugging
  
   Option 3 is what x86 does - and I think it's a reasonable approach.
   This is not an interface that needs to be 100% consistent and bullet
   proof, it's a best effort to enable you to debug as much as possible.
 
  I'm not insisting on 100% -- just hoping for some
  explanation/discussion about how it's intended to work for the cases where 
  it
 can.
 
  How will MSR[DE] and MSRP[DEP] be handled?
 
  How would I go about telling QEMU/KVM that I don't want this shared
  mode, because I don't want guest to interfere with the debugging I'm
  trying to do from QEMU?
 
  Will guest accesses to debug registers cause a userspace exit when
  guest_debug is enabled?
 
I think we're in a path that is slow enough already to not worry
about performance.
It's not just about performance, but simplicity of use, and
consistency of API.
   
Oh, and it looks like there already exist one reg definitions and
implementations for most of the debug registers.
  
   For BookE? Where?
 
  arch/powerpc/kvm/booke.c: KVM_REG_PPC_IACn, KVM_REG_PPC_DACn
 
 I tried to quickly prototype what I think we want to do (this is not tested)

Hi Scott,

There is one problem which is stopping us to share debug resource between qemu 
and guest, looking for suggestions:
- As qemu is also using debug resource,  We have to set MSR_DE and set MSRP_DEP 
(guest will not be able to clear MSR_DE). So qemu set debug events will always 
cause the debug interrupts.
- Now guest is also using debug resources and for some reason if guest wants to 
clear MSR_DE (disable debug interrupt) But it will not be able to disable as 
MSRP_DEP is set and KVM will not come to know guest willingness to disable 
MSR_DE.
- If the debug interrupts occurs then we will exit to QEMU and this may not a 
QEMU set event so it will inject interrupt to guest (using one-reg or set-sregs)
- Now KVM, when handling one-reg/sregs request to inject debug interrupt, do 
not know whether guest can handle the debug interrupt or not (as guest might 
have tried to set/clear MSR_DE).

Thanks
-Bharat

 
 ---
 diff --git a/arch/powerpc/include/asm/kvm_ppc.h
 b/arch/powerpc/include/asm/kvm_ppc.h
 index e8b3982..746b5c6 100644
 --- a/arch/powerpc/include/asm/kvm_ppc.h
 +++ b/arch/powerpc/include/asm/kvm_ppc.h
 @@ -179,6 +179,9 @@ extern int kvmppc_xics_get_xive(struct kvm *kvm, u32 irq,
 u32 *server,  extern int kvmppc_xics_int_on(struct kvm *kvm, u32 irq);  extern
 int kvmppc_xics_int_off(struct kvm *kvm, u32 irq);
 
 +void kvmppc_core_dequeue_debug(struct kvm_vcpu *vcpu); void