RE: [PATCH 2/2] KVM : powerpc/booke: Allow debug interrupt injection to guest
-Original Message- From: Wood Scott-B07421 Sent: Tuesday, July 01, 2014 10:11 PM To: Alexander Graf Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH 2/2] KVM : powerpc/booke: Allow debug interrupt injection to guest On Tue, 2014-07-01 at 18:22 +0200, Alexander Graf wrote: On 01.07.14 17:35, Scott Wood wrote: On Tue, 2014-07-01 at 17:04 +0200, Alexander Graf wrote: On 01.07.14 16:58, Scott Wood wrote: On Tue, 2014-07-01 at 08:23 +0200, Alexander Graf wrote: I don't think QEMU should be aware of these limitations. OK, but we should at least have some idea of how the whole thing is supposed to work, in order to determine if this is the correct behavior for QEMU. I thought the model was that debug resources are either owned by QEMU or by the guest, and in the latter case, QEMU would never see the debug exception to begin with. That's bad for a number of reasons. For starters it's different from how x86 handles debug registers - and I hate to be different just for the sake of being different. How does it work on x86? It overwrites more-or-less random breakpoints with its own ones, but leaves the others intact ;). Are you talking about software breakpoints or management of hardware debug registers? So if we do want to declare that debug registers are owned by either QEMU or the guest, we should change the semantics for all architectures. If we want to say that ownership of the registers is shared, we need a plan for how that would actually work. I think you're overengineering here :). When do people actually use gdbstub? Usually when they want to debug a broken guest. We can either * overengineer heavily and reduce the number of registers available to the guest to always have spares * overengineer a bit and turn off guest debugging completely when we use gdbstub * just leave as much alive as we can, hoping that it helps with the debugging Option 3 is what x86 does - and I think it's a reasonable approach. This is not an interface that needs to be 100% consistent and bullet proof, it's a best effort to enable you to debug as much as possible. I'm not insisting on 100% -- just hoping for some explanation/discussion about how it's intended to work for the cases where it can. How will MSR[DE] and MSRP[DEP] be handled? How would I go about telling QEMU/KVM that I don't want this shared mode, because I don't want guest to interfere with the debugging I'm trying to do from QEMU? Will guest accesses to debug registers cause a userspace exit when guest_debug is enabled? I think we're in a path that is slow enough already to not worry about performance. It's not just about performance, but simplicity of use, and consistency of API. Oh, and it looks like there already exist one reg definitions and implementations for most of the debug registers. For BookE? Where? arch/powerpc/kvm/booke.c: KVM_REG_PPC_IACn, KVM_REG_PPC_DACn I tried to quickly prototype what I think we want to do (this is not tested) --- diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index e8b3982..746b5c6 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -179,6 +179,9 @@ extern int kvmppc_xics_get_xive(struct kvm *kvm, u32 irq, u32 *server, extern int kvmppc_xics_int_on(struct kvm *kvm, u32 irq); extern int kvmppc_xics_int_off(struct kvm *kvm, u32 irq); +void kvmppc_core_dequeue_debug(struct kvm_vcpu *vcpu); +void kvmppc_core_queue_debug(struct kvm_vcpu *vcpu); + /* * Cuts out inst bits with ordering according to spec. * That means the leftmost bit is zero. All given bits are included. diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 9f13056..0b7e4e4 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -235,6 +235,16 @@ void kvmppc_core_dequeue_dec(struct kvm_vcpu *vcpu) clear_bit(BOOKE_IRQPRIO_DECREMENTER, vcpu-arch.pending_exceptions); } +void kvmppc_core_queue_debug(struct kvm_vcpu *vcpu) +{ + kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_DEBUG); +} + +void kvmppc_core_dequeue_debug(struct kvm_vcpu *vcpu) +{ + clear_bit(BOOKE_IRQPRIO_DEBUG, vcpu-arch.pending_exceptions); +} + void kvmppc_core_queue_external(struct kvm_vcpu *vcpu, struct kvm_interrupt *irq) { @@ -841,6 +851,20 @@ static int kvmppc_handle_debug(struct kvm_run *run, struct kvm_vcpu *vcpu) struct debug_reg *dbg_reg = (vcpu-arch.shadow_dbg_reg); u32 dbsr = vcpu-arch.dbsr; + /* Userspace (QEMU) is not using debug resource, so inject debug interrupt +* directly to guest debug. +*/ + if (vcpu-guest_debug == 0) { + if (dbsr (vcpu-arch.shared-msr MSR_DE)) +
Re: [PATCH 4/6] KVM: PPC: BOOK3S: HV: Use new functions for mapping/unmapping hpte in host
Paul Mackerras pau...@samba.org writes: On Sun, Jun 29, 2014 at 04:47:33PM +0530, Aneesh Kumar K.V wrote: We want to use virtual page class key protection mechanism for indicating a MMIO mapped hpte entry or a guest hpte entry that is swapped out in the host. Those hptes will be marked valid, but have virtual page class key set to 30 or 31. These virtual page class numbers are configured in AMR to deny read/write. To accomodate such a change, add new functions that map, unmap and check whether a hpte is mapped in the host. This patch still use HPTE_V_VALID and HPTE_V_ABSENT and don't use virtual page class keys. But we want to differentiate in the code where we explicitly check for HPTE_V_VALID with places where we want to check whether the hpte is host mapped. This patch enables a closer review for such a change. [...] /* Check for pending invalidations under the rmap chain lock */ if (kvm-arch.using_mmu_notifiers mmu_notifier_retry(kvm, mmu_seq)) { -/* inval in progress, write a non-present HPTE */ -pteh |= HPTE_V_ABSENT; -pteh = ~HPTE_V_VALID; +/* + * inval in progress in host, write host unmapped pte. + */ +host_unmapped_hpte = 1; This isn't right. We already have HPTE_V_VALID set here, and you now don't clear it here, and it doesn't get cleared by the __kvmppc_unmap_host_hpte() call below either. Ok missed that. Will fix that in the next update. In the earlier version I had kvmppc_unmap_host_hpte always clearing V_VALID. -aneesh -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/6] KVM: PPC: BOOK3S: Use hpte_update_in_progress to track invalid hpte during an hpte update
Paul Mackerras pau...@samba.org writes: On Sun, Jun 29, 2014 at 04:47:34PM +0530, Aneesh Kumar K.V wrote: As per ISA, we first need to mark hpte invalid (V=0) before we update the hpte lower half bits. With virtual page class key protection mechanism we want to send any fault other than key fault to guest directly without searching the hash page table. But then we can get NO_HPTE fault while we are updating the hpte. To track that add a vm specific atomic variable that we check in the fault path to always send the fault to host. [...] @@ -750,13 +751,15 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu, r = rcbits | ~(HPTE_R_R | HPTE_R_C); if (be64_to_cpu(hptep[0]) HPTE_V_VALID) { -/* HPTE was previously valid, so we need to invalidate it */ +/* + * If we had mapped this hpte before, we now need to + * invalidate that. + */ unlock_rmap(rmap); -/* Always mark HPTE_V_ABSENT before invalidating */ -kvmppc_unmap_host_hpte(kvm, hptep); kvmppc_invalidate_hpte(kvm, hptep, index); /* don't lose previous R and C bits */ r |= be64_to_cpu(hptep[1]) (HPTE_R_R | HPTE_R_C); +hpte_invalidated = true; So now we're not setting the ABSENT bit before invalidating the HPTE. That means that another guest vcpu could do an H_ENTER which could think that this HPTE is free and use it for another unrelated guest HPTE, which would be bad... But henter looks at HPTE_V_HVLOCK, and we keep that set through out. But I will double the code again to make sure it is safe in the above scenario. @@ -1144,8 +1149,8 @@ static int kvm_test_clear_dirty_npages(struct kvm *kvm, unsigned long *rmapp) npages_dirty = n; eieio(); } -kvmppc_map_host_hpte(kvm, v, r); -hptep[0] = cpu_to_be64(v ~HPTE_V_HVLOCK); +hptep[0] = cpu_to_be64(v ~HPTE_V_LOCK); +atomic_dec(kvm-arch.hpte_update_in_progress); Why are we using LOCK rather than HVLOCK now? (And why didn't you mention this change and its rationale in the patch description?) Sorry, that is a typo. I intend to use HPTE_V_HVLOCK. -aneesh -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/6] KVM: PPC: BOOK3S: HV: Deny virtual page class key update via h_protect
Paul Mackerras pau...@samba.org writes: On Sun, Jun 29, 2014 at 04:47:31PM +0530, Aneesh Kumar K.V wrote: This makes it consistent with h_enter where we clear the key bits. We also want to use virtual page class key protection mechanism for indicating host page fault. For that we will be using key class index 30 and 31. So prevent the guest from updating key bits until we add proper support for virtual page class protection mechanism for the guest. This will not have any impact for PAPR linux guest because Linux guest currently don't use virtual page class key protection model As things stand, without this patch series, we do actually have everything we need in the kernel for guests to use virtual page class keys. Arguably we should have a capability to tell userspace how many storage keys the guest can use, but that's the only missing piece as far as I can see. yes. If we add such a capability, I can't see any reason why we should need to disable guest use of storage keys in this patchset. With this patchset, we would need additonal changes to find out whether the key fault happened because of the guest's usage of the key. I was planning to do that as an add-on series to keep the changes in this minimal. Also since linux didn't use keys i was not sure whether guest support of keys is an important item. -aneesh -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2] KVM : powerpc/booke: Allow debug interrupt injection to guest
-Original Message- From: Bhushan Bharat-R65777 Sent: Wednesday, July 02, 2014 5:07 PM To: Wood Scott-B07421; Alexander Graf Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: RE: [PATCH 2/2] KVM : powerpc/booke: Allow debug interrupt injection to guest -Original Message- From: Wood Scott-B07421 Sent: Tuesday, July 01, 2014 10:11 PM To: Alexander Graf Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH 2/2] KVM : powerpc/booke: Allow debug interrupt injection to guest On Tue, 2014-07-01 at 18:22 +0200, Alexander Graf wrote: On 01.07.14 17:35, Scott Wood wrote: On Tue, 2014-07-01 at 17:04 +0200, Alexander Graf wrote: On 01.07.14 16:58, Scott Wood wrote: On Tue, 2014-07-01 at 08:23 +0200, Alexander Graf wrote: I don't think QEMU should be aware of these limitations. OK, but we should at least have some idea of how the whole thing is supposed to work, in order to determine if this is the correct behavior for QEMU. I thought the model was that debug resources are either owned by QEMU or by the guest, and in the latter case, QEMU would never see the debug exception to begin with. That's bad for a number of reasons. For starters it's different from how x86 handles debug registers - and I hate to be different just for the sake of being different. How does it work on x86? It overwrites more-or-less random breakpoints with its own ones, but leaves the others intact ;). Are you talking about software breakpoints or management of hardware debug registers? So if we do want to declare that debug registers are owned by either QEMU or the guest, we should change the semantics for all architectures. If we want to say that ownership of the registers is shared, we need a plan for how that would actually work. I think you're overengineering here :). When do people actually use gdbstub? Usually when they want to debug a broken guest. We can either * overengineer heavily and reduce the number of registers available to the guest to always have spares * overengineer a bit and turn off guest debugging completely when we use gdbstub * just leave as much alive as we can, hoping that it helps with the debugging Option 3 is what x86 does - and I think it's a reasonable approach. This is not an interface that needs to be 100% consistent and bullet proof, it's a best effort to enable you to debug as much as possible. I'm not insisting on 100% -- just hoping for some explanation/discussion about how it's intended to work for the cases where it can. How will MSR[DE] and MSRP[DEP] be handled? How would I go about telling QEMU/KVM that I don't want this shared mode, because I don't want guest to interfere with the debugging I'm trying to do from QEMU? Will guest accesses to debug registers cause a userspace exit when guest_debug is enabled? I think we're in a path that is slow enough already to not worry about performance. It's not just about performance, but simplicity of use, and consistency of API. Oh, and it looks like there already exist one reg definitions and implementations for most of the debug registers. For BookE? Where? arch/powerpc/kvm/booke.c: KVM_REG_PPC_IACn, KVM_REG_PPC_DACn I tried to quickly prototype what I think we want to do (this is not tested) Hi Scott, There is one problem which is stopping us to share debug resource between qemu and guest, looking for suggestions: - As qemu is also using debug resource, We have to set MSR_DE and set MSRP_DEP (guest will not be able to clear MSR_DE). So qemu set debug events will always cause the debug interrupts. - Now guest is also using debug resources and for some reason if guest wants to clear MSR_DE (disable debug interrupt) But it will not be able to disable as MSRP_DEP is set and KVM will not come to know guest willingness to disable MSR_DE. - If the debug interrupts occurs then we will exit to QEMU and this may not a QEMU set event so it will inject interrupt to guest (using one-reg or set-sregs) - Now KVM, when handling one-reg/sregs request to inject debug interrupt, do not know whether guest can handle the debug interrupt or not (as guest might have tried to set/clear MSR_DE). Thanks -Bharat --- diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index e8b3982..746b5c6 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -179,6 +179,9 @@ extern int kvmppc_xics_get_xive(struct kvm *kvm, u32 irq, u32 *server, extern int kvmppc_xics_int_on(struct kvm *kvm, u32 irq); extern int kvmppc_xics_int_off(struct kvm *kvm, u32 irq); +void kvmppc_core_dequeue_debug(struct kvm_vcpu *vcpu); void