Re: [PATCH v5 2/2] KVM: PPC: Exit guest upon MCE when FWNMI capability is enabled
On 2017-01-27 14:14:49 Fri, Paul Mackerras wrote: > On Wed, Jan 18, 2017 at 11:19:26AM +0530, Mahesh Jagannath Salgaonkar wrote: > > On 01/16/2017 10:05 AM, Paul Mackerras wrote: > > > On Fri, Jan 13, 2017 at 04:51:45PM +0530, Aravinda Prasad wrote: > > [snip] > > > >> case BOOK3S_INTERRUPT_MACHINE_CHECK: > > >> +/* Exit to guest with KVM_EXIT_NMI as exit reason */ > > >> +run->exit_reason = KVM_EXIT_NMI; > > >> +r = RESUME_HOST; > > >> /* > > >> - * Deliver a machine check interrupt to the guest. > > >> - * We have to do this, even if the host has handled the > > >> - * machine check, because machine checks use SRR0/1 and > > >> - * the interrupt might have trashed guest state in them. > > >> + * Invoke host-kernel handler to perform any host-side > > >> + * handling before exiting the guest. > > >> */ > > >> -kvmppc_book3s_queue_irqprio(vcpu, > > >> - > > >> BOOK3S_INTERRUPT_MACHINE_CHECK); > > >> -r = RESUME_GUEST; > > >> +kvmppc_machine_check_hook(); > > > > > > Note that this won't necessarily be called on the same CPU that > > > received the machine check. This will be called on thread 0 of the > > > core (or subcore), whereas the machine check could have occurred on > > > some other thread. Are you sure that the machine check handling code > > > will be OK with that? > > > > That will have only one problem. get_mce_event() from > > opal_machine_check() may not be able to pull mce event for error on > > non-zero thread. We should hook the mce event into vcpu structure during > > kvmppc_realmode_machine_check() and then pass it to > > ppc_md.machine_check_exception() as an additional argument. > > To move things along... > > Mahesh, how would we get hold of the mce event from real-mode assembly > code? What function would we need to call to get the event? Could > you write some code (or at least some pseudo-code) to illustrate how > it would be done? I am thinking of passing all the handled/unhandled mce errors to guest if fwnmi capability is supported by QEMU. Otherwise fall back to the old behaviour. I have done modifications to this patch which is almost ready. Planning to pass additional info to QEMU about whether MCE error was recovered or not, so that QEMU can correctly set the RTAS DISPOSITION. Below is the modified patch. It uses one bit from kvm_run->flags to pass disposition information (0 = unrecovred, 1 = recovred). But RTAS disposition can have 3 values as per PAPR. The 3rd value indicates limited recovery. For PAPR compliance, I am thiniking of using two bits from kvm_run->flags (b00 = Fully recovered, b01 = Limited recovery, b10= Not recovered.). Let me know what you think about this approach. -- Enhance KVM to cause a guest exit with KVM_EXIT_NMI From: Aravinda Prasadexit reason upon a machine check exception (MCE) in the guest address space if the KVM_CAP_PPC_FWNMI capability is enabled (instead of delivering a 0x200 interrupt to guest). This enables QEMU to build error log and deliver machine check exception to guest via guest registered machine check handler. This approach simplifies the delivery of machine check exception to guest OS compared to the earlier approach of KVM directly invoking 0x200 guest interrupt vector. This design/approach is based on the feedback for the QEMU patches to handle machine check exception. Details of earlier approach of handling machine check exception in QEMU and related discussions can be found at: https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html Note: This patch introduces a hook which is invoked at the time of guest exit to facilitate the host-side handling of machine check exception before the exception is passed on to the guest. Hence, the host-side handling which was performed earlier via machine_check_fwnmi is removed. The reasons for this approach is (i) it is not possible to distinguish whether the exception occurred in the guest or the host from the pt_regs passed on the machine_check_exception(). Hence machine_check_exception() calls panic, instead of passing on the exception to the guest, if the machine check exception is not recoverable. (ii) the approach introduced in this patch gives opportunity to the host kernel to perform actions in virtual mode before passing on the exception to the guest. This approach does not require complex tweaks to machine_check_fwnmi and friends. Signed-off-by: Aravinda Prasad Reviewed-by: David Gibson Signed-off-by: Mahesh Salgaonkar --- arch/powerpc/include/asm/kvm_host.h |2 + arch/powerpc/include/asm/machdep.h |
Re: [PATCH v5 2/2] KVM: PPC: Exit guest upon MCE when FWNMI capability is enabled
On Wed, Jan 18, 2017 at 11:19:26AM +0530, Mahesh Jagannath Salgaonkar wrote: > On 01/16/2017 10:05 AM, Paul Mackerras wrote: > > On Fri, Jan 13, 2017 at 04:51:45PM +0530, Aravinda Prasad wrote: [snip] > >>case BOOK3S_INTERRUPT_MACHINE_CHECK: > >> + /* Exit to guest with KVM_EXIT_NMI as exit reason */ > >> + run->exit_reason = KVM_EXIT_NMI; > >> + r = RESUME_HOST; > >>/* > >> - * Deliver a machine check interrupt to the guest. > >> - * We have to do this, even if the host has handled the > >> - * machine check, because machine checks use SRR0/1 and > >> - * the interrupt might have trashed guest state in them. > >> + * Invoke host-kernel handler to perform any host-side > >> + * handling before exiting the guest. > >> */ > >> - kvmppc_book3s_queue_irqprio(vcpu, > >> - BOOK3S_INTERRUPT_MACHINE_CHECK); > >> - r = RESUME_GUEST; > >> + kvmppc_machine_check_hook(); > > > > Note that this won't necessarily be called on the same CPU that > > received the machine check. This will be called on thread 0 of the > > core (or subcore), whereas the machine check could have occurred on > > some other thread. Are you sure that the machine check handling code > > will be OK with that? > > That will have only one problem. get_mce_event() from > opal_machine_check() may not be able to pull mce event for error on > non-zero thread. We should hook the mce event into vcpu structure during > kvmppc_realmode_machine_check() and then pass it to > ppc_md.machine_check_exception() as an additional argument. To move things along... Mahesh, how would we get hold of the mce event from real-mode assembly code? What function would we need to call to get the event? Could you write some code (or at least some pseudo-code) to illustrate how it would be done? Thanks, Paul.
Re: [PATCH v5 2/2] KVM: PPC: Exit guest upon MCE when FWNMI capability is enabled
On 01/16/2017 10:05 AM, Paul Mackerras wrote: > On Fri, Jan 13, 2017 at 04:51:45PM +0530, Aravinda Prasad wrote: >> Enhance KVM to cause a guest exit with KVM_EXIT_NMI >> exit reason upon a machine check exception (MCE) in >> the guest address space if the KVM_CAP_PPC_FWNMI >> capability is enabled (instead of delivering a 0x200 >> interrupt to guest). This enables QEMU to build error >> log and deliver machine check exception to guest via >> guest registered machine check handler. >> >> This approach simplifies the delivery of machine >> check exception to guest OS compared to the earlier >> approach of KVM directly invoking 0x200 guest interrupt >> vector. >> >> This design/approach is based on the feedback for the >> QEMU patches to handle machine check exception. Details >> of earlier approach of handling machine check exception >> in QEMU and related discussions can be found at: >> >> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html >> >> Note: >> >> This patch introduces a hook which is invoked at the time >> of guest exit to facilitate the host-side handling of >> machine check exception before the exception is passed >> on to the guest. Hence, the host-side handling which was >> performed earlier via machine_check_fwnmi is removed. >> >> The reasons for this approach is (i) it is not possible >> to distinguish whether the exception occurred in the >> guest or the host from the pt_regs passed on the >> machine_check_exception(). Hence machine_check_exception() >> calls panic, instead of passing on the exception to >> the guest, if the machine check exception is not >> recoverable. (ii) the approach introduced in this >> patch gives opportunity to the host kernel to perform >> actions in virtual mode before passing on the exception >> to the guest. This approach does not require complex >> tweaks to machine_check_fwnmi and friends. >> >> Signed-off-by: Aravinda Prasad>> Reviewed-by: David Gibson > > This patch mostly looks OK. I have a few relatively minor comments > below. > >> --- >> arch/powerpc/kvm/book3s_hv.c| 27 +- >> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 47 >> --- >> arch/powerpc/platforms/powernv/opal.c | 10 +++ >> 3 files changed, 54 insertions(+), 30 deletions(-) >> >> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c >> index 3686471..cae4921 100644 >> --- a/arch/powerpc/kvm/book3s_hv.c >> +++ b/arch/powerpc/kvm/book3s_hv.c >> @@ -123,6 +123,7 @@ MODULE_PARM_DESC(halt_poll_ns_shrink, "Factor halt poll >> time is shrunk by"); >> >> static void kvmppc_end_cede(struct kvm_vcpu *vcpu); >> static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu); >> +static void kvmppc_machine_check_hook(void); >> >> static inline struct kvm_vcpu *next_runnable_thread(struct kvmppc_vcore *vc, >> int *ip) >> @@ -954,15 +955,14 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, >> struct kvm_vcpu *vcpu, >> r = RESUME_GUEST; >> break; >> case BOOK3S_INTERRUPT_MACHINE_CHECK: >> +/* Exit to guest with KVM_EXIT_NMI as exit reason */ >> +run->exit_reason = KVM_EXIT_NMI; >> +r = RESUME_HOST; >> /* >> - * Deliver a machine check interrupt to the guest. >> - * We have to do this, even if the host has handled the >> - * machine check, because machine checks use SRR0/1 and >> - * the interrupt might have trashed guest state in them. >> + * Invoke host-kernel handler to perform any host-side >> + * handling before exiting the guest. >> */ >> -kvmppc_book3s_queue_irqprio(vcpu, >> -BOOK3S_INTERRUPT_MACHINE_CHECK); >> -r = RESUME_GUEST; >> +kvmppc_machine_check_hook(); > > Note that this won't necessarily be called on the same CPU that > received the machine check. This will be called on thread 0 of the > core (or subcore), whereas the machine check could have occurred on > some other thread. Are you sure that the machine check handling code > will be OK with that? That will have only one problem. get_mce_event() from opal_machine_check() may not be able to pull mce event for error on non-zero thread. We should hook the mce event into vcpu structure during kvmppc_realmode_machine_check() and then pass it to ppc_md.machine_check_exception() as an additional argument. > >> break; >> case BOOK3S_INTERRUPT_PROGRAM: >> { >> @@ -3491,6 +3491,19 @@ static void kvmppc_irq_bypass_del_producer_hv(struct >> irq_bypass_consumer *cons, >> } >> #endif >> >> +/* >> + * Hook to handle machine check exceptions occurred inside a guest. >> + * This hook is invoked from host virtual mode from KVM before exiting >> + * the guest with KVM_EXIT_NMI exit reason. This
Re: [PATCH v5 2/2] KVM: PPC: Exit guest upon MCE when FWNMI capability is enabled
On Fri, Jan 13, 2017 at 04:51:45PM +0530, Aravinda Prasad wrote: > Enhance KVM to cause a guest exit with KVM_EXIT_NMI > exit reason upon a machine check exception (MCE) in > the guest address space if the KVM_CAP_PPC_FWNMI > capability is enabled (instead of delivering a 0x200 > interrupt to guest). This enables QEMU to build error > log and deliver machine check exception to guest via > guest registered machine check handler. > > This approach simplifies the delivery of machine > check exception to guest OS compared to the earlier > approach of KVM directly invoking 0x200 guest interrupt > vector. > > This design/approach is based on the feedback for the > QEMU patches to handle machine check exception. Details > of earlier approach of handling machine check exception > in QEMU and related discussions can be found at: > > https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html > > Note: > > This patch introduces a hook which is invoked at the time > of guest exit to facilitate the host-side handling of > machine check exception before the exception is passed > on to the guest. Hence, the host-side handling which was > performed earlier via machine_check_fwnmi is removed. > > The reasons for this approach is (i) it is not possible > to distinguish whether the exception occurred in the > guest or the host from the pt_regs passed on the > machine_check_exception(). Hence machine_check_exception() > calls panic, instead of passing on the exception to > the guest, if the machine check exception is not > recoverable. (ii) the approach introduced in this > patch gives opportunity to the host kernel to perform > actions in virtual mode before passing on the exception > to the guest. This approach does not require complex > tweaks to machine_check_fwnmi and friends. > > Signed-off-by: Aravinda Prasad> Reviewed-by: David Gibson This patch mostly looks OK. I have a few relatively minor comments below. > --- > arch/powerpc/kvm/book3s_hv.c| 27 +- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 47 > --- > arch/powerpc/platforms/powernv/opal.c | 10 +++ > 3 files changed, 54 insertions(+), 30 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c > index 3686471..cae4921 100644 > --- a/arch/powerpc/kvm/book3s_hv.c > +++ b/arch/powerpc/kvm/book3s_hv.c > @@ -123,6 +123,7 @@ MODULE_PARM_DESC(halt_poll_ns_shrink, "Factor halt poll > time is shrunk by"); > > static void kvmppc_end_cede(struct kvm_vcpu *vcpu); > static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu); > +static void kvmppc_machine_check_hook(void); > > static inline struct kvm_vcpu *next_runnable_thread(struct kvmppc_vcore *vc, > int *ip) > @@ -954,15 +955,14 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, > struct kvm_vcpu *vcpu, > r = RESUME_GUEST; > break; > case BOOK3S_INTERRUPT_MACHINE_CHECK: > + /* Exit to guest with KVM_EXIT_NMI as exit reason */ > + run->exit_reason = KVM_EXIT_NMI; > + r = RESUME_HOST; > /* > - * Deliver a machine check interrupt to the guest. > - * We have to do this, even if the host has handled the > - * machine check, because machine checks use SRR0/1 and > - * the interrupt might have trashed guest state in them. > + * Invoke host-kernel handler to perform any host-side > + * handling before exiting the guest. >*/ > - kvmppc_book3s_queue_irqprio(vcpu, > - BOOK3S_INTERRUPT_MACHINE_CHECK); > - r = RESUME_GUEST; > + kvmppc_machine_check_hook(); Note that this won't necessarily be called on the same CPU that received the machine check. This will be called on thread 0 of the core (or subcore), whereas the machine check could have occurred on some other thread. Are you sure that the machine check handling code will be OK with that? > break; > case BOOK3S_INTERRUPT_PROGRAM: > { > @@ -3491,6 +3491,19 @@ static void kvmppc_irq_bypass_del_producer_hv(struct > irq_bypass_consumer *cons, > } > #endif > > +/* > + * Hook to handle machine check exceptions occurred inside a guest. > + * This hook is invoked from host virtual mode from KVM before exiting > + * the guest with KVM_EXIT_NMI exit reason. This gives an opportunity > + * for the host to take action (if any) before passing on the machine > + * check exception to the guest kernel. > + */ > +static void kvmppc_machine_check_hook(void) > +{ > + if (ppc_md.machine_check_exception) > + ppc_md.machine_check_exception(NULL); > +} What is the advantage of having this as a separate function, as opposed to just putting those two lines of code in line in the one place where
[PATCH v5 2/2] KVM: PPC: Exit guest upon MCE when FWNMI capability is enabled
Enhance KVM to cause a guest exit with KVM_EXIT_NMI exit reason upon a machine check exception (MCE) in the guest address space if the KVM_CAP_PPC_FWNMI capability is enabled (instead of delivering a 0x200 interrupt to guest). This enables QEMU to build error log and deliver machine check exception to guest via guest registered machine check handler. This approach simplifies the delivery of machine check exception to guest OS compared to the earlier approach of KVM directly invoking 0x200 guest interrupt vector. This design/approach is based on the feedback for the QEMU patches to handle machine check exception. Details of earlier approach of handling machine check exception in QEMU and related discussions can be found at: https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html Note: This patch introduces a hook which is invoked at the time of guest exit to facilitate the host-side handling of machine check exception before the exception is passed on to the guest. Hence, the host-side handling which was performed earlier via machine_check_fwnmi is removed. The reasons for this approach is (i) it is not possible to distinguish whether the exception occurred in the guest or the host from the pt_regs passed on the machine_check_exception(). Hence machine_check_exception() calls panic, instead of passing on the exception to the guest, if the machine check exception is not recoverable. (ii) the approach introduced in this patch gives opportunity to the host kernel to perform actions in virtual mode before passing on the exception to the guest. This approach does not require complex tweaks to machine_check_fwnmi and friends. Signed-off-by: Aravinda PrasadReviewed-by: David Gibson --- arch/powerpc/kvm/book3s_hv.c| 27 +- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 47 --- arch/powerpc/platforms/powernv/opal.c | 10 +++ 3 files changed, 54 insertions(+), 30 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 3686471..cae4921 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -123,6 +123,7 @@ MODULE_PARM_DESC(halt_poll_ns_shrink, "Factor halt poll time is shrunk by"); static void kvmppc_end_cede(struct kvm_vcpu *vcpu); static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu); +static void kvmppc_machine_check_hook(void); static inline struct kvm_vcpu *next_runnable_thread(struct kvmppc_vcore *vc, int *ip) @@ -954,15 +955,14 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu, r = RESUME_GUEST; break; case BOOK3S_INTERRUPT_MACHINE_CHECK: + /* Exit to guest with KVM_EXIT_NMI as exit reason */ + run->exit_reason = KVM_EXIT_NMI; + r = RESUME_HOST; /* -* Deliver a machine check interrupt to the guest. -* We have to do this, even if the host has handled the -* machine check, because machine checks use SRR0/1 and -* the interrupt might have trashed guest state in them. +* Invoke host-kernel handler to perform any host-side +* handling before exiting the guest. */ - kvmppc_book3s_queue_irqprio(vcpu, - BOOK3S_INTERRUPT_MACHINE_CHECK); - r = RESUME_GUEST; + kvmppc_machine_check_hook(); break; case BOOK3S_INTERRUPT_PROGRAM: { @@ -3491,6 +3491,19 @@ static void kvmppc_irq_bypass_del_producer_hv(struct irq_bypass_consumer *cons, } #endif +/* + * Hook to handle machine check exceptions occurred inside a guest. + * This hook is invoked from host virtual mode from KVM before exiting + * the guest with KVM_EXIT_NMI exit reason. This gives an opportunity + * for the host to take action (if any) before passing on the machine + * check exception to the guest kernel. + */ +static void kvmppc_machine_check_hook(void) +{ + if (ppc_md.machine_check_exception) + ppc_md.machine_check_exception(NULL); +} + static long kvm_arch_vm_ioctl_hv(struct file *filp, unsigned int ioctl, unsigned long arg) { diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index c3c1d1b..9b41390 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -134,21 +134,18 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) stb r0, HSTATE_HWTHREAD_REQ(r13) /* -* For external and machine check interrupts, we need -* to call the Linux handler to process the interrupt. -* We do that by jumping to absolute address 0x500 for -* external interrupts, or the machine_check_fwnmi label -* for