Re: [PATCH 41/70] x86/sev-es: Add Runtime #VC Exception Handler

2020-03-19 Thread Joerg Roedel
On Thu, Mar 19, 2020 at 11:43:20AM -0700, Andy Lutomirski wrote:
> Or future generations could have enough hardware support for debugging
> that #DB doesn't need to be intercepted or can be re-injected
> correctly with the #DB vector.

Yeah, the problem is, the GHCB spec suggests the single-step-over-iret
way to re-enable the NMI window and requires intercepting #DB for it. So
the hypervisor probably still has to intercept it, even when debug
support is added some day. I need to think more about this.

Regards,

Joerg
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 41/70] x86/sev-es: Add Runtime #VC Exception Handler

2020-03-19 Thread Andy Lutomirski
On Thu, Mar 19, 2020 at 9:24 AM Joerg Roedel  wrote:
>
> On Thu, Mar 19, 2020 at 08:44:03AM -0700, Andy Lutomirski wrote:
> > On Thu, Mar 19, 2020 at 2:14 AM Joerg Roedel  wrote:
> > >
> > > From: Tom Lendacky 
> > >
> > > Add the handler for #VC exceptions invoked at runtime.
> >
> > If I read this correctly, this does not use IST.  If that's true, I
> > don't see how this can possibly work.  There at least two nasty cases
> > that come to mind:
> >
> > 1. SYSCALL followed by NMI.  The NMI IRET hack gets to #VC and we
> > explode.  This is fixable by getting rid of the NMI EFLAGS.TF hack.
>
> Not an issue in this patch-set, the confusion comes from the fact that I
> left some parts of the single-step-over-iret code in the patch. But it
> is not used. The NMI handling in this patch-set sends the NMI-complete
> message before the IRET, when the kernel is still in a safe environment
> (kernel stack, kernel cr3).

Got it!

>
> > 2. tools/testing/selftests/x86/mov_ss_trap_64.  User code does MOV
> > (addr), SS; SYSCALL, where addr has a data breakpoint.  We get #DB
> > promoted to #VC with no stack.
>
> Also not an issue, as debugging is not supported at the moment in SEV-ES
> guests (hardware has no way yet to save/restore the debug registers
> across #VMEXITs). But this will change with future hardware. If you look
> at the implementation for dr7 read/write events, you see that the dr7
> value is cached and returned, but does not make it to the hardware dr7.

Eek.  This would probably benefit from some ptrace / perf logic to
prevent the kernel or userspace from thinking that debugging works.

I guess this means that #DB only happens due to TF or INT01.  I
suppose this is probably okay.

>
> I though about using IST for the #VC handler, but the implications for
> nesting #VC handlers made me decide against it. But for future hardware
> that supports debugging inside SEV-ES guests it will be an issue. I'll
> think about how to fix the problem, it probably has to be IST :(

Or future generations could have enough hardware support for debugging
that #DB doesn't need to be intercepted or can be re-injected
correctly with the #DB vector.

>
> Regards,
>
> Joerg
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 41/70] x86/sev-es: Add Runtime #VC Exception Handler

2020-03-19 Thread Joerg Roedel
On Thu, Mar 19, 2020 at 08:44:03AM -0700, Andy Lutomirski wrote:
> On Thu, Mar 19, 2020 at 2:14 AM Joerg Roedel  wrote:
> >
> > From: Tom Lendacky 
> >
> > Add the handler for #VC exceptions invoked at runtime.
> 
> If I read this correctly, this does not use IST.  If that's true, I
> don't see how this can possibly work.  There at least two nasty cases
> that come to mind:
> 
> 1. SYSCALL followed by NMI.  The NMI IRET hack gets to #VC and we
> explode.  This is fixable by getting rid of the NMI EFLAGS.TF hack.

Not an issue in this patch-set, the confusion comes from the fact that I
left some parts of the single-step-over-iret code in the patch. But it
is not used. The NMI handling in this patch-set sends the NMI-complete
message before the IRET, when the kernel is still in a safe environment
(kernel stack, kernel cr3).

> 2. tools/testing/selftests/x86/mov_ss_trap_64.  User code does MOV
> (addr), SS; SYSCALL, where addr has a data breakpoint.  We get #DB
> promoted to #VC with no stack.

Also not an issue, as debugging is not supported at the moment in SEV-ES
guests (hardware has no way yet to save/restore the debug registers
across #VMEXITs). But this will change with future hardware. If you look
at the implementation for dr7 read/write events, you see that the dr7
value is cached and returned, but does not make it to the hardware dr7.

I though about using IST for the #VC handler, but the implications for
nesting #VC handlers made me decide against it. But for future hardware
that supports debugging inside SEV-ES guests it will be an issue. I'll
think about how to fix the problem, it probably has to be IST :(

Regards,

Joerg
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 41/70] x86/sev-es: Add Runtime #VC Exception Handler

2020-03-19 Thread Andy Lutomirski
On Thu, Mar 19, 2020 at 2:14 AM Joerg Roedel  wrote:
>
> From: Tom Lendacky 
>
> Add the handler for #VC exceptions invoked at runtime.

If I read this correctly, this does not use IST.  If that's true, I
don't see how this can possibly work.  There at least two nasty cases
that come to mind:

1. SYSCALL followed by NMI.  The NMI IRET hack gets to #VC and we
explode.  This is fixable by getting rid of the NMI EFLAGS.TF hack.

2. tools/testing/selftests/x86/mov_ss_trap_64.  User code does MOV
(addr), SS; SYSCALL, where addr has a data breakpoint.  We get #DB
promoted to #VC with no stack.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 41/70] x86/sev-es: Add Runtime #VC Exception Handler

2020-03-19 Thread Joerg Roedel
From: Tom Lendacky 

Add the handler for #VC exceptions invoked at runtime.

Signed-off-by: Tom Lendacky 
Signed-off-by: Joerg Roedel 
---
 arch/x86/entry/entry_64.S|  4 ++
 arch/x86/include/asm/traps.h |  7 
 arch/x86/kernel/idt.c|  4 +-
 arch/x86/kernel/sev-es.c | 77 +++-
 4 files changed, 90 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index f2bb91e87877..729876d368c5 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1210,6 +1210,10 @@ idtentry async_page_faultdo_async_page_fault 
has_error_code=1read_cr2=1
 idtentry machine_check do_mce  has_error_code=0
paranoid=1
 #endif
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+idtentry vmm_communication do_vmm_communicationhas_error_code=1
+#endif
+
 /*
  * Save all registers in pt_regs, and switch gs if needed.
  * Use slow, but surefire "are we in kernel?" check.
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 2aa786484bb1..1be25c065698 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -35,6 +35,9 @@ asmlinkage void alignment_check(void);
 #ifdef CONFIG_X86_MCE
 asmlinkage void machine_check(void);
 #endif /* CONFIG_X86_MCE */
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+asmlinkage void vmm_communication(void);
+#endif
 asmlinkage void simd_coprocessor_error(void);
 
 #if defined(CONFIG_X86_64) && defined(CONFIG_XEN_PV)
@@ -93,6 +96,10 @@ dotraplinkage void do_alignment_check(struct pt_regs *regs, 
long error_code);
 dotraplinkage void do_machine_check(struct pt_regs *regs, long error_code);
 #endif
 dotraplinkage void do_simd_coprocessor_error(struct pt_regs *regs, long 
error_code);
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+dotraplinkage void do_vmm_communication_error(struct pt_regs *regs,
+ long error_code);
+#endif
 #ifdef CONFIG_X86_32
 dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code);
 #endif
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index 135d208a2d38..25fa8ba70993 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -88,8 +88,10 @@ static const __initconst struct idt_data def_idts[] = {
 #ifdef CONFIG_X86_MCE
INTG(X86_TRAP_MC,   _check),
 #endif
-
SYSG(X86_TRAP_OF,   overflow),
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+   INTG(X86_TRAP_VC,   vmm_communication),
+#endif
 #if defined(CONFIG_IA32_EMULATION)
SYSG(IA32_SYSCALL_VECTOR,   entry_INT80_compat),
 #elif defined(CONFIG_X86_32)
diff --git a/arch/x86/kernel/sev-es.c b/arch/x86/kernel/sev-es.c
index 4bf5286310a0..97241d2f0f70 100644
--- a/arch/x86/kernel/sev-es.c
+++ b/arch/x86/kernel/sev-es.c
@@ -20,7 +20,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 /* For early boot hypervisor communication in SEV-ES enabled guests */
@@ -251,6 +251,81 @@ static enum es_result vc_handle_exitcode(struct es_em_ctxt 
*ctxt,
return result;
 }
 
+static void vc_forward_exception(struct es_em_ctxt *ctxt)
+{
+   long error_code = ctxt->fi.error_code;
+   int trapnr = ctxt->fi.vector;
+
+   ctxt->regs->orig_ax = ctxt->fi.error_code;
+
+   switch (trapnr) {
+   case X86_TRAP_GP:
+   do_general_protection(ctxt->regs, error_code);
+   break;
+   case X86_TRAP_UD:
+   do_invalid_op(ctxt->regs, 0);
+   break;
+   default:
+   BUG();
+   }
+}
+
+dotraplinkage void do_vmm_communication(struct pt_regs *regs, unsigned long 
exit_code)
+{
+   struct es_em_ctxt ctxt;
+   enum es_result result;
+   struct ghcb *ghcb;
+
+   /*
+* This is invoked through an interrupt gate, so IRQs are disabled. The
+* code below might walk page-tables for user or kernel addresses, so
+* keep the IRQs disabled to protect us against concurrent TLB flushes.
+*/
+
+   ghcb = (struct ghcb *)this_cpu_ptr(ghcb_page);
+
+   vc_ghcb_invalidate(ghcb);
+   result = vc_init_em_ctxt(, regs, exit_code);
+
+   if (result == ES_OK)
+   result = vc_handle_exitcode(, ghcb, exit_code);
+
+   /* Done - now check the result */
+   switch (result) {
+   case ES_OK:
+   vc_finish_insn();
+   break;
+   case ES_UNSUPPORTED:
+   pr_emerg("Unsupported exit-code 0x%02lx in early #VC exception 
(IP: 0x%lx)\n",
+exit_code, regs->ip);
+   goto fail;
+   case ES_VMM_ERROR:
+   pr_emerg("PANIC: Failure in communication with VMM (exit-code 
0x%02lx IP: 0x%lx)\n",
+exit_code, regs->ip);
+   goto fail;
+   case ES_DECODE_FAILED:
+   pr_emerg("PANIC: Failed to decode instruction (exit-code 
0x%02lx IP: 0x%lx)\n",
+exit_code, regs->ip);
+