Re: [PATCH v2 3/8] x86/enter: Use IBRS on syscall and interrupts
On Fri, Jan 05, 2018 at 06:12:18PM -0800, Tim Chen wrote:
> From: Andrea Arcangeli
>
> Set IBRS upon kernel entrance via syscall and interrupts. Clear it
> upon exit. IBRS protects against unsafe indirect branching predictions
> in the kernel.
>
> The NMI interrupt save/restore of IBRS state was based on Andrea
> Arcangeli's implementation.
> Here's an explanation by Dave Hansen on why we save IBRS state for NMI.
>
> The normal interrupt code uses the 'error_entry' path which uses the
> Code Segment (CS) of the instruction that was interrupted to tell
> whether it interrupted the kernel or userspace and thus has to switch
> IBRS, or leave it alone.
>
> The NMI code is different. It uses 'paranoid_entry' because it can
> interrupt the kernel while it is running with a userspace IBRS (and %GS
> and CR3) value, but has a kernel CS. If we used the same approach as
> the normal interrupt code, we might do the following:
>
> 	SYSENTER_entry
> 	<-- NMI HERE
> 	IBRS=1
> 	do_something()
> 	IBRS=0
> 	SYSRET
>
> The NMI code might notice that we are running in the kernel and decide
> that it is OK to skip the IBRS=1. This would leave it running
> unprotected with IBRS=0, which is bad.
>
> However, if we unconditionally set IBRS=1 in the NMI, we might get the
> following case:
>
> 	SYSENTER_entry
> 	IBRS=1
> 	do_something()
> 	IBRS=0
> 	<-- NMI HERE (set IBRS=1)
> 	SYSRET
>
> and we would return to userspace with IBRS=1. Userspace would run
> slowly until we entered and exited the kernel again.
>
> Instead of those two approaches, we chose a third one where we simply
> save the IBRS value in a scratch register (%r13) and then restore that
> value, verbatim.

That's one helluva commit message. This is how you write commit messages!

> Signed-off-by: Andrea Arcangeli
> Signed-off-by: Tim Chen
> ---
>  arch/x86/entry/entry_64.S        | 23 +++
>  arch/x86/entry/entry_64_compat.S |  8
>  2 files changed, 31 insertions(+)

Reviewed-by: Borislav Petkov

-- 
Regards/Gruss,
    Boris.
[PATCH v2 3/8] x86/enter: Use IBRS on syscall and interrupts
From: Andrea Arcangeli

Set IBRS upon kernel entrance via syscall and interrupts. Clear it
upon exit. IBRS protects against unsafe indirect branching predictions
in the kernel.

The NMI interrupt save/restore of IBRS state was based on Andrea
Arcangeli's implementation.
Here's an explanation by Dave Hansen on why we save IBRS state for NMI.

The normal interrupt code uses the 'error_entry' path which uses the
Code Segment (CS) of the instruction that was interrupted to tell
whether it interrupted the kernel or userspace and thus has to switch
IBRS, or leave it alone.

The NMI code is different. It uses 'paranoid_entry' because it can
interrupt the kernel while it is running with a userspace IBRS (and %GS
and CR3) value, but has a kernel CS. If we used the same approach as
the normal interrupt code, we might do the following:

	SYSENTER_entry
	<-- NMI HERE
	IBRS=1
	do_something()
	IBRS=0
	SYSRET

The NMI code might notice that we are running in the kernel and decide
that it is OK to skip the IBRS=1. This would leave it running
unprotected with IBRS=0, which is bad.

However, if we unconditionally set IBRS=1 in the NMI, we might get the
following case:

	SYSENTER_entry
	IBRS=1
	do_something()
	IBRS=0
	<-- NMI HERE (set IBRS=1)
	SYSRET

and we would return to userspace with IBRS=1. Userspace would run
slowly until we entered and exited the kernel again.

Instead of those two approaches, we chose a third one where we simply
save the IBRS value in a scratch register (%r13) and then restore that
value, verbatim.
Signed-off-by: Andrea Arcangeli
Signed-off-by: Tim Chen
---
 arch/x86/entry/entry_64.S        | 23 +++
 arch/x86/entry/entry_64_compat.S |  8
 2 files changed, 31 insertions(+)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index f048e38..a4031c9 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -174,6 +174,8 @@ ENTRY(entry_SYSCALL_64_trampoline)
 	/* Load the top of the task stack into RSP */
 	movq	CPU_ENTRY_AREA_tss + TSS_sp1 + CPU_ENTRY_AREA, %rsp
+	/* Stack is usable, use the non-clobbering IBRS enable: */
+	ENABLE_IBRS
 
 	/* Start building the simulated IRET frame. */
 	pushq	$__USER_DS			/* pt_regs->ss */
@@ -217,6 +219,8 @@ ENTRY(entry_SYSCALL_64)
 	 */
 	movq	%rsp, PER_CPU_VAR(rsp_scratch)
 	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
+	/* Stack is usable, use the non-clobbering IBRS enable: */
+	ENABLE_IBRS
 
 	/* Construct struct pt_regs on stack */
 	pushq	$__USER_DS			/* pt_regs->ss */
@@ -411,6 +415,7 @@ syscall_return_via_sysret:
 	 * We are on the trampoline stack. All regs except RDI are live.
 	 * We can do future final exit work right here.
 	 */
+	DISABLE_IBRS
 	SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
 
 	popq	%rdi
@@ -749,6 +754,7 @@ GLOBAL(swapgs_restore_regs_and_return_to_usermode)
 	 * We can do future final exit work right here.
 	 */
 
+	DISABLE_IBRS
 	SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
 
 	/* Restore RDI. */
@@ -836,6 +842,14 @@ native_irq_return_ldt:
 	SWAPGS					/* to kernel GS */
 	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi	/* to kernel CR3 */
 
+	/*
+	 * Normally we enable IBRS when we switch to kernel's CR3.
+	 * But we are going to switch back to user CR3 immediately
+	 * in this routine after fixing ESPFIX stack. There is
+	 * no vulnerable code branching for IBRS to protect.
+	 * We don't toggle IBRS to avoid the cost of two MSR writes.
+	 */
+
 	movq	PER_CPU_VAR(espfix_waddr), %rdi
 	movq	%rax, (0*8)(%rdi)		/* user RAX */
 	movq	(1*8)(%rsp), %rax		/* user RIP */
@@ -969,6 +983,8 @@ ENTRY(switch_to_thread_stack)
 	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
 	movq	%rsp, %rdi
 	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
+	/* Stack is usable, use the non-clobbering IBRS enable: */
+	ENABLE_IBRS
 	UNWIND_HINT sp_offset=16 sp_reg=ORC_REG_DI
 
 	pushq	7*8(%rdi)		/* regs->ss */
@@ -1271,6 +1287,7 @@ ENTRY(paranoid_entry)
 
 1:
 	SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%rax save_reg=%r14
+	ENABLE_IBRS_SAVE_AND_CLOBBER save_reg=%r13d
 
 	ret
 END(paranoid_entry)
@@ -1294,6 +1311,7 @@ ENTRY(paranoid_exit)
 	testl	%ebx, %ebx			/* swapgs needed? */
 	jnz	.Lparanoid_exit_no_swapgs
 	TRACE_IRQS_IRETQ
+	RESTORE_IBRS_CLOBBER save_reg=%r13d
 	RESTORE_CR3	scratch_reg=%rbx save_reg=%r14
 	SWAPGS_UNSAFE_STACK
 	jmp	.Lparanoid_exit_restore
@@ -1324,6 +1342,7 @@ ENTRY(error_entry)
 	SWAPGS