* Ingo Molnar <mi...@kernel.org> wrote:

> * David Woodhouse <dw...@infradead.org> wrote:
> 
> > But wait, why did I say "mostly"? Well, not everyone has a retpoline
> > compiler yet... but OK, screw them; they need to update.
> > 
> > Then there's Skylake, and that generation of CPU cores. For complicated
> > reasons they actually end up being vulnerable not just on indirect
> > branches, but also on a 'ret' in some circumstances (such as 16+ CALLs
> > in a deep chain).
> > 
> > The IBRS solution, ugly though it is, did address that. Retpoline
> > doesn't. There are patches being floated to detect and prevent deep
> > stacks, and deal with some of the other special cases that bite on SKL,
> > but those are icky too. And in fact IBRS performance isn't anywhere
> > near as bad on this generation of CPUs as it is on earlier CPUs
> > *anyway*, which makes it not quite so insane to *contemplate* using it
> > as Intel proposed.
> 
> There's another possible method to avoid deep stacks on Skylake, without
> compiler support:
> 
>   - Use the existing mcount based function tracing live patching machinery
>     (CONFIG_FUNCTION_TRACER=y) to install a _very_ fast and simple stack
>     depth tracking tracer which would issue a retpoline when stack depth
>     crosses boundaries of ~16 entries.

The patch below demonstrates the principle: it forcibly enables dynamic ftrace
patching (CONFIG_DYNAMIC_FTRACE=y et al) and turns mcount/__fentry__ into a RET:

  ffffffff81a01a40 <__fentry__>:
  ffffffff81a01a40:       c3                      retq   

This would have to be extended with (very simple) call stack depth tracking
(just 3 more instructions would do in the fast path, I believe) and a suitable
Skylake workaround, and it also has to play nice with the ftrace callbacks.
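
To make the depth tracking idea concrete, here is a rough user-space model in
C. It is only a sketch under stated assumptions, not the proposed kernel code:
the names (depth_tracker, track_enter, track_exit, mitigate, simulate_chain)
are hypothetical, the threshold of 16 is taken from the "16+ CALLs" remark
above, and mitigate() merely counts invocations where a real Skylake
workaround would go. How returns would be accounted for is not specified
here, so the model simply decrements on function exit.

```c
#include <assert.h>

/* Assumption: threshold taken from the "16+ CALLs" remark in this thread. */
#define DEPTH_LIMIT 16

/* Hypothetical per-CPU (or per-task) state for the model. */
struct depth_tracker {
    unsigned int depth;       /* calls currently outstanding */
    unsigned int mitigations; /* how often the workaround hook fired */
};

/* Hypothetical stand-in for the Skylake workaround; it only counts. */
static void mitigate(struct depth_tracker *t)
{
    t->mitigations++;
}

/* Models the fast path at function entry: bump, compare, rarely branch. */
static void track_enter(struct depth_tracker *t)
{
    t->depth++;
    if (t->depth % DEPTH_LIMIT == 0)  /* crossed a 16-entry boundary */
        mitigate(t);
}

/* Assumption: something decrements the counter when a function returns. */
static void track_exit(struct depth_tracker *t)
{
    assert(t->depth > 0);
    t->depth--;
}

/* Simulate a call chain n deep, unwind it, and report how often
 * the workaround hook fired on the way down. */
static unsigned int simulate_chain(unsigned int n)
{
    struct depth_tracker t = { 0, 0 };
    unsigned int i;

    for (i = 0; i < n; i++)
        track_enter(&t);
    for (i = 0; i < n; i++)
        track_exit(&t);

    return t.mitigations;
}
```

In this model a chain shallower than 16 calls never reaches mitigate(), so
the common case pays only for the increment and the compare.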

On non-Skylake the overhead would be 0 cycles.

On Skylake this would add an overhead of maybe 2-3 cycles per function call,
and obviously all this code and data would be very cache hot. Given that the
average number of function calls per system call is around a dozen (so roughly
24-36 cycles per system call), this would be _much_ faster than any
microcode/MSR based approach.

Is there a testcase for the Skylake 16-deep-call-stack problem that I could
run? Is there a description of the exact speculative execution vulnerability
that has to be addressed to begin with?

If this approach is workable I'd much prefer it to any MSR writes in the
syscall entry path, not just because it's fast enough in practice to not be
turned off by everyone, but also because everyone would agree that per function
call overhead needs to go away on new CPUs. Both deployment and backporting are
also _much_ more flexible, simpler, faster and more complete than with
microcode/firmware or compiler based solutions.

Assuming the vulnerability can be addressed via this route, that is, which is
a big assumption!

Thanks,

        Ingo

 arch/x86/Kconfig            | 3 +++
 arch/x86/kernel/ftrace_64.S | 1 +
 2 files changed, 4 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 423e4b64e683..df471538a79c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -133,6 +133,8 @@ config X86
        select HAVE_DMA_CONTIGUOUS
        select HAVE_DYNAMIC_FTRACE
        select HAVE_DYNAMIC_FTRACE_WITH_REGS
+       select DYNAMIC_FTRACE
+       select DYNAMIC_FTRACE_WITH_REGS
        select HAVE_EBPF_JIT                    if X86_64
        select HAVE_EFFICIENT_UNALIGNED_ACCESS
        select HAVE_EXIT_THREAD
@@ -140,6 +142,7 @@ config X86
        select HAVE_FTRACE_MCOUNT_RECORD
        select HAVE_FUNCTION_GRAPH_TRACER
        select HAVE_FUNCTION_TRACER
+       select FUNCTION_TRACER
        select HAVE_GCC_PLUGINS
        select HAVE_HW_BREAKPOINT
        select HAVE_IDE
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 7cb8ba08beb9..1e219e0f2887 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -19,6 +19,7 @@ EXPORT_SYMBOL(__fentry__)
 # define function_hook mcount
 EXPORT_SYMBOL(mcount)
 #endif
+       ret
 
 /* All cases save the original rbp (8 bytes) */
 #ifdef CONFIG_FRAME_POINTER
