Matt,

This bug looks very similar to what you were hitting with the function
profiler. Can you apply this patch and see if it fixes the issue for
you.

Thanks!

-- Steve


On Fri, 13 May 2016 22:53:43 +0900
Namhyung Kim <[email protected]> wrote:

> On my system, simply enabling and disabling function graph tracer can
> crash the kernel.  I don't know how it worked until now.
> 
> The ftrace_disable_ftrace_graph_caller() modifies jmp instruction at
> ftrace_graph_call assuming it's a 5 bytes near jmp (e9 <offset>).
> However it's a short jmp consisting of 2 bytes only (eb <offset>).  And
> ftrace_stub() is located just below the ftrace_graph_caller so
> modification above breaks the instruction resulting in kernel oops on
> the ftrace_stub() with the invalid opcode like below:
> 
>   # cd /sys/kernel/trace
>   # echo function_graph > current_tracer
>   # echo nop > current_tracer
> 
>   [   78.122055] invalid opcode: 0000 [#1] SMP
>   [   78.125125] Modules linked in: x86_pkg_temp_thermal kvm_intel kvm 
> irqbypass crc32c_intel pcspkr iwldvm iwlwifi
>   [   78.128241] CPU: 2 PID: 17 Comm: migration/2 Not tainted 4.6.0-rc4+ #36
>   [   78.131310] Hardware name: LENOVO 4286A74/4286A74, BIOS 8DET56WW (1.26 ) 
> 12/01/2011
>   [   78.134369] task: ffff88040bec4240 ti: ffff88040bee4000 task.ti: 
> ffff88040bee4000
>   [   78.137412] RIP: 0010:[<ffffffff818939a8>]  [<ffffffff818939a8>] 
> ftrace_stub+0x0/0x8
>   [   78.140477] RSP: 0018:ffff88040bee7e48  EFLAGS: 00010246
>   [   78.143489] RAX: ffff88040bec4240 RBX: ffffffff8107a7b0 RCX: 
> 000000000000001f
>   [   78.146512] RDX: 0000000000000000 RSI: ffff88041e2929d8 RDI: 
> ffff88040bee7e50
>   [   78.149581] RBP: ffff88040bee7e80 R08: ffff88040bee4000 R09: 
> 0000000000000000
>   [   78.152647] R10: 00000000000318b7 R11: ffff8800d661f800 R12: 
> ffff88040d8011b0
>   [   78.155679] R13: ffffffff81e43620 R14: ffff88040bda8588 R15: 
> ffffffff81e503e0
>   [   78.158675] FS:  0000000000000000(0000) GS:ffff88041e280000(0000) 
> knlGS:0000000000000000
>   [   78.161699] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   [   78.164690] CR2: 00007fadb22dde1d CR3: 00000000d5ce2000 CR4: 
> 00000000000406e0
>   [   78.167691] Stack:
>   [   78.170658]  ffffffff8110b3ee ffffffff8188de90 00000009161d55f6 
> 00000012306fc2e4
>   [   78.173710]  0000000000000000 ffff880400000000 ffff88040bec4240 
> ffff88040bee7ec8
>   [   78.176783]  ffffffff81893bbd 0000000000000000 ffff88040bec4240 
> ffffffff81893ba8
>   [   78.179863] Call Trace:
>   [   78.182853]  [<ffffffff8110b3ee>] ? ftrace_return_to_handler+0x8e/0x100
>   [   78.185909]  [<ffffffff8188de90>] ? __schedule+0xae0/0xae0
>   [   78.188941]  [<ffffffff81893bbd>] return_to_handler+0x15/0x27
>   [   78.192001]  [<ffffffff81893ba8>] ? ftrace_graph_caller+0xa8/0xa8
>   [   78.195091]  [<ffffffff8107a6f0>] ? sort_range+0x30/0x30
>   [   78.198138]  [<ffffffff810778a9>] kthread+0xc9/0xe0
>   [   78.201143]  [<ffffffff81891a12>] ret_from_fork+0x22/0x40
>   [   78.204138]  [<ffffffff810777e0>] ? kthread_worker_fn+0x170/0x170
>   [   78.207129] Code: 8b 44 24 48 48 8b 7c 24 70 48 8b 74 24 68 48 8b 54 24 
> 60
>                        48 8b 4c 24 58 48 8b 44 24 50 48 8b 6c 24 20 48 81 c4 
> d0
>                        00 00 00 e9 fd <ff> ff ff 80 00 00 00 00 9c 55 ff 74 
> 24 18
>                        55 48 89 e5 ff 74 24
>   [   78.213997] RIP  [<ffffffff818939a8>] ftrace_stub+0x0/0x8
>   [   78.217190]  RSP <ffff88040bee7e48>
>   [   78.220374] ---[ end trace 0af2f0d9f7301011 ]---
> 
> Looking at the code dump, the ftrace_stub instruction was overwritten as
> <ff>.  Below is disassembly output of related code.
> 
>   $ objdump -d --start-address=0xffffffff818939a6 
> --stop-address=0xffffffff818939b0 vmlinux
> 
>   vmlinux:     file format elf64-x86-64
> 
>   Disassembly of section .text:
> 
>   ffffffff818939a6 <ftrace_epilogue>:
>   ffffffff818939a6:   eb 00                   jmp    ffffffff818939a8 
> <ftrace_stub>
> 
>   ffffffff818939a8 <ftrace_stub>:
>   ffffffff818939a8:   c3                      retq
>   ffffffff818939a9:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
> 
> As you can see ftrace_epilogue (same as ftrace_graph_caller) is 2 byte
> ahead of ftrace_stub.  And it's replaced by a jump to ftrace_stub() by
> ftrace_disable_ftrace_graph_caller: "e9 <0xfffffffd>".  Pads 3 bytes
> after ftrace_epilogue to prevent ftrace_stub from being overwritten.
> 
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
>  arch/x86/kernel/mcount_64.S | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
> index ed48a9f465f8..0e6af57a713a 100644
> --- a/arch/x86/kernel/mcount_64.S
> +++ b/arch/x86/kernel/mcount_64.S
> @@ -180,6 +180,15 @@ GLOBAL(ftrace_epilogue)
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  GLOBAL(ftrace_graph_call)
>       jmp ftrace_stub
> +     /*
> +      * The above jmp is generated as a short jump which occupies 2 bytes
> +      * but ftrace_enable/disable_ftrace_graph_caller() assumes it's a
> +      * near jump which occupies 5 bytes so breaks ftrace_stub() below.
> +      * Add 3 bytes padding to avoid that.
> +      */
> +     nop
> +     nop
> +     nop
>  #endif
>  
>  GLOBAL(ftrace_stub)

Reply via email to