On Sun, May 15, 2016 at 09:41:41AM +0900, Masami Hiramatsu wrote:

Hi Masami,

> Hi Namhyung,
> 
> I'm interested in this problem, and it seems compiling environment
> related or kconfig related problem.
> If you can reproduce this kernel, would you share what the "AS"
> commandline shows? That can be done as below;
> 
>  $ make V=1 arch/x86/kernel/mcount_64.o | grep mcount_64

  gcc -Wp,-MD,arch/x86/kernel/.mcount_64.o.d  -nostdinc -isystem 
/usr/lib/gcc/x86_64-unknown-linux-gnu/5.3.0/include 
-I/home/namhyung/project/linux/arch/x86/include 
-Iarch/x86/include/generated/uapi -Iarch/x86/include/generated  
-I/home/namhyung/project/linux/include -Iinclude 
-I/home/namhyung/project/linux/arch/x86/include/uapi 
-Iarch/x86/include/generated/uapi -I/home/namhyung/project/linux/include/uapi 
-Iinclude/generated/uapi -include 
/home/namhyung/project/linux/include/linux/kconfig.h -D__KERNEL__ 
-D__ASSEMBLY__ -m64 -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 
-DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 
-DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -DCONFIG_AS_SHA1_NI=1 
-DCONFIG_AS_SHA256_NI=1 -mfentry -DCC_USING_FENTRY -DCC_HAVE_ASM_GOTO   -c -o 
arch/x86/kernel/mcount_64.o 
/home/namhyung/project/linux/arch/x86/kernel/mcount_64.S


  $ gcc --version
  gcc (GCC) 5.3.0
  Copyright (C) 2015 Free Software Foundation, Inc.
  This is free software; see the source for copying conditions.  There is NO
  warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

  $ as --version
  GNU assembler (GNU Binutils) 2.26.0.20160302
  Copyright (C) 2015 Free Software Foundation, Inc.
  This program is free software; you may redistribute it under the terms of
  the GNU General Public License version 3 or later.
  This program has absolutely no warranty.
  This assembler was configured for a target of `x86_64-pc-linux-gnu'.


Thanks,
Namhyung

> 
> Thank you,
> 
> On Fri, 13 May 2016 22:53:43 +0900
> Namhyung Kim <[email protected]> wrote:
> 
> > On my system, simply enabling and disabling function graph tracer can
> > crash the kernel.  I don't know how it worked until now.
> > 
> > The ftrace_disable_ftrace_graph_caller() modifies jmp instruction at
> > ftrace_graph_call assuming it's a 5 bytes near jmp (e9 <offset>).
> > However it's a short jmp consisting of 2 bytes only (eb <offset>).  And
> > ftrace_stub() is located just below the ftrace_graph_caller so
> > modification above breaks the instruction resulting in kernel oops on
> > the ftrace_stub() with the invalid opcode like below:
> > 
> >   # cd /sys/kernel/trace
> >   # echo function_graph > current_tracer
> >   # echo nop > current_tracer
> > 
> >   [   78.122055] invalid opcode: 0000 [#1] SMP
> >   [   78.125125] Modules linked in: x86_pkg_temp_thermal kvm_intel kvm 
> > irqbypass crc32c_intel pcspkr iwldvm iwlwifi
> >   [   78.128241] CPU: 2 PID: 17 Comm: migration/2 Not tainted 4.6.0-rc4+ #36
> >   [   78.131310] Hardware name: LENOVO 4286A74/4286A74, BIOS 8DET56WW (1.26 
> > ) 12/01/2011
> >   [   78.134369] task: ffff88040bec4240 ti: ffff88040bee4000 task.ti: 
> > ffff88040bee4000
> >   [   78.137412] RIP: 0010:[<ffffffff818939a8>]  [<ffffffff818939a8>] 
> > ftrace_stub+0x0/0x8
> >   [   78.140477] RSP: 0018:ffff88040bee7e48  EFLAGS: 00010246
> >   [   78.143489] RAX: ffff88040bec4240 RBX: ffffffff8107a7b0 RCX: 
> > 000000000000001f
> >   [   78.146512] RDX: 0000000000000000 RSI: ffff88041e2929d8 RDI: 
> > ffff88040bee7e50
> >   [   78.149581] RBP: ffff88040bee7e80 R08: ffff88040bee4000 R09: 
> > 0000000000000000
> >   [   78.152647] R10: 00000000000318b7 R11: ffff8800d661f800 R12: 
> > ffff88040d8011b0
> >   [   78.155679] R13: ffffffff81e43620 R14: ffff88040bda8588 R15: 
> > ffffffff81e503e0
> >   [   78.158675] FS:  0000000000000000(0000) GS:ffff88041e280000(0000) 
> > knlGS:0000000000000000
> >   [   78.161699] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >   [   78.164690] CR2: 00007fadb22dde1d CR3: 00000000d5ce2000 CR4: 
> > 00000000000406e0
> >   [   78.167691] Stack:
> >   [   78.170658]  ffffffff8110b3ee ffffffff8188de90 00000009161d55f6 
> > 00000012306fc2e4
> >   [   78.173710]  0000000000000000 ffff880400000000 ffff88040bec4240 
> > ffff88040bee7ec8
> >   [   78.176783]  ffffffff81893bbd 0000000000000000 ffff88040bec4240 
> > ffffffff81893ba8
> >   [   78.179863] Call Trace:
> >   [   78.182853]  [<ffffffff8110b3ee>] ? ftrace_return_to_handler+0x8e/0x100
> >   [   78.185909]  [<ffffffff8188de90>] ? __schedule+0xae0/0xae0
> >   [   78.188941]  [<ffffffff81893bbd>] return_to_handler+0x15/0x27
> >   [   78.192001]  [<ffffffff81893ba8>] ? ftrace_graph_caller+0xa8/0xa8
> >   [   78.195091]  [<ffffffff8107a6f0>] ? sort_range+0x30/0x30
> >   [   78.198138]  [<ffffffff810778a9>] kthread+0xc9/0xe0
> >   [   78.201143]  [<ffffffff81891a12>] ret_from_fork+0x22/0x40
> >   [   78.204138]  [<ffffffff810777e0>] ? kthread_worker_fn+0x170/0x170
> >   [   78.207129] Code: 8b 44 24 48 48 8b 7c 24 70 48 8b 74 24 68 48 8b 54 
> > 24 60
> >                        48 8b 4c 24 58 48 8b 44 24 50 48 8b 6c 24 20 48 81 
> > c4 d0
> >                        00 00 00 e9 fd <ff> ff ff 80 00 00 00 00 9c 55 ff 74 
> > 24 18
> >                        55 48 89 e5 ff 74 24
> >   [   78.213997] RIP  [<ffffffff818939a8>] ftrace_stub+0x0/0x8
> >   [   78.217190]  RSP <ffff88040bee7e48>
> >   [   78.220374] ---[ end trace 0af2f0d9f7301011 ]---
> > 
> > Looking at the code dump, the ftrace_stub instruction was overwritten as
> > <ff>.  Below is disassembly output of related code.
> > 
> >   $ objdump -d --start-address=0xffffffff818939a6 
> > --stop-address=0xffffffff818939b0 vmlinux
> > 
> >   vmlinux:     file format elf64-x86-64
> > 
> >   Disassembly of section .text:
> > 
> >   ffffffff818939a6 <ftrace_epilogue>:
> >   ffffffff818939a6: eb 00                   jmp    ffffffff818939a8 
> > <ftrace_stub>
> > 
> >   ffffffff818939a8 <ftrace_stub>:
> >   ffffffff818939a8: c3                      retq
> >   ffffffff818939a9: 0f 1f 80 00 00 00 00    nopl   0x0(%rax)
> > 
> > As you can see ftrace_epilogue (same as ftrace_graph_caller) is 2 byte
> > ahead of ftrace_stub.  And it's replaced by a jump to ftrace_stub() by
> > ftrace_disable_ftrace_graph_caller: "e9 <0xfffffffd>".  Pads 3 bytes
> > after ftrace_epilogue to prevent ftrace_stub from being overwritten.
> > 
> > Signed-off-by: Namhyung Kim <[email protected]>
> > ---
> >  arch/x86/kernel/mcount_64.S | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
> > index ed48a9f465f8..0e6af57a713a 100644
> > --- a/arch/x86/kernel/mcount_64.S
> > +++ b/arch/x86/kernel/mcount_64.S
> > @@ -180,6 +180,15 @@ GLOBAL(ftrace_epilogue)
> >  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> >  GLOBAL(ftrace_graph_call)
> >     jmp ftrace_stub
> > +   /*
> > +    * The above jmp is generated as a short jump which occupies 2 bytes
> > +    * but ftrace_enable/disable_ftrace_graph_caller() assumes it's a
> > +    * near jump which occupies 5 bytes so breaks ftrace_stub() below.
> > +    * Add 3 bytes padding to avoid that.
> > +    */
> > +   nop
> > +   nop
> > +   nop
> >  #endif
> >  
> >  GLOBAL(ftrace_stub)
> > -- 
> > 2.8.0
> > 
> 
> 
> -- 
> Masami Hiramatsu <[email protected]>

Reply via email to