On Wed, Oct 22, 2025 at 05:04:29PM +0800, Feng Yang wrote:
> On Wed, 15 Oct 2025 12:11:38 -0400 Steven Rostedt <[email protected]> wrote:
>
> > > > Hmm, we do have a way to retrieve the actual return caller from a
> > > > location
> > > > for return_to_handler:
> > > >
> > > > See kernel/trace/fgraph.c: ftrace_graph_get_ret_stack()
> > > >
> > > > Hmm, I think the x86 ORC unwinder needs to use this.
> > >
> > > I'm confused, is that not what ftrace_graph_ret_addr() already does?
>
> > Ah yeah, that does it too. I just searched for the first function that did
> > the look up ;-)
>
> > Now I guess the question is, why is this not working?
>
>
> I've also encountered this issue recently. It only outputs the stack trace of
> return_to_handler, for example:
>
> # bpftrace -e 'kretprobe:vfs_rea* {@[kstack]=count()}'
> Attaching 1 probe...
> ^C
>
> @[
> ksys_read+192
> get_perf_callchain+211
> bpf_get_stackid+101
> cleanup_module+303100
> kprobe_multi_link_prog_run+175
> fprobe_return+265
> __ftrace_return_to_handler.isra.0+433
> return_to_handler+30
> ]: 1
that looks messed up
>
> The return stack trace when directly executing
> samples/fprobe/fprobe_example.c is similar:
> [ 71.892353] return_to_handler: kernel_thread+0x71/0xa0
> [ 71.892356] sample_exit_handler: Return from <kernel_clone+0x4/0x470> ip =
> 0x000000000e0e2004 to rip = 0x00000000127e6d58 (kernel_thread+0x71/0xa0)
> [ 71.892361] __ftrace_return_to_handler.isra.0+0x1b1/0x280
> [ 71.892363] return_to_handler+0x1e/0x50
>
> No cases were found where the ret of the ftrace_graph_ret_addr function is
> equal to return_to_handler.
>
> Additionally, I noticed that when the x86 architecture executes
> perf_callchain_kernel, perf_hw_regs(regs) is false,
> and it calls unwind_start(&state, current, NULL, (void *)regs->sp);
> which then proceeds to __unwind_start where the check task == current is
> performed.
> However, the ARM architecture executes kunwind_init_from_regs(&state, regs);
> instead of taking the second branch with the task == current check.
>
> I hope these phenomena can help you analyze the cause of this issue.
> Thanks.
>
thanks for the report.. so the above is from arm?
yes, x86_64 starts with:
unwind_start(&state, current, NULL, (void *)regs->sp);
I seem to get reasonable stack traces on x86 with the change below,
which just initializes the fields in regs that are used later on and sets
up the stack so that the ftrace_graph_ret_addr code is triggered during
unwind, but I'm not familiar with this code. Masami, Josh, any idea?
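For anyone following along, the lookup the unwinder depends on can be
modelled in userspace roughly as below. This is only a toy sketch: every
toy_* name and the TOY_RETURN_TO_HANDLER constant are made up for
illustration, and the real ftrace_graph_ret_addr() in
kernel/trace/fgraph.c differs in detail. It only shows the idea that a
return address equal to the trampoline gets swapped back for the saved
original, keyed by the stack slot it came from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the address of return_to_handler. */
#define TOY_RETURN_TO_HANDLER ((uintptr_t)0xdead0000u)
#define TOY_MAX_DEPTH 8

/* One shadow-stack entry: the original return address and its stack slot. */
struct toy_ret_entry {
	uintptr_t ret;    /* original return address */
	uintptr_t *retp;  /* stack slot that was overwritten */
};

/* Hypothetical per-task state: a small shadow stack of hooked returns. */
struct toy_task {
	struct toy_ret_entry ret_stack[TOY_MAX_DEPTH];
	int depth;        /* number of valid entries */
};

/*
 * Function entry: remember the real return address, then redirect the
 * stack slot to the trampoline. Returns 0 on success, -1 if full.
 */
static int toy_hook_return(struct toy_task *t, uintptr_t *retp)
{
	assert(retp != NULL);
	if (t->depth >= TOY_MAX_DEPTH)
		return -1;
	t->ret_stack[t->depth].ret = *retp;
	t->ret_stack[t->depth].retp = retp;
	t->depth++;
	*retp = TOY_RETURN_TO_HANDLER;
	return 0;
}

/*
 * Unwind time: if the address read from the stack is the trampoline,
 * return the saved original for that slot; otherwise return it unchanged.
 * If no entry matches the slot, the trampoline address leaks through.
 */
static uintptr_t toy_ret_addr(const struct toy_task *t, uintptr_t ret,
			      const uintptr_t *retp)
{
	if (ret != TOY_RETURN_TO_HANDLER)
		return ret;
	for (int i = t->depth - 1; i >= 0; i--) {
		if (t->ret_stack[i].retp == retp)
			return t->ret_stack[i].ret;
	}
	return ret;
}
```

In this model the last case in toy_ret_addr() is the interesting one: when
the unwinder never visits the slot that holds the trampoline address, or
visits it with a slot pointer that matches no entry, the original caller
is never recovered.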
thanks,
jirka
---
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 367da3638167..2d2bb8c37b56 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -353,6 +353,8 @@ STACK_FRAME_NON_STANDARD_FP(__fentry__)
SYM_CODE_START(return_to_handler)
UNWIND_HINT_UNDEFINED
ANNOTATE_NOENDBR
+ push $return_to_handler
+ UNWIND_HINT_FUNC
/* Save ftrace_regs for function exit context */
subq $(FRAME_SIZE), %rsp
@@ -360,6 +362,9 @@ SYM_CODE_START(return_to_handler)
movq %rax, RAX(%rsp)
movq %rdx, RDX(%rsp)
movq %rbp, RBP(%rsp)
+ movq %rsp, RSP(%rsp)
+ movq $0, EFLAGS(%rsp)
+ movq $__KERNEL_CS, CS(%rsp)
movq %rsp, %rdi
call ftrace_return_to_handler
@@ -368,7 +373,8 @@ SYM_CODE_START(return_to_handler)
movq RDX(%rsp), %rdx
movq RAX(%rsp), %rax
- addq $(FRAME_SIZE), %rsp
+ addq $(FRAME_SIZE) + 8, %rsp
+
/*
* Jump back to the old return address. This cannot be JMP_NOSPEC rdi
* since IBT would demand that contain ENDBR, which simply isn't so for