On Mon, Jan 08, 2024 at 12:25:55PM +0000, Mark Rutland wrote:
> We also have HAVE_FUNCTION_GRAPH_RET_ADDR_PTR, but since the return address is
> not on the stack at the point function-entry is intercepted we use the FP as
> the retp value -- in the absence of tail calls this will be different between a
> caller and callee.

Ah; I just spotted that this patch changed that in ftrace_graph_func(), which
is the source of the bug. 

As of this patch, we use the address of fregs->lr as the retp value, but the
unwinder still uses the FP value. So when unwind_recover_return_address() calls
ftrace_graph_ret_addr(), the retp value doesn't match the one recorded on the
fgraph ret_stack, and the lookup fails to find the expected entry.

Since the ftrace_regs only exist transiently during function entry/exit, it's
possible for a stackframe to reuse that same address on the stack, which would
result in finding a different entry by mistake.
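
To illustrate the mismatch, here is a toy userspace model of the lookup. It is
only a sketch: toy_entry, toy_ret_addr() and TOY_TRAMPOLINE are made-up names,
and the real ftrace_graph_ret_addr() and ret_stack layout differ in detail. The
only point is that the key recorded at function entry has to be the same key
the unwinder passes in later.

```c
/* Hypothetical, simplified stand-in for an fgraph ret_stack entry. */
struct toy_entry {
	unsigned long ret;	/* original return address */
	unsigned long *retp;	/* key recorded at function entry */
};

/* Placeholder for the address of return_to_handler. */
#define TOY_TRAMPOLINE 0xdeadUL

/*
 * Loosely modelled on the retp-matching lookup: if 'addr' is the
 * trampoline, walk the entries from newest to oldest and return the
 * original return address of the entry whose recorded retp equals the
 * retp supplied by the unwinder. If nothing matches, 'addr' is
 * returned unchanged, i.e. the unwinder still sees the trampoline.
 */
static unsigned long toy_ret_addr(const struct toy_entry *stack, int depth,
				  unsigned long addr, unsigned long *retp)
{
	if (addr != TOY_TRAMPOLINE)
		return addr;

	for (int i = depth - 1; i >= 0; i--) {
		if (stack[i].retp == retp)
			return stack[i].ret;
	}

	return addr;
}
```

If an entry is recorded with one pointer (standing in for &fregs->lr) and
toy_ret_addr() is later called with a different one (standing in for the FP
value), the trampoline address comes back unchanged. Calling it with the same
pointer that was recorded recovers the original return address.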

The diff below restores the existing behaviour and fixes the issue for me.
Could you please fold that into this patch?

On a separate note, looking at how this patch changed arm64's
ftrace_graph_func(), do we need similar changes to arm64's
prepare_ftrace_return() for the old-style mcount based ftrace?

Mark.

---->8----
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 205937e04ece..329092ce06ba 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -495,7 +495,7 @@ void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
        if (bit < 0)
                return;
 
-       if (!function_graph_enter_ops(*parent, ip, fregs->fp, parent, gops))
+       if (!function_graph_enter_ops(*parent, ip, fregs->fp, (void *)fregs->fp, gops))
                *parent = (unsigned long)&return_to_handler;
 
        ftrace_test_recursion_unlock(bit);
