On Tue, May 12, 2026 at 10:55:28AM +0200, Jens Remus wrote:
> On 5/12/2026 5:00 AM, Dylan Hatch wrote:
> > On Fri, May 1, 2026 at 9:46 AM Mark Rutland <[email protected]>
> > wrote:
> 
> >> (1) For correctness, we'll need to address a latent issue with
> >> unwinding across an fgraph return trampoline, where the return
> >> address is transiently unrecoverable.

> >> I think we can solve that with some restructuring of that code, 
> >> restoring the original address *before* removing that from the 
> >> fgraph return stack, and ensuring that the unwinder can find it.
> > 
> > If my understanding is correct, the issue arises in
> > return_to_handler as the return address is recovered:
> > 
> >     mov x0, sp
> >     bl  ftrace_return_to_handler    // addr = ftrace_return_to_hander(fregs);
> >     mov x30, x0                     // restore the original return address
> > 
> > Because ftrace_return_to_handler pops the return address from the 
> > return stack before it can be restored into the LR, it cannot be 
> > recovered.
> 
> Based on reliable-stacktrace.rst section "4.4 Rewriting of return
> addresses" I wonder whether the following might work:
> 
> - If an unwound RA points at return_to_handler the actual RA needs to
>   be obtained using ftrace_graph_ret_addr().  This might already be
>   taken into account if ftrace_graph_ret_addr() is used unconditionally.
> 
> - If an unwound RA points into return_to_handler() mark the stack trace
>   as unreliable.  This could be accomplished by marking LR in
>   return_to_handler() as undefined (i.e. .cfi_undefined 30) to use
>   SFrame's outermost frame indication to stop and mark the stack trace
>   as unreliable:

We don't currently have any CFI annotations for return_to_handler(), so
if we interrupt that, any unwind will naturally be marked as unreliable.

The problem is that we can try an unwind from an interrupted *callee* of
return_to_handler(). In that case, we'll unwind through
return_to_handler() using the frame pointer, without consulting SFrame.
The PC will then be part-way through return_to_handler(), but
we only call ftrace_graph_ret_addr() when the PC is the start of
return_to_handler(), and so we don't even try to recover the return
address.

We can handle that better by checking whether the PC is *within*
return_to_handler(), and aborting when the original return address
cannot be recovered. I'm happy to go put that together, and longer term
I would like to do the better recovery I described above such that we can
*always* recover the return address.
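
As a rough illustration of the "within return_to_handler()" check, here
is a user-space sketch rather than kernel code. The names struct
text_range, recover_pc(), lookup_saved() and lookup_popped() are
hypothetical, and the lookup callback only models a helper such as
ftrace_graph_ret_addr(), which is assumed here to hand back its input
address when no original return address can be found:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the text covered by return_to_handler(): [start, end). */
struct text_range {
	uintptr_t start;
	uintptr_t end;
};

enum unwind_status { UNWIND_OK, UNWIND_UNRELIABLE };

/*
 * Stand-in for a lookup such as ftrace_graph_ret_addr(). Assumption: it
 * returns @pc unchanged when no original return address is available.
 */
typedef uintptr_t (*ret_addr_lookup_t)(uintptr_t pc, void *cookie);

/* True if @pc lies anywhere inside the range, not only at its start. */
static bool pc_within(const struct text_range *r, uintptr_t pc)
{
	return pc >= r->start && pc < r->end;
}

/*
 * If *pc is inside the trampoline, try to replace it with the original
 * return address; report the trace as unreliable when that fails.
 */
static enum unwind_status recover_pc(const struct text_range *rth,
				     uintptr_t *pc,
				     ret_addr_lookup_t lookup, void *cookie)
{
	uintptr_t orig;

	assert(rth->start < rth->end);

	if (!pc_within(rth, *pc))
		return UNWIND_OK;		/* not in the trampoline */

	orig = lookup(*pc, cookie);
	if (pc_within(rth, orig))
		return UNWIND_UNRELIABLE;	/* address already popped: abort */

	*pc = orig;
	return UNWIND_OK;
}

/* Example lookup: the original address is still recorded in @cookie. */
static uintptr_t lookup_saved(uintptr_t pc, void *cookie)
{
	(void)pc;
	return *(const uintptr_t *)cookie;
}

/* Example lookup: the entry has been popped, so @pc comes back as-is. */
static uintptr_t lookup_popped(uintptr_t pc, void *cookie)
{
	(void)cookie;
	return pc;
}
```

With lookup_popped() the sketch leaves the PC untouched and reports
UNWIND_UNRELIABLE, which corresponds to the transient window in which
the address is gone from the return stack but not yet back in the LR.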

Mark.
