On Tue, May 12, 2026 at 10:55:28AM +0200, Jens Remus wrote:
> On 5/12/2026 5:00 AM, Dylan Hatch wrote:
> > On Fri, May 1, 2026 at 9:46 AM Mark Rutland <[email protected]>
> > wrote:
> >
> >> (1) For correctness, we'll need to address a latent issue with
> >> unwinding across an fgraph return trampoline, where the return
> >> address is transiently unrecoverable.
> >> I think we can solve that with some restructuring of that code,
> >> restoring the original address *before* removing that from the
> >> fgraph return stack, and ensuring that the unwinder can find it.
> >
> > If my understanding is correct, the issue arises in
> > return_to_handler as the return address is recovered:
> >
> >   mov x0, sp
> >   bl  ftrace_return_to_handler // addr = ftrace_return_to_handler(fregs);
> >   mov x30, x0                  // restore the original return address
> >
> > Because ftrace_return_to_handler pops the return address from the
> > return stack before it can be restored into the LR, it cannot be
> > recovered.
>
> Based on reliable-stacktrace.rst section "4.4 Rewriting of return
> addresses" I wonder whether the following might work:
>
> - If an unwound RA points at return_to_handler the actual RA needs to
>   be obtained using ftrace_graph_ret_addr(). This might already be
>   taken into account if ftrace_graph_ret_addr() is used unconditionally.
>
> - If an unwound RA points into return_to_handler() mark the stack trace
>   as unreliable. This could be accomplished by marking LR in
>   return_to_handler() as undefined (i.e. .cfi_undefined 30) to use
>   SFrame's outermost frame indication to stop and mark the stack trace
>   as unreliable:

We don't currently have any CFI annotations for return_to_handler(), so
if we interrupt that, any unwind will naturally be marked as unreliable.

The problem is that we can try an unwind from an interrupted *callee* of
return_to_handler(). In that case, we'll unwind through
return_to_handler() using the frame pointer, without consulting SFrame.

The PC will then be part-way through return_to_handler(), but we only
call ftrace_graph_ret_addr() when the PC is at the start of
return_to_handler(), and so we don't even try to recover the return
address.

We can handle that better by checking whether the PC is *within*
return_to_handler(), and aborting when the original return address
cannot be recovered.

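As a rough illustration of that check, here is a userspace toy model,
not kernel code. Every name in it (struct ret_stack, resolve_pc(),
pc_in_trampoline(), the TRAMPOLINE_* range) is hypothetical and made up
for the example; the address range simply stands in for the text of
return_to_handler(), and the array stands in for the fgraph return
stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Made-up range standing in for return_to_handler()'s text. */
#define TRAMPOLINE_START 0x1000UL
#define TRAMPOLINE_END   0x1040UL	/* exclusive */

/* Toy stand-in for a per-task fgraph return stack (top is addrs[depth-1]). */
struct ret_stack {
	unsigned long addrs[8];
	size_t depth;
};

enum unwind_status { UNWIND_OK, UNWIND_UNRELIABLE };

/* True for any PC inside the trampoline, not only its first instruction. */
static bool pc_in_trampoline(unsigned long pc)
{
	return pc >= TRAMPOLINE_START && pc < TRAMPOLINE_END;
}

/*
 * Fix up the PC of one unwound frame.
 *
 * PCs outside the trampoline are left alone. A PC anywhere within the
 * trampoline is replaced by the next saved return address, walking from
 * the top of the return stack via *idx. If no saved entry is left (for
 * example because it was already popped), the unwind is aborted as
 * unreliable instead of reporting a bogus address.
 */
static enum unwind_status resolve_pc(const struct ret_stack *rs,
				     size_t *idx, unsigned long *pc)
{
	assert(rs && idx && pc);

	if (!pc_in_trampoline(*pc))
		return UNWIND_OK;

	if (*idx >= rs->depth)
		return UNWIND_UNRELIABLE;

	*pc = rs->addrs[rs->depth - 1 - *idx];
	(*idx)++;
	return UNWIND_OK;
}
```

In this model, the "*idx >= rs->depth" case corresponds to the window
where the entry has been popped but the LR has not yet been restored:
the PC is within the trampoline, nothing can be looked up, and the
unwind is marked unreliable.
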
I'm happy to go put that together, and longer term I would like to do
the better recovery I described above such that we can *always* recover
the return address.

Mark.

