On Tue, 9 Jun 2026 19:12:41 +0800
Tengda Wu <[email protected]> wrote:
>
>
> On 2026/6/9 17:43, Petr Mladek wrote:
> > Added live-patching mailing list.
> >
> > On Tue 2026-06-09 16:49:53, Tengda Wu wrote:
> >> The current check in rethook_find_ret_addr() prevents obtaining a return
> >> address when the target task is marked as running. However, this condition
> >> is both insufficient for correctness and unnecessary for its intended
> >> purpose.
> >>
> >> The check is inherently racy: a task can begin running on another CPU
> >> immediately after task_is_running() returns false, potentially leading to
> >> concurrent modification of rethook data structures while the iteration is
> >> in progress.
> >>
> >> Rather than trying to fix this unreliable check deep in the unwinding
> >> path, simply remove it. The iteration is already safe from crashes because
> >> unwind_next_frame() holds RCU and rethook_node structures are RCU-freed;
> >> even if the iteration goes off the rails and returns invalid information,
> >> it will not crash. Callers that require consistency must provide a safe
> >> context themselves.
> >>
> >> Fixes: 54ecbe6f1ed5 ("rethook: Add a generic return hook")
> >> Acked-by: Peter Zijlstra (Intel) <[email protected]>
> >> Signed-off-by: Tengda Wu <[email protected]>
> >> ---
> >> v3: Improve commit message: clarify safety semantics and document that RCU
> >> guarantees no crash.
> >> v2:
> >> https://lore.kernel.org/all/[email protected]/
> >> v1:
> >> https://lore.kernel.org/all/[email protected]/
> >>
> >> --- a/kernel/trace/rethook.c
> >> +++ b/kernel/trace/rethook.c
> >> @@ -250,9 +250,6 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame
> >>  	if (WARN_ON_ONCE(!cur))
> >>  		return 0;
> >>
> >> -	if (tsk != current && task_is_running(tsk))
> >> -		return 0;
> >> -
> >
> > The description of the function should be updated as well. It still
> > mentions:
> >
> > * The @tsk must be 'current' or a task which is not running.
> >
> > Instead it should explain that it is safe to call the function even
> > on another running task, but the returned address is not reliable
> > then.
> >
>
> Oh, I forgot that. Thanks for pointing it out.

Yeah, but it should also be updated to explain what the caller needs to do.
For example, the caller should hold RCU, or call it only for current.
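
Something along these lines, perhaps. This is only a rough, untested sketch
of the comment wording. The exact text is an assumption and is meant as a
starting point, not as the final kernel-doc:

```c
/*
 * Suggested wording only (hypothetical, not the final text):
 *
 * It is safe to call this function for a @tsk that is running on
 * another CPU, but the returned address is not reliable in that case.
 * A caller that needs a reliable result should either pass 'current'
 * as @tsk, or hold the RCU read lock while walking another task so
 * that the RCU-freed rethook nodes stay valid during the iteration.
 */
```
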
Thanks,
--
Masami Hiramatsu (Google) <[email protected]>