On Wed, 21 May 2025 08:26:05 +0900
Masami Hiramatsu (Google) <mhira...@kernel.org> wrote:

> > Maybe I asked this before but I don't remember if I got the answer. :)
> > How does it handle task exits as it won't go to userspace?  I guess it'll
> > lose user callstacks for exit syscalls and other termination paths.

I just checked, and the good news is that task_work does indeed get called
when a task exits. The bad news is that it happens after do_exit() cleans
up the task's "mm" structure via exit_mm(), which means that current->mm is
NULL by then and there is no user address space left to unwind :-p

There's a proposal to move trace_sched_process_exit() to before exit_mm().
If that happens, we could make that tracepoint a "faultable" tracepoint and
then the unwind infrastructure could attach to it and do the unwinding from
that tracepoint.
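
To make the ordering problem concrete, here is a toy user-space model of
it. This is not kernel code: the Task class, the unwind_user() helper and
the tracepoint_before_exit_mm flag are made up for illustration, and the
only thing it models is "exit_mm() runs before task_work" versus "the exit
tracepoint fires before exit_mm()".

```python
# Toy model of the exit ordering discussed above. Not kernel code:
# Task, unwind_user() and the flag name are hypothetical.

class Task:
    def __init__(self, name):
        self.name = name
        # Stand-in for current->mm; holds a fake user call stack.
        self.mm = {"stack": ["main", "worker", "exit"]}
        self.log = []


def unwind_user(task, where):
    # A user-space unwind needs the address space. Without an mm there
    # is nothing to read, so record None for that attempt.
    if task.mm is None:
        task.log.append((where, None))
    else:
        task.log.append((where, list(task.mm["stack"])))


def do_exit(task, tracepoint_before_exit_mm=False):
    if tracepoint_before_exit_mm:
        # Models the proposed ordering: the exit tracepoint fires while
        # the mm is still valid, so the unwinder can attach to it.
        unwind_user(task, "sched_process_exit")
    task.mm = None  # models exit_mm()
    # task_work runs later in the exit path, after the mm is gone.
    unwind_user(task, "task_work")
    return task.log


if __name__ == "__main__":
    print(do_exit(Task("a")))
    print(do_exit(Task("b"), tracepoint_before_exit_mm=True))
```

In this model the task_work attempt always comes back empty, and only the
run with the tracepoint moved ahead of exit_mm() records a user stack.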

> > 
> > Similarly, it will miss user callstacks in the samples at the end of
> > profiling if the target tasks remain in the kernel (or they sleep).
> > It looks like a fundamental limitation of the deferred callchains.  

Yes, that is a limitation of deferring the unwind until the task returns
to user space.

> 
> Can we use a hybrid approach for this case?
> It might be more balanced (from the performance point of view) to save
> the full stack in a classic way only in this case, rather than faulting
> on process exit or doing file access just to load the sframe.

Another approach is that the tool (like perf) could request to take the
user space stack trace every time a task enters the kernel via a system
call.
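
A rough sketch of how a tool might use that, assuming it sees a stream of
syscall-entry records carrying a user stack plus in-kernel samples. The
event names and the tuple layout are invented for this example and are not
perf's actual record format.

```python
# Tool-side sketch: pair in-kernel samples with the user stack taken at
# syscall entry. Event kinds and tuple layout are hypothetical.

def attach_user_stacks(events):
    """Return (tid, kernel_stack, user_stack) for every sample event."""
    entry_stack = {}  # tid -> user stack from the most recent syscall entry
    out = []
    for kind, tid, payload in events:
        if kind == "sys_enter":
            entry_stack[tid] = payload
        elif kind == "sys_exit":
            # Back in user space; the saved entry stack is stale now.
            entry_stack.pop(tid, None)
        elif kind == "sample":
            # user_stack is None if no syscall entry was seen for this tid.
            out.append((tid, payload, entry_stack.get(tid)))
    return out


if __name__ == "__main__":
    events = [
        ("sys_enter", 1, ["main", "cleanup", "exit_group"]),
        ("sample", 1, ["do_exit", "exit_mm"]),
        ("sample", 1, ["do_exit", "exit_files"]),
    ]
    for row in attach_user_stacks(events):
        print(row)
```

The task in this example never returns to user space, yet both samples
still get a user stack, because it was captured on the way in. The price
is an unwind on every syscall entry whether or not a sample follows.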

-- Steve
