Sashiko raised a question about pidfd_get_task() and PIDFD_THREAD [1], so
I ran some tests to understand the behavior.

[1] https://sashiko.dev/#/patchset/[email protected]

pidfd_get_task() always resolves pidfds using PIDTYPE_TGID (kernel/pid.c
line 640), regardless of whether the pidfd was created with PIDFD_THREAD.
This means:

- A PIDFD_THREAD pidfd for a non-leader thread fails with ESRCH.
- A regular pidfd for a process whose leader has exited (pthread_exit in
  main, secondary thread still alive) also fails with ESRCH.

This is not specific to my patch: process_madvise() uses pidfd_get_task()
in the same way and has the same behavior.

I wrote a test program confirming this:
https://github.com/alban/tests/tree/alban_pvm_flags/pvm_flags/pidfd_thread_test

Results summary:

  All threads alive:
    pidfd_open(pid, 0)            + process_vm_readv: OK
    pidfd_open(tid, PIDFD_THREAD) + process_vm_readv: OK    (leader tid)
    pidfd_open(tid, PIDFD_THREAD) + process_vm_readv: ESRCH (non-leader)

  Leader thread exited (secondary still alive):
    pidfd_open(pid, 0)            + process_vm_readv: ESRCH
    pidfd_open(pid, PIDFD_THREAD) + process_vm_readv: ESRCH
    pidfd_open(tid, PIDFD_THREAD) + process_vm_readv: ESRCH (non-leader)
    process_vm_readv(tid, flags=0)                  : OK    (plain TID path)

  process_madvise() behaves identically in all cases above.

For the non-leader thread case when all threads are alive, this is fine
in practice: all threads share the same mm_struct, so profilers just use
a regular pidfd for the thread-group leader.

However, the exited-leader case is a real limitation for profilers.
OpenTelemetry eBPF Profiler wants to profile a process where the main
thread has exited but secondary threads are still running [2].

[2] https://github.com/open-telemetry/opentelemetry-ebpf-profiler/pull/376

Using plain TIDs (flags=0) would work, but it means users cannot use
PROCESS_VM_PIDFD in this scenario.

What do you think this patch should do? I see two options:

- Address this limitation in a separate future patch that fixes
  pidfd_get_task() to use PIDTYPE_PID when PIDFD_THREAD is detected in
  f_flags, benefiting all callers (process_vm_readv, process_madvise, and
  any future users).
- Address it in this patch series.

