This is a patch submission that addresses the issues first described at 
https://lists.debian.org/debian-hurd/2025/07/msg00046.html.

The following summarises a scenario in which the task termination deadlock occurs:

1) stress-ng (parent process) kills (SIGKILL) stress-ng (child process). This
results in task_terminate(child) being invoked by the parent task.
2) Meanwhile, the child process is handling SIGALRM (in thread1), which
suspends thread0 while holding a thread reference to thread0.
3) The child's thread1 is held (thread_hold called by the parent), which
ultimately leaves thread1 in the TH_SUSP state.
4) The parent process tries to terminate the child's threads in sequence.
Thread0 cannot be completely terminated because thread1 still holds a
reference to thread0.
5) The parent process is now stuck, repeatedly trying to terminate thread0 in
a continuous loop.

I've altered task termination so that it doesn't have to wait for the head
thread to terminate before attempting to terminate the remainder. This way
thread1 gets a chance to terminate and release its reference to thread0,
which can then terminate successfully on the next pass.
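
For illustration only, the difference between the two strategies can be
sketched with a small toy model in C. This is not the gnumach code: the struct
and the toy_* functions are hypothetical names, a plain pointer stands in for
the thread reference that thread1 holds on thread0, and a pass limit is added
so that the stuck case returns instead of spinning forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not gnumach data structures. */
struct toy_thread {
    bool terminated;
    struct toy_thread *holds_ref_to; /* thread this one references, or NULL */
};

/* A thread finishes terminating only if no live thread still references it. */
static bool toy_try_terminate(struct toy_thread *threads, size_t n, size_t i)
{
    assert(i < n);
    for (size_t j = 0; j < n; j++) {
        if (j != i && !threads[j].terminated &&
            threads[j].holds_ref_to == &threads[i])
            return false;
    }
    threads[i].terminated = true;
    threads[i].holds_ref_to = NULL; /* a terminated thread drops its reference */
    return true;
}

/* Modelled old behaviour: retry the head thread until it is gone before
 * moving on. Returns how many threads terminated within max_passes. */
static size_t toy_terminate_head_first(struct toy_thread *threads, size_t n,
                                       size_t max_passes)
{
    size_t head = 0;
    for (size_t pass = 0; pass < max_passes && head < n; pass++) {
        if (toy_try_terminate(threads, n, head))
            head++;
    }
    return head;
}

/* Modelled new behaviour: each pass attempts every remaining thread, so a
 * later thread can terminate even while the head thread cannot yet. */
static size_t toy_terminate_sweep(struct toy_thread *threads, size_t n,
                                  size_t max_passes)
{
    size_t done = 0;
    for (size_t pass = 0; pass < max_passes && done < n; pass++) {
        for (size_t i = 0; i < n; i++) {
            if (!threads[i].terminated && toy_try_terminate(threads, n, i))
                done++;
        }
    }
    return done;
}
```

With two threads where thread1 references thread0, the head-first loop in this
model makes no progress however many passes it is given, whereas the sweep
terminates thread1 on the first pass and thread0 on the second.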

I've been running this patched version for many months without any obvious
negative side effects.

Regards,

Mike.
