On 10/08, Oleg Nesterov wrote:
>
> and execing thread waits for execing thread. Deadlock.
      ^^^^^^^
Sorry, typo in the changelog....

------------------------------------------------------------------------------
[PATCH 1/1] exec: make de_thread() killable

Change de_thread() to use KILLABLE rather than UNINTERRUPTIBLE
while waiting for other threads. The only complication is that
we should clear ->group_exit_task and ->notify_count before we
return, and we should do this under tasklist_lock. -EAGAIN is
used to match the initial signal_group_exit() check/return; the
exact value doesn't really matter.

This fixes the (unlikely) race with coredump. de_thread() checks
signal_group_exit() before it starts to kill the subthreads, but
this can't help if another CLONE_VM (but non-CLONE_THREAD) task
starts coredumping after de_thread() unlocks ->siglock. In
this case the killed sub-thread can block in exit_mm() waiting
for coredump_finish(), the execing thread waits for that sub-thread,
and the coredumping thread waits for the execing thread. Deadlock.

Signed-off-by: Oleg Nesterov <[email protected]>
---
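Not part of the commit: a minimal userspace sketch of the wait loop
described in the changelog, as a reading aid only. Everything named
model_* or EV_*, and MODEL_STUCK, is hypothetical and does not exist in
the kernel. A "wakeup" is modelled as one entry in an event array, and
running out of events stands in for sleeping forever. The real code
sleeps in schedule() and takes ->siglock and tasklist_lock, none of
which is modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two signal_struct fields the patch clears. */
struct model_signal {
	void *group_exit_task;	/* non-NULL while an exec is waiting */
	int notify_count;	/* sub-threads the exec still waits for */
};

/* Hypothetical events that could end one sleep of the execing thread. */
enum model_event { EV_THREAD_EXITED, EV_FATAL_SIGNAL };

/* Hypothetical result: no wakeups left, i.e. the waiter would sleep forever. */
#define MODEL_STUCK 1

/*
 * Model of the first wait loop in de_thread().
 * killable == false models TASK_UNINTERRUPTIBLE: a fatal signal does not
 * end the wait. killable == true models TASK_KILLABLE plus the new
 * "killed:" path: clear both fields and return -EAGAIN.
 */
int model_de_thread_wait(struct model_signal *sig, bool killable,
			 const enum model_event *ev, size_t n)
{
	assert(sig != NULL);

	for (size_t i = 0; sig->notify_count; i++) {
		if (i == n)
			return MODEL_STUCK;

		if (ev[i] == EV_THREAD_EXITED) {
			sig->notify_count--;
		} else if (killable) {
			/* mirrors "killed:" in the patch */
			sig->group_exit_task = NULL;
			sig->notify_count = 0;
			return -EAGAIN;
		}
		/* uninterruptible: the fatal signal is ignored, keep waiting */
	}
	return 0;
}
```

With events { EV_THREAD_EXITED, EV_FATAL_SIGNAL } and notify_count == 2,
the uninterruptible variant ends in MODEL_STUCK with notify_count still
at 1, while the killable variant returns -EAGAIN with both fields
cleared.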
 fs/exec.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

--- a/fs/exec.c
+++ b/fs/exec.c
@@ -878,9 +878,11 @@ static int de_thread(struct task_struct 
                sig->notify_count--;
 
        while (sig->notify_count) {
-               __set_current_state(TASK_UNINTERRUPTIBLE);
+               __set_current_state(TASK_KILLABLE);
                spin_unlock_irq(lock);
                schedule();
+               if (unlikely(__fatal_signal_pending(tsk)))
+                       goto killed;
                spin_lock_irq(lock);
        }
        spin_unlock_irq(lock);
@@ -898,9 +900,11 @@ static int de_thread(struct task_struct 
                        write_lock_irq(&tasklist_lock);
                        if (likely(leader->exit_state))
                                break;
-                       __set_current_state(TASK_UNINTERRUPTIBLE);
+                       __set_current_state(TASK_KILLABLE);
                        write_unlock_irq(&tasklist_lock);
                        schedule();
+                       if (unlikely(__fatal_signal_pending(tsk)))
+                               goto killed;
                }
 
                /*
@@ -994,6 +998,14 @@ no_thread_group:
 
        BUG_ON(!thread_group_leader(tsk));
        return 0;
+
+killed:
+       /* protects against exit_notify() and __exit_signal() */
+       read_lock(&tasklist_lock);
+       sig->group_exit_task = NULL;
+       sig->notify_count = 0;
+       read_unlock(&tasklist_lock);
+       return -EAGAIN;
 }
 
 char *get_task_comm(char *buf, struct task_struct *tsk)
