Re: [PATCH] fix group stop with exit race

2007-12-06 Thread Oleg Nesterov
On 12/05, Davide Libenzi wrote:
>
> On Wed, 5 Dec 2007, Oleg Nesterov wrote:
>
> > do_signal_stop() counts all sub-thread and sets ->group_stop_count 
> > accordingly.
> > Every thread should decrement ->group_stop_count and stop, the last one 
> > should
> > notify the parent.
> >
> > However a sub-thread can exit before it notices the signal_pending(), or it 
> > may
> > be somewhere in do_exit() already. In that case the group stop never 
> > finishes
> > properly.
> >
> > Note: this is a minimal fix, we can add some optimizations later. Say we can
> > return quickly if thread_group_empty(). Also, we can move some signal 
> > related
> > code from exit_notify() to exit_signals().
> >
> > Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>
>
> Looks OK for me, even though we're doing more work on the exit path. OTOH
> I don't see a non-racy way of doing it w/out grabbing the lock. Did you
> try to bench how much this change costs?

Yes, you are right, this unconditional spin_lock() is not good, especially for
exit_group/exec.

But please look at the next patch I am sending, it removes the pessimization
almost completely.

The only difference: when there is no group exit in progress, we are doing

spin_lock_irq(siglock);
if (!signal_pending()) {
unlock and return
}

while the current code does

if (!signal_pending())
return;
spin_lock_irq(siglock);
...

It would be nice to measure the difference, but I can't invent the test-case.

I tested (just in case) 10 fork+exit 's

perl -e 'fork ? wait : exit for 1 .. 100_000'

with and without the patch, and didn't notice any difference as expected.

> Acked-by: Davide Libenzi <[EMAIL PROTECTED]>

Thanks for looking at this!

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] fix group stop with exit race

2007-12-06 Thread Oleg Nesterov
On 12/05, Davide Libenzi wrote:

 On Wed, 5 Dec 2007, Oleg Nesterov wrote:

  do_signal_stop() counts all sub-thread and sets -group_stop_count 
  accordingly.
  Every thread should decrement -group_stop_count and stop, the last one 
  should
  notify the parent.
 
  However a sub-thread can exit before it notices the signal_pending(), or it 
  may
  be somewhere in do_exit() already. In that case the group stop never 
  finishes
  properly.
 
  Note: this is a minimal fix, we can add some optimizations later. Say we can
  return quickly if thread_group_empty(). Also, we can move some signal 
  related
  code from exit_notify() to exit_signals().
 
  Signed-off-by: Oleg Nesterov [EMAIL PROTECTED]

 Looks OK for me, even though we're doing more work on the exit path. OTOH
 I don't see a non-racy way of doing it w/out grabbing the lock. Did you
 try to bench how much this change costs?

Yes, you are right, this unconditional spin_lock() is not good, especially for
exit_group/exec.

But please look at the next patch I am sending, it removes the pessimization
almost completely.

The only difference: when there is no group exit in progress, we are doing

spin_lock_irq(siglock);
if (!signal_pending()) {
unlock and return
}

while the current code does

if (!signal_pending())
return;
spin_lock_irq(siglock);
...

It would be nice to measure the difference, but I can't invent the test-case.

I tested (just in case) 10 fork+exit 's

perl -e 'fork ? wait : exit for 1 .. 100_000'

with and without the patch, and didn't notice any difference as expected.

 Acked-by: Davide Libenzi [EMAIL PROTECTED]

Thanks for looking at this!

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] fix group stop with exit race

2007-12-05 Thread Davide Libenzi
On Wed, 5 Dec 2007, Oleg Nesterov wrote:

> do_signal_stop() counts all sub-thread and sets ->group_stop_count 
> accordingly.
> Every thread should decrement ->group_stop_count and stop, the last one should
> notify the parent.
> 
> However a sub-thread can exit before it notices the signal_pending(), or it 
> may
> be somewhere in do_exit() already. In that case the group stop never finishes
> properly.
> 
> Note: this is a minimal fix, we can add some optimizations later. Say we can
> return quickly if thread_group_empty(). Also, we can move some signal related
> code from exit_notify() to exit_signals().
> 
> Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>

Looks OK for me, even though we're doing more work on the exit path. OTOH 
I don't see a non-racy way of doing it w/out grabbing the lock. Did you 
try to bench how much this change costs?
Anyway, looks sane to me...

Acked-by: Davide Libenzi <[EMAIL PROTECTED]>



- Davide


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] fix group stop with exit race

2007-12-05 Thread Oleg Nesterov
do_signal_stop() counts all sub-thread and sets ->group_stop_count accordingly.
Every thread should decrement ->group_stop_count and stop, the last one should
notify the parent.

However a sub-thread can exit before it notices the signal_pending(), or it may
be somewhere in do_exit() already. In that case the group stop never finishes
properly.

Note: this is a minimal fix, we can add some optimizations later. Say we can
return quickly if thread_group_empty(). Also, we can move some signal related
code from exit_notify() to exit_signals().

Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>

--- PT/include/linux/signal.h~1_stop_exit   2007-11-20 17:16:10.0 
+0300
+++ PT/include/linux/signal.h   2007-12-05 19:49:14.0 +0300
@@ -241,6 +241,7 @@ extern int show_unhandled_signals;
 
 struct pt_regs;
 extern int get_signal_to_deliver(siginfo_t *info, struct k_sigaction 
*return_ka, struct pt_regs *regs, void *cookie);
+extern void exit_signals(struct task_struct *tsk);
 
 extern struct kmem_cache *sighand_cachep;
 
--- PT/kernel/signal.c~1_stop_exit  2007-12-02 17:05:15.0 +0300
+++ PT/kernel/signal.c  2007-12-05 20:17:20.0 +0300
@@ -1707,7 +1707,7 @@ static int do_signal_stop(int signr)
 * stop is always done with the siglock held,
 * so this check has no races.
 */
-   if (!t->exit_state &&
+   if (!(t->flags & PF_EXITING) &&
!is_task_stopped_or_traced(t)) {
stop_count++;
signal_wake_up(t, 0);
@@ -1868,6 +1868,31 @@ relock:
return signr;
 }
 
+void exit_signals(struct task_struct *tsk)
+{
+   int group_stop = 0;
+
+   spin_lock_irq(>sighand->siglock);
+   if (unlikely(tsk->signal->group_stop_count) &&
+   !--tsk->signal->group_stop_count) {
+   tsk->signal->flags = SIGNAL_STOP_STOPPED;
+   group_stop = 1;
+   }
+
+   /*
+* From now this task is not visible for group-wide signals,
+* see wants_signal(), do_signal_stop().
+*/
+   tsk->flags |= PF_EXITING;
+   spin_unlock_irq(>sighand->siglock);
+
+   if (unlikely(group_stop)) {
+   read_lock(_lock);
+   do_notify_parent_cldstop(tsk, CLD_STOPPED);
+   read_unlock(_lock);
+   }
+}
+
 EXPORT_SYMBOL(recalc_sigpending);
 EXPORT_SYMBOL_GPL(dequeue_signal);
 EXPORT_SYMBOL(flush_signals);
--- PT/kernel/exit.c~1_stop_exit2007-12-02 15:53:35.0 +0300
+++ PT/kernel/exit.c2007-12-05 19:50:26.0 +0300
@@ -945,7 +945,7 @@ fastcall NORET_TYPE void do_exit(long co
schedule();
}
 
-   tsk->flags |= PF_EXITING;
+   exit_signals(tsk);  /* sets PF_EXITING */
/*
 * tsk->flags are checked in the futex code to protect against
 * an exiting task cleaning up the robust pi futexes.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] fix group stop with exit race

2007-12-05 Thread Oleg Nesterov
do_signal_stop() counts all sub-thread and sets -group_stop_count accordingly.
Every thread should decrement -group_stop_count and stop, the last one should
notify the parent.

However a sub-thread can exit before it notices the signal_pending(), or it may
be somewhere in do_exit() already. In that case the group stop never finishes
properly.

Note: this is a minimal fix, we can add some optimizations later. Say we can
return quickly if thread_group_empty(). Also, we can move some signal related
code from exit_notify() to exit_signals().

Signed-off-by: Oleg Nesterov [EMAIL PROTECTED]

--- PT/include/linux/signal.h~1_stop_exit   2007-11-20 17:16:10.0 
+0300
+++ PT/include/linux/signal.h   2007-12-05 19:49:14.0 +0300
@@ -241,6 +241,7 @@ extern int show_unhandled_signals;
 
 struct pt_regs;
 extern int get_signal_to_deliver(siginfo_t *info, struct k_sigaction 
*return_ka, struct pt_regs *regs, void *cookie);
+extern void exit_signals(struct task_struct *tsk);
 
 extern struct kmem_cache *sighand_cachep;
 
--- PT/kernel/signal.c~1_stop_exit  2007-12-02 17:05:15.0 +0300
+++ PT/kernel/signal.c  2007-12-05 20:17:20.0 +0300
@@ -1707,7 +1707,7 @@ static int do_signal_stop(int signr)
 * stop is always done with the siglock held,
 * so this check has no races.
 */
-   if (!t-exit_state 
+   if (!(t-flags  PF_EXITING) 
!is_task_stopped_or_traced(t)) {
stop_count++;
signal_wake_up(t, 0);
@@ -1868,6 +1868,31 @@ relock:
return signr;
 }
 
+void exit_signals(struct task_struct *tsk)
+{
+   int group_stop = 0;
+
+   spin_lock_irq(tsk-sighand-siglock);
+   if (unlikely(tsk-signal-group_stop_count) 
+   !--tsk-signal-group_stop_count) {
+   tsk-signal-flags = SIGNAL_STOP_STOPPED;
+   group_stop = 1;
+   }
+
+   /*
+* From now this task is not visible for group-wide signals,
+* see wants_signal(), do_signal_stop().
+*/
+   tsk-flags |= PF_EXITING;
+   spin_unlock_irq(tsk-sighand-siglock);
+
+   if (unlikely(group_stop)) {
+   read_lock(tasklist_lock);
+   do_notify_parent_cldstop(tsk, CLD_STOPPED);
+   read_unlock(tasklist_lock);
+   }
+}
+
 EXPORT_SYMBOL(recalc_sigpending);
 EXPORT_SYMBOL_GPL(dequeue_signal);
 EXPORT_SYMBOL(flush_signals);
--- PT/kernel/exit.c~1_stop_exit2007-12-02 15:53:35.0 +0300
+++ PT/kernel/exit.c2007-12-05 19:50:26.0 +0300
@@ -945,7 +945,7 @@ fastcall NORET_TYPE void do_exit(long co
schedule();
}
 
-   tsk-flags |= PF_EXITING;
+   exit_signals(tsk);  /* sets PF_EXITING */
/*
 * tsk-flags are checked in the futex code to protect against
 * an exiting task cleaning up the robust pi futexes.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] fix group stop with exit race

2007-12-05 Thread Davide Libenzi
On Wed, 5 Dec 2007, Oleg Nesterov wrote:

 do_signal_stop() counts all sub-thread and sets -group_stop_count 
 accordingly.
 Every thread should decrement -group_stop_count and stop, the last one should
 notify the parent.
 
 However a sub-thread can exit before it notices the signal_pending(), or it 
 may
 be somewhere in do_exit() already. In that case the group stop never finishes
 properly.
 
 Note: this is a minimal fix, we can add some optimizations later. Say we can
 return quickly if thread_group_empty(). Also, we can move some signal related
 code from exit_notify() to exit_signals().
 
 Signed-off-by: Oleg Nesterov [EMAIL PROTECTED]

Looks OK for me, even though we're doing more work on the exit path. OTOH 
I don't see a non-racy way of doing it w/out grabbing the lock. Did you 
try to bench how much this change costs?
Anyway, looks sane to me...

Acked-by: Davide Libenzi [EMAIL PROTECTED]



- Davide


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/