Re: PATCH? fix SIGNAL_STOP_DEQUEUED vs SIGCONT race

2007-09-03 Thread Roland McGrath
> Yes, I also thought about something like this, but tried to avoid because
> it adds some complications. OTOH, this is not the fast path.

I agree it's not real pleasing, and would be glad to find better ideas.

> I'll try to think a bit more about this, and update the patch according
> to your comments. Looks like we don't need to check SIGNAL_GROUP_EXIT
> here, we are doing this later anyway.

I look forward to your next version.  


Thanks,
Roland
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: PATCH? fix SIGNAL_STOP_DEQUEUED vs SIGCONT race

2007-09-03 Thread Roland McGrath
 Yes, I also thought about something like this, but tried to avoid because
 it adds some complications. OTOH, this is not the fast path.

I agree it's not real pleasing, and would be glad to find better ideas.

 I'll try to think a bit more about this, and update the patch according
 to your comments. Looks like we don't need to check SIGNAL_GROUP_EXIT
 here, we are doing this later anyway.

I look forward to your next version.  


Thanks,
Roland
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: PATCH? fix SIGNAL_STOP_DEQUEUED vs SIGCONT race

2007-09-01 Thread Oleg Nesterov
On 09/01, Roland McGrath wrote:
>
> > However, this changes the behaviour when the task is ptraced. If the 
> > debugger
> > doesn't clear ->exit_code, SIGSTOP always succeeds after ptrace_stop(), even
> > if SIGCONT was sent in between. I can't decide whether this change is good
> > or bad, hopefully Roland can clarify.
> 
> Hmm.  I think this is bad.  
> 
> First, considering only the single-threaded case, there are debugger vs
> SIGCONT races.  Someone does kill(pid,SIGSTOP);kill(pid,SIGCONT); while pid
> is debugged.  The mandate for end user behavior here is that pid cannot
> wind up sitting in job control stop in the end.  Say the debugger is
> e.g. strace, simply printing every signal and passing it through.
> So say it goes:
>   T   K   D
>   merrily running ... blocked in wait4
>   kill(K, SIGSTOP)
>   dequeue SIGSTOP
>-> ptrace_stop
>   wait4 -> K,{SIGSTOP}
>   kill(K, SIGCONT)
>   PTRACE_CONT,K,SIGSTOP
>   do_signal_stop(SIGSTOP)
>   wait4 -> K,{SIGSTOP}

Thanks Roland.

Yes, that is what I was worrying about.

> It's still probably a worthwhile cleanup to have the logic only in
> get_signal_to_deliver, and to fix the problem you cited.  It will take only
> a little extra code to handle the ptrace case too, i.e.
>   if (sig_kernel_stop(signr) &&
>   current->sighand->action[signr-1] == SIG_DFL &&
>   !(current->signal->flags & SIGNAL_GROUP_EXIT))
>   current->signal->flags |= SIGNAL_STOP_DEQUEUED;
>   ptrace_stop(signr, signr, info);
>   if (sig_kernel_stop(signr) && current->exit_code == signr &&
>   !(current->signal->flags & SIGNAL_STOP_DEQUEUED) &&
>   current->sighand->action[signr-1] == SIG_DFL)
>   current->exit_code = 0;

Yes, I also thought about something like this, but tried to avoid because
it adds some complications. OTOH, this is not the fast path.

I'll try to think a bit more about this, and update the patch according
to your comments. Looks like we don't need to check SIGNAL_GROUP_EXIT
here, we are doing this later anyway.

Thanks!

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: PATCH? fix SIGNAL_STOP_DEQUEUED vs SIGCONT race

2007-09-01 Thread Roland McGrath
> Move the setting of SIGNAL_STOP_DEQUEUED from dequeue_signal() to
> get_signal_to_deliver(), and set this flag only if we are really going to 
> stop.
> This also simplifies the code and makes the SIGNAL_STOP_DEQUEUED usage more
> understandable.

It looks like a nice cleanup to me.

> However, this changes the behaviour when the task is ptraced. If the debugger
> doesn't clear ->exit_code, SIGSTOP always succeeds after ptrace_stop(), even
> if SIGCONT was sent in between. I can't decide whether this change is good
> or bad, hopefully Roland can clarify.

Hmm.  I think this is bad.  

First, considering only the single-threaded case, there are debugger vs
SIGCONT races.  Someone does kill(pid,SIGSTOP);kill(pid,SIGCONT); while pid
is debugged.  The mandate for end user behavior here is that pid cannot
wind up sitting in job control stop in the end.  Say the debugger is
e.g. strace, simply printing every signal and passing it through.
So say it goes:
T   K   D
merrily running ... blocked in wait4
kill(K, SIGSTOP)
dequeue SIGSTOP
 -> ptrace_stop
wait4 -> K,{SIGSTOP}
kill(K, SIGCONT)
PTRACE_CONT,K,SIGSTOP
do_signal_stop(SIGSTOP)
wait4 -> K,{SIGSTOP}

The debugger did the obvious thing: just pass through what it got.
The killer did something POSIX guarantees leaves T running.
T is in job control stop, with a pending SIGCONT (normally impossible).
It's not the end of the world, since the next SIGKILL or SIGCONT will
always wake it up.  But it's not right.

There are related issues with multi-threaded job control stop and with
suddenly killing the debugger.  I could get into those in detail.  But I
think the case above illustrates why we need a stop signal pending
consideration by the debugger to be like a stop signal pending locking for
the is_current_pgrp_orphaned() check when a SIGCONT/SIGKILL comes in between.

It's still probably a worthwhile cleanup to have the logic only in
get_signal_to_deliver, and to fix the problem you cited.  It will take only
a little extra code to handle the ptrace case too, i.e.
if (sig_kernel_stop(signr) &&
current->sighand->action[signr-1] == SIG_DFL &&
!(current->signal->flags & SIGNAL_GROUP_EXIT))
current->signal->flags |= SIGNAL_STOP_DEQUEUED;
ptrace_stop(signr, signr, info);
if (sig_kernel_stop(signr) && current->exit_code == signr &&
!(current->signal->flags & SIGNAL_STOP_DEQUEUED) &&
current->sighand->action[signr-1] == SIG_DFL)
current->exit_code = 0;
maybe.


Thanks,
Roland
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: PATCH? fix SIGNAL_STOP_DEQUEUED vs SIGCONT race

2007-09-01 Thread Roland McGrath
 Move the setting of SIGNAL_STOP_DEQUEUED from dequeue_signal() to
 get_signal_to_deliver(), and set this flag only if we are really going to 
 stop.
 This also simplifies the code and makes the SIGNAL_STOP_DEQUEUED usage more
 understandable.

It looks like a nice cleanup to me.

 However, this changes the behaviour when the task is ptraced. If the debugger
 doesn't clear -exit_code, SIGSTOP always succeeds after ptrace_stop(), even
 if SIGCONT was sent in between. I can't decide whether this change is good
 or bad, hopefully Roland can clarify.

Hmm.  I think this is bad.  

First, considering only the single-threaded case, there are debugger vs
SIGCONT races.  Someone does kill(pid,SIGSTOP);kill(pid,SIGCONT); while pid
is debugged.  The mandate for end user behavior here is that pid cannot
wind up sitting in job control stop in the end.  Say the debugger is
e.g. strace, simply printing every signal and passing it through.
So say it goes:
T   K   D
merrily running ... blocked in wait4
kill(K, SIGSTOP)
dequeue SIGSTOP
 - ptrace_stop
wait4 - K,{SIGSTOP}
kill(K, SIGCONT)
PTRACE_CONT,K,SIGSTOP
do_signal_stop(SIGSTOP)
wait4 - K,{SIGSTOP}

The debugger did the obvious thing: just pass through what it got.
The killer did something POSIX guarantees leaves T running.
T is in job control stop, with a pending SIGCONT (normally impossible).
It's not the end of the world, since the next SIGKILL or SIGCONT will
always wake it up.  But it's not right.

There are related issues with multi-threaded job control stop and with
suddenly killing the debugger.  I could get into those in detail.  But I
think the case above illustrates why we need a stop signal pending
consideration by the debugger to be like a stop signal pending locking for
the is_current_pgrp_orphaned() check when a SIGCONT/SIGKILL comes in between.

It's still probably a worthwhile cleanup to have the logic only in
get_signal_to_deliver, and to fix the problem you cited.  It will take only
a little extra code to handle the ptrace case too, i.e.
if (sig_kernel_stop(signr) 
current-sighand-action[signr-1] == SIG_DFL 
!(current-signal-flags  SIGNAL_GROUP_EXIT))
current-signal-flags |= SIGNAL_STOP_DEQUEUED;
ptrace_stop(signr, signr, info);
if (sig_kernel_stop(signr)  current-exit_code == signr 
!(current-signal-flags  SIGNAL_STOP_DEQUEUED) 
current-sighand-action[signr-1] == SIG_DFL)
current-exit_code = 0;
maybe.


Thanks,
Roland
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: PATCH? fix SIGNAL_STOP_DEQUEUED vs SIGCONT race

2007-09-01 Thread Oleg Nesterov
On 09/01, Roland McGrath wrote:

  However, this changes the behaviour when the task is ptraced. If the 
  debugger
  doesn't clear -exit_code, SIGSTOP always succeeds after ptrace_stop(), even
  if SIGCONT was sent in between. I can't decide whether this change is good
  or bad, hopefully Roland can clarify.
 
 Hmm.  I think this is bad.  
 
 First, considering only the single-threaded case, there are debugger vs
 SIGCONT races.  Someone does kill(pid,SIGSTOP);kill(pid,SIGCONT); while pid
 is debugged.  The mandate for end user behavior here is that pid cannot
 wind up sitting in job control stop in the end.  Say the debugger is
 e.g. strace, simply printing every signal and passing it through.
 So say it goes:
   T   K   D
   merrily running ... blocked in wait4
   kill(K, SIGSTOP)
   dequeue SIGSTOP
- ptrace_stop
   wait4 - K,{SIGSTOP}
   kill(K, SIGCONT)
   PTRACE_CONT,K,SIGSTOP
   do_signal_stop(SIGSTOP)
   wait4 - K,{SIGSTOP}

Thanks Roland.

Yes, that is what I was worrying about.

 It's still probably a worthwhile cleanup to have the logic only in
 get_signal_to_deliver, and to fix the problem you cited.  It will take only
 a little extra code to handle the ptrace case too, i.e.
   if (sig_kernel_stop(signr) 
   current-sighand-action[signr-1] == SIG_DFL 
   !(current-signal-flags  SIGNAL_GROUP_EXIT))
   current-signal-flags |= SIGNAL_STOP_DEQUEUED;
   ptrace_stop(signr, signr, info);
   if (sig_kernel_stop(signr)  current-exit_code == signr 
   !(current-signal-flags  SIGNAL_STOP_DEQUEUED) 
   current-sighand-action[signr-1] == SIG_DFL)
   current-exit_code = 0;

Yes, I also thought about something like this, but tried to avoid because
it adds some complications. OTOH, this is not the fast path.

I'll try to think a bit more about this, and update the patch according
to your comments. Looks like we don't need to check SIGNAL_GROUP_EXIT
here, we are doing this later anyway.

Thanks!

Oleg.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


PATCH? fix SIGNAL_STOP_DEQUEUED vs SIGCONT race

2007-08-28 Thread Oleg Nesterov
Two threads, T1 and T2, SIGTTIN is SIG_DFL, SIGTTOU has a handler.

T1 does get_signal_to_deliver(), dequeus SIGTTIN, unlocks ->siglock.

SIGCONT comes in, nothing to do yet, just clear SIGNAL_STOP_DEQUEUED.

SIGTTOU is sent, T2 dequeues it and sets SIGNAL_STOP_DEQUEUED again.
It has a handler, we shouldn't stop, but

T1 continues, takes ->siglock, and calls do_signal_stop().

Move the setting of SIGNAL_STOP_DEQUEUED from dequeue_signal() to
get_signal_to_deliver(), and set this flag only if we are really going to stop.
This also simplifies the code and makes the SIGNAL_STOP_DEQUEUED usage more
understandable.

However, this changes the behaviour when the task is ptraced. If the debugger
doesn't clear ->exit_code, SIGSTOP always succeeds after ptrace_stop(), even
if SIGCONT was sent in between. I can't decide whether this change is good
or bad, hopefully Roland can clarify.

Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>

--- t/kernel/signal.c~S_G_S 2007-08-23 16:02:57.0 +0400
+++ t/kernel/signal.c   2007-08-28 19:15:28.0 +0400
@@ -409,22 +409,7 @@ int dequeue_signal(struct task_struct *t
}
if (likely(tsk == current))
recalc_sigpending();
-   if (signr && unlikely(sig_kernel_stop(signr))) {
-   /*
-* Set a marker that we have dequeued a stop signal.  Our
-* caller might release the siglock and then the pending
-* stop signal it is about to process is no longer in the
-* pending bitmasks, but must still be cleared by a SIGCONT
-* (and overruled by a SIGKILL).  So those cases clear this
-* shared flag after we've set it.  Note that this flag may
-* remain set after the signal we return is ignored or
-* handled.  That doesn't matter because its only purpose
-* is to alert stop-signal processing code when another
-* processor has come along and cleared the flag.
-*/
-   if (!(tsk->signal->flags & SIGNAL_GROUP_EXIT))
-   tsk->signal->flags |= SIGNAL_STOP_DEQUEUED;
-   }
+
if (signr && likely(tsk == current) &&
 ((info->si_code & __SI_MASK) == __SI_TIMER) &&
 info->si_sys_private){
@@ -1683,9 +1668,6 @@ static int do_signal_stop(int signr)
struct signal_struct *sig = current->signal;
int stop_count;
 
-   if (!likely(sig->flags & SIGNAL_STOP_DEQUEUED))
-   return 0;
-
if (sig->group_stop_count > 0) {
/*
 * There is a group stop in progress.  We don't need to
@@ -1849,6 +1831,9 @@ relock:
continue;
 
if (sig_kernel_stop(signr)) {
+   if (current->signal->flags & SIGNAL_GROUP_EXIT)
+   continue;
+
/*
 * The default action is to stop all threads in
 * the thread group.  The job control signals
@@ -1860,14 +1845,20 @@ relock:
 * We need to check for that and bail out if necessary.
 */
if (signr != SIGSTOP) {
+   /*
+* Set a marker that we have dequeued a stop
+* signal. We are going to unlock ->siglock,
+* another signal can cancel group stop request
+* during this window. In that case this marker
+* will be cleared when we re-check below.
+*/
+   current->signal->flags |= SIGNAL_STOP_DEQUEUED;
spin_unlock_irq(>sighand->siglock);
-
-   /* signals can be posted during this window */
-
if (is_current_pgrp_orphaned())
goto relock;
-
spin_lock_irq(>sighand->siglock);
+   if (!(current->signal->flags & 
SIGNAL_STOP_DEQUEUED))
+   continue;
}
 
if (likely(do_signal_stop(signr))) {

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


PATCH? fix SIGNAL_STOP_DEQUEUED vs SIGCONT race

2007-08-28 Thread Oleg Nesterov
Two threads, T1 and T2, SIGTTIN is SIG_DFL, SIGTTOU has a handler.

T1 does get_signal_to_deliver(), dequeus SIGTTIN, unlocks -siglock.

SIGCONT comes in, nothing to do yet, just clear SIGNAL_STOP_DEQUEUED.

SIGTTOU is sent, T2 dequeues it and sets SIGNAL_STOP_DEQUEUED again.
It has a handler, we shouldn't stop, but

T1 continues, takes -siglock, and calls do_signal_stop().

Move the setting of SIGNAL_STOP_DEQUEUED from dequeue_signal() to
get_signal_to_deliver(), and set this flag only if we are really going to stop.
This also simplifies the code and makes the SIGNAL_STOP_DEQUEUED usage more
understandable.

However, this changes the behaviour when the task is ptraced. If the debugger
doesn't clear -exit_code, SIGSTOP always succeeds after ptrace_stop(), even
if SIGCONT was sent in between. I can't decide whether this change is good
or bad, hopefully Roland can clarify.

Signed-off-by: Oleg Nesterov [EMAIL PROTECTED]

--- t/kernel/signal.c~S_G_S 2007-08-23 16:02:57.0 +0400
+++ t/kernel/signal.c   2007-08-28 19:15:28.0 +0400
@@ -409,22 +409,7 @@ int dequeue_signal(struct task_struct *t
}
if (likely(tsk == current))
recalc_sigpending();
-   if (signr  unlikely(sig_kernel_stop(signr))) {
-   /*
-* Set a marker that we have dequeued a stop signal.  Our
-* caller might release the siglock and then the pending
-* stop signal it is about to process is no longer in the
-* pending bitmasks, but must still be cleared by a SIGCONT
-* (and overruled by a SIGKILL).  So those cases clear this
-* shared flag after we've set it.  Note that this flag may
-* remain set after the signal we return is ignored or
-* handled.  That doesn't matter because its only purpose
-* is to alert stop-signal processing code when another
-* processor has come along and cleared the flag.
-*/
-   if (!(tsk-signal-flags  SIGNAL_GROUP_EXIT))
-   tsk-signal-flags |= SIGNAL_STOP_DEQUEUED;
-   }
+
if (signr  likely(tsk == current) 
 ((info-si_code  __SI_MASK) == __SI_TIMER) 
 info-si_sys_private){
@@ -1683,9 +1668,6 @@ static int do_signal_stop(int signr)
struct signal_struct *sig = current-signal;
int stop_count;
 
-   if (!likely(sig-flags  SIGNAL_STOP_DEQUEUED))
-   return 0;
-
if (sig-group_stop_count  0) {
/*
 * There is a group stop in progress.  We don't need to
@@ -1849,6 +1831,9 @@ relock:
continue;
 
if (sig_kernel_stop(signr)) {
+   if (current-signal-flags  SIGNAL_GROUP_EXIT)
+   continue;
+
/*
 * The default action is to stop all threads in
 * the thread group.  The job control signals
@@ -1860,14 +1845,20 @@ relock:
 * We need to check for that and bail out if necessary.
 */
if (signr != SIGSTOP) {
+   /*
+* Set a marker that we have dequeued a stop
+* signal. We are going to unlock -siglock,
+* another signal can cancel group stop request
+* during this window. In that case this marker
+* will be cleared when we re-check below.
+*/
+   current-signal-flags |= SIGNAL_STOP_DEQUEUED;
spin_unlock_irq(current-sighand-siglock);
-
-   /* signals can be posted during this window */
-
if (is_current_pgrp_orphaned())
goto relock;
-
spin_lock_irq(current-sighand-siglock);
+   if (!(current-signal-flags  
SIGNAL_STOP_DEQUEUED))
+   continue;
}
 
if (likely(do_signal_stop(signr))) {

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/