Re: [PATCH v3 20/20] signal: Don't restart fork when signals come in.
Oleg Nesterov writes: > On 07/26, Eric W. Biederman wrote: >> >> Are the earlier patches looking ok to you? > > I obviously like 1-15. > > "[PATCH 16/20] fork: Move and describe why the code examines PIDNS_ADDING" > is "interesting". I mean it is fine, but at the end of this series it doesn't > matter what we check first, PIDNS_ADDING or fatal_signal_pending() - restart > is not possible in both cases. > > > As for 17-20... Yes I am biased. But I still think the simple approach I tried > to propose from the very beginning is better. At least simpler, in that you do > not need to worry about all these special cases/reasons for signal_pending(). I think worrying about them all now results in a future where we don't have to worry about reasons why we can't let fork continue. Giving a better progress guarantee. Which ultimately should be more maintainable going forward. > And you can not imagine how much I hate "[PATCH 19/20] fork: Have new threads > join on-going signal group stops" ;) Because I spent HOURS looking at this > trivial > patch and I am still not sure... > > To clarify, the CLONE_THREAD with JOBCTL_STOP_PENDING case is simple, I am > mostly > worried about JOBCTL_TRAP_STOP/etc with or without CLONE_THREAD, this adds > some > subtle changes but unfortunately I failed to find something wrong so I > can't argue I can understand taking a hard look at JOBCTL_TRAP_STOP especially as it gets mixed in with the multi-task (whole process) stop handling when at least one of the tasks of a process are being ptraced. To make certain I understood your concern I took a second look at it myself. The ptrace actions are defined to only affect a single task, and except for multi-task stop handling all of the jobctl bits are used for ptrace actions. So I don't see how there is anything we could possibly miss in the jobctl bits. Eric
Re: [PATCH v3 20/20] signal: Don't restart fork when signals come in.
Oleg Nesterov writes: > On 07/26, Eric W. Biederman wrote: >> >> Are the earlier patches looking ok to you? > > I obviously like 1-15. > > "[PATCH 16/20] fork: Move and describe why the code examines PIDNS_ADDING" > is "interesting". I mean it is fine, but at the end of this series it doesn't > matter what we check first, PIDNS_ADDING or fatal_signal_pending() - restart > is not possible in both cases. > > > As for 17-20... Yes I am biased. But I still think the simple approach I tried > to propose from the very beginning is better. At least simpler, in that you do > not need to worry about all these special cases/reasons for signal_pending(). I think worrying about them all now results in a future where we don't have to worry about reasons why we can't let fork continue. Giving a better progress guarantee. Which ultimately should be more maintainable going forward. > And you can not imagine how much I hate "[PATCH 19/20] fork: Have new threads > join on-going signal group stops" ;) Because I spent HOURS looking at this > trivial > patch and I am still not sure... > > To clarify, the CLONE_THREAD with JOBCTL_STOP_PENDING case is simple, I am > mostly > worried about JOBCTL_TRAP_STOP/etc with or without CLONE_THREAD, this adds > some > subtle changes but unfortunately I failed to find something wrong so I > can't argue I can understand taking a hard look at JOBCTL_TRAP_STOP especially as it gets mixed in with the multi-task (whole process) stop handling when at least one of the tasks of a process are being ptraced. To make certain I understood your concern I took a second look at it myself. The ptrace actions are defined to only affect a single task, and except for multi-task stop handling all of the jobctl bits are used for ptrace actions. So I don't see how there is anything we could possibly miss in the jobctl bits. Eric
Re: [PATCH v3 20/20] signal: Don't restart fork when signals come in.
On 07/24, Eric W. Biederman wrote: > > Similarly the current code will also miss blocked > signals that are delivered to multiple process, as those signals will > not appear pending during fork. Well, I still think this needs a separate change and discussion... Let me repeat, I simply do not know if this is good or not, I don't know if the current behaviour is by design or it is mistake. OK, I won't argue but note that your patch doesn't really fix the problem, > + spin_lock_irq(>sighand->siglock); > + if (!(clone_flags & CLONE_THREAD)) > + hlist_add_head(, >signal->multiprocess); > + recalc_sigpending(); > + spin_unlock_irq(>sighand->siglock); > + retval = -ERESTARTNOINTR; > + if (signal_pending(current)) > + goto fork_out; because recalc_sigpending() will not notice the blocked multiprocess signal if it is already pending. Oleg.
Re: [PATCH v3 20/20] signal: Don't restart fork when signals come in.
On 07/24, Eric W. Biederman wrote: > > Similarly the current code will also miss blocked > signals that are delivered to multiple process, as those signals will > not appear pending during fork. Well, I still think this needs a separate change and discussion... Let me repeat, I simply do not know if this is good or not, I don't know if the current behaviour is by design or it is mistake. OK, I won't argue but note that your patch doesn't really fix the problem, > + spin_lock_irq(>sighand->siglock); > + if (!(clone_flags & CLONE_THREAD)) > + hlist_add_head(, >signal->multiprocess); > + recalc_sigpending(); > + spin_unlock_irq(>sighand->siglock); > + retval = -ERESTARTNOINTR; > + if (signal_pending(current)) > + goto fork_out; because recalc_sigpending() will not notice the blocked multiprocess signal if it is already pending. Oleg.
Re: [PATCH v3 20/20] signal: Don't restart fork when signals come in.
On 07/26, Eric W. Biederman wrote: > > Are the earlier patches looking ok to you? I obviously like 1-15. "[PATCH 16/20] fork: Move and describe why the code examines PIDNS_ADDING" is "interesting". I mean it is fine, but at the end of this series it doesn't matter what we check first, PIDNS_ADDING or fatal_signal_pending() - restart is not possible in both cases. As for 17-20... Yes I am biased. But I still think the simple approach I tried to propose from the very beginning is better. At least simpler, in that you do not need to worry about all these special cases/reasons for signal_pending(). And you can not imagine how much I hate "[PATCH 19/20] fork: Have new threads join on-going signal group stops" ;) Because I spent HOURS looking at this trivial patch and I am still not sure... To clarify, the CLONE_THREAD with JOBCTL_STOP_PENDING case is simple, I am mostly worried about JOBCTL_TRAP_STOP/etc with or without CLONE_THREAD, this adds some subtle changes but unfortunately I failed to find something wrong so I can't argue. Oleg.
Re: [PATCH v3 20/20] signal: Don't restart fork when signals come in.
On 07/26, Eric W. Biederman wrote: > > Are the earlier patches looking ok to you? I obviously like 1-15. "[PATCH 16/20] fork: Move and describe why the code examines PIDNS_ADDING" is "interesting". I mean it is fine, but at the end of this series it doesn't matter what we check first, PIDNS_ADDING or fatal_signal_pending() - restart is not possible in both cases. As for 17-20... Yes I am biased. But I still think the simple approach I tried to propose from the very beginning is better. At least simpler, in that you do not need to worry about all these special cases/reasons for signal_pending(). And you can not imagine how much I hate "[PATCH 19/20] fork: Have new threads join on-going signal group stops" ;) Because I spent HOURS looking at this trivial patch and I am still not sure... To clarify, the CLONE_THREAD with JOBCTL_STOP_PENDING case is simple, I am mostly worried about JOBCTL_TRAP_STOP/etc with or without CLONE_THREAD, this adds some subtle changes but unfortunately I failed to find something wrong so I can't argue. Oleg.
Re: [PATCH v3 20/20] signal: Don't restart fork when signals come in.
Oleg Nesterov writes: > On 07/24, Eric W. Biederman wrote: >> >> @@ -1979,6 +1983,8 @@ static __latent_entropy struct task_struct >> *copy_process( >> attach_pid(p, PIDTYPE_TGID); >> attach_pid(p, PIDTYPE_PGID); >> attach_pid(p, PIDTYPE_SID); >> +p->signal->shared_pending.signal = delayed.signal; > > Again, in this case we do not hold p->sighand->siglock (unless CLONE_SIGHAND), > so this should be done before list_add_tail/attach_pid above. Plus we need > some > sort of barrier. > > Or we can do > > if (!CLONE_SIGHAND) > spin_lock_nested(child->siglock); > > at the start of "if (likely(p->pid))" block. Good point. We want to exclude races between new signals comming in and gathering the information from the old queued signals. I will take a look. >> --- a/kernel/signal.c >> +++ b/kernel/signal.c >> @@ -1123,6 +1123,15 @@ static int __send_signal(int sig, struct siginfo >> *info, struct task_struct *t, >> out_set: >> signalfd_notify(t, sig); >> sigaddset(>signal, sig); >> + >> +/* Let multiprocess signals appear after on-going forks */ >> +if (type > PIDTYPE_TGID) { >> +struct multiprocess_signals *delayed; >> +hlist_for_each_entry(delayed, >signal->multiprocess, node) { >> +sigaddset(>signal, sig); > > This is not enough, I think... > > Suppose you send SIGSTOP and then SIGCONT to some process group. The 1st > SIGSTOP > will be queued correctly, but the 2nd SIGCONT won't flush the pending > SIGSTOP, you > need to modify prepare_signal() too. Good point. We can't have both SIGCONT and a stop signal (SIGSTOP or SIGTSTP) enqueued at the same time. And there is something in the prepare_signal code about parent notifications as well. I will look up the fine points of SIGCONT and SIGSTOP interaction and see what we should be doing here. Sigh. I thought this was going to be as simple as the sequence counter but this looks a tad more complicated. Are the earlier patches looking ok to you? Eric
Re: [PATCH v3 20/20] signal: Don't restart fork when signals come in.
Oleg Nesterov writes: > On 07/24, Eric W. Biederman wrote: >> >> @@ -1979,6 +1983,8 @@ static __latent_entropy struct task_struct >> *copy_process( >> attach_pid(p, PIDTYPE_TGID); >> attach_pid(p, PIDTYPE_PGID); >> attach_pid(p, PIDTYPE_SID); >> +p->signal->shared_pending.signal = delayed.signal; > > Again, in this case we do not hold p->sighand->siglock (unless CLONE_SIGHAND), > so this should be done before list_add_tail/attach_pid above. Plus we need > some > sort of barrier. > > Or we can do > > if (!CLONE_SIGHAND) > spin_lock_nested(child->siglock); > > at the start of "if (likely(p->pid))" block. Good point. We want to exclude races between new signals comming in and gathering the information from the old queued signals. I will take a look. >> --- a/kernel/signal.c >> +++ b/kernel/signal.c >> @@ -1123,6 +1123,15 @@ static int __send_signal(int sig, struct siginfo >> *info, struct task_struct *t, >> out_set: >> signalfd_notify(t, sig); >> sigaddset(>signal, sig); >> + >> +/* Let multiprocess signals appear after on-going forks */ >> +if (type > PIDTYPE_TGID) { >> +struct multiprocess_signals *delayed; >> +hlist_for_each_entry(delayed, >signal->multiprocess, node) { >> +sigaddset(>signal, sig); > > This is not enough, I think... > > Suppose you send SIGSTOP and then SIGCONT to some process group. The 1st > SIGSTOP > will be queued correctly, but the 2nd SIGCONT won't flush the pending > SIGSTOP, you > need to modify prepare_signal() too. Good point. We can't have both SIGCONT and a stop signal (SIGSTOP or SIGTSTP) enqueued at the same time. And there is something in the prepare_signal code about parent notifications as well. I will look up the fine points of SIGCONT and SIGSTOP interaction and see what we should be doing here. Sigh. I thought this was going to be as simple as the sequence counter but this looks a tad more complicated. Are the earlier patches looking ok to you? Eric
Re: [PATCH v3 20/20] signal: Don't restart fork when signals come in.
On 07/24, Eric W. Biederman wrote: > > @@ -1979,6 +1983,8 @@ static __latent_entropy struct task_struct > *copy_process( > attach_pid(p, PIDTYPE_TGID); > attach_pid(p, PIDTYPE_PGID); > attach_pid(p, PIDTYPE_SID); > + p->signal->shared_pending.signal = delayed.signal; Again, in this case we do not hold p->sighand->siglock (unless CLONE_SIGHAND), so this should be done before list_add_tail/attach_pid above. Plus we need some sort of barrier. Or we can do if (!CLONE_SIGHAND) spin_lock_nested(child->siglock); at the start of "if (likely(p->pid))" block. > --- a/kernel/signal.c > +++ b/kernel/signal.c > @@ -1123,6 +1123,15 @@ static int __send_signal(int sig, struct siginfo > *info, struct task_struct *t, > out_set: > signalfd_notify(t, sig); > sigaddset(>signal, sig); > + > + /* Let multiprocess signals appear after on-going forks */ > + if (type > PIDTYPE_TGID) { > + struct multiprocess_signals *delayed; > + hlist_for_each_entry(delayed, >signal->multiprocess, node) { > + sigaddset(>signal, sig); This is not enough, I think... Suppose you send SIGSTOP and then SIGCONT to some process group. The 1st SIGSTOP will be queued correctly, but the 2nd SIGCONT won't flush the pending SIGSTOP, you need to modify prepare_signal() too. Oleg.
Re: [PATCH v3 20/20] signal: Don't restart fork when signals come in.
On 07/24, Eric W. Biederman wrote: > > @@ -1979,6 +1983,8 @@ static __latent_entropy struct task_struct > *copy_process( > attach_pid(p, PIDTYPE_TGID); > attach_pid(p, PIDTYPE_PGID); > attach_pid(p, PIDTYPE_SID); > + p->signal->shared_pending.signal = delayed.signal; Again, in this case we do not hold p->sighand->siglock (unless CLONE_SIGHAND), so this should be done before list_add_tail/attach_pid above. Plus we need some sort of barrier. Or we can do if (!CLONE_SIGHAND) spin_lock_nested(child->siglock); at the start of "if (likely(p->pid))" block. > --- a/kernel/signal.c > +++ b/kernel/signal.c > @@ -1123,6 +1123,15 @@ static int __send_signal(int sig, struct siginfo > *info, struct task_struct *t, > out_set: > signalfd_notify(t, sig); > sigaddset(>signal, sig); > + > + /* Let multiprocess signals appear after on-going forks */ > + if (type > PIDTYPE_TGID) { > + struct multiprocess_signals *delayed; > + hlist_for_each_entry(delayed, >signal->multiprocess, node) { > + sigaddset(>signal, sig); This is not enough, I think... Suppose you send SIGSTOP and then SIGCONT to some process group. The 1st SIGSTOP will be queued correctly, but the 2nd SIGCONT won't flush the pending SIGSTOP, you need to modify prepare_signal() too. Oleg.