On Mon, Nov 2, 2015 at 4:13 PM, Oleg Nesterov <[email protected]> wrote:
> Hi Dmitry,
>
> On 11/02, Dmitry Vyukov wrote:
>>
>> WARNING: CPU: 1 PID: 1 at kernel/signal.c:334 task_participate_group_stop+0x157/0x1d0()
>> Modules linked in:
>> CPU: 1 PID: 1 Comm: init Not tainted 4.3.0 #48
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>>  ffffffff82e40280 ffff88003eb0fae0 ffffffff819efe55 0000000000000000
>>  ffff88003eb0fb20 ffffffff810ec871 ffffffff8110f4d7 ffff88003eb00000
>>  ffff88003eb20000 0000000000000000 ffff88003eb0fbf8 ffff88003eb20000
>> Call Trace:
>>  [<ffffffff810eca35>] warn_slowpath_null+0x15/0x20 kernel/panic.c:480
>>  [<ffffffff8110f4d7>] task_participate_group_stop+0x157/0x1d0 kernel/signal.c:334
>>  [<ffffffff81113587>] do_signal_stop+0x1e7/0x6e0 kernel/signal.c:2060
>>  [<ffffffff81116ab7>] get_signal+0x387/0x11b0 kernel/signal.c:2316
>>  [<ffffffff8100cf0d>] do_signal+0x8d/0x19e0 arch/x86/kernel/signal.c:707
>>  [<ffffffff81005d8d>] prepare_exit_to_usermode+0x11d/0x170 arch/x86/entry/common.c:251
>>  [<ffffffff81005e83>] syscall_return_slowpath+0xa3/0x2b0 arch/x86/entry/common.c:317
>>  [<ffffffff82d4f6a7>] int_ret_from_sys_call+0x25/0x8f arch/x86/entry/entry_64.S:281
>> ---[ end trace f6697fd630b7c361 ]---
>>
>>
>> The reproducer is (needs to be run as root):
>>
>> // autogenerated by syzkaller (http://github.com/google/syzkaller)
>> #include <sys/ptrace.h>
>> #include <unistd.h>
>>
>> int main()
>> {
>>   int pid = 1;
>>   ptrace(PTRACE_ATTACH, pid, 0, 0);
>>   ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_EXITKILL);
>>   sleep(1);
>>   return 0;
>> }
>
> Thanks.
>
> Can't reproduce, but at first glance the problem looks clear...

Humm... did you run as root?
It reproduces all the time on my 4.3 kernel VM. Also firmly killed my
desktop running 3.13.

>> Yes, it is weird and it kills init right afterwards.
>
> Could you confirm that this WARN_ON() happens _after_ the reproducer exits?
>
>> But I wasn't able
>> to figure out what's the root cause (why task does not have
>> JOBCTL_STOP_PENDING) and maybe the same WARNING can be triggered
>> without root and/or with other than init process. So still posting it
>> here.
>
> Yes I think you are right. SIGSTOP can race with SIGKILL which (unlike
> SIGCONT) doesn't clear JOBCTL_STOP_DEQUEUED/PENDING/etc.
>
> This is mostly fine, the task won't block in TASK_STOPPED if SIGKILL is
> pending, but still is not right and leads to the warning above:
> JOBCTL_STOP_PENDING was not set because
> do_signal_stop()->task_set_jobctl_pending() checks fatal_signal_pending().
>
> Probably the patch below should fix the problem, but I'd like to think more
> before I send the fix.

Will test it.

> Oleg.
>
> --- x/kernel/signal.c
> +++ x/kernel/signal.c
> @@ -2002,7 +2002,7 @@ static bool do_signal_stop(int signr)
>                 WARN_ON_ONCE(signr & ~JOBCTL_STOP_SIGMASK);
>
>                 if (!likely(current->jobctl & JOBCTL_STOP_DEQUEUED) ||
> -                   unlikely(signal_group_exit(sig)))
> +                   unlikely(fatal_signal_pending(current)))
>                         return false;
>                 /*
>                  * There is no group stop already in progress. We must
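
For readers following the race Oleg describes, here is a minimal userspace sketch of the flag logic. It is a toy model, not kernel code: the struct, the MODEL_* flag values and both function names are hypothetical, and it assumes the failing case is "SIGKILL pending while signal_group_exit() is still false", which is my reading of the proposed one-line change rather than something confirmed in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model only; not the kernel's definitions. */
#define MODEL_STOP_PENDING  (1u << 0)
#define MODEL_STOP_DEQUEUED (1u << 1)

/* Hypothetical, heavily reduced stand-in for the task state under discussion. */
struct model_task {
	unsigned int jobctl;
	bool sigkill_pending; /* stands in for fatal_signal_pending(current) */
	bool group_exit;      /* stands in for signal_group_exit(sig) */
};

/*
 * Models the behaviour described in the thread: the pending bits are not
 * set when a fatal signal is already pending.
 */
static bool model_set_jobctl_pending(struct model_task *t, unsigned int mask)
{
	assert(t != NULL);
	if (t->sigkill_pending)
		return false;
	t->jobctl |= mask;
	return true;
}

/*
 * Models the start of a group stop. Returns true if the modelled
 * "STOP_PENDING must be set" warning would fire.
 * patched == false: bail out on group_exit (the old check).
 * patched == true:  bail out on sigkill_pending (the proposed check).
 */
static bool model_stop_would_warn(struct model_task *t, bool patched)
{
	assert(t != NULL);
	if (!(t->jobctl & MODEL_STOP_PENDING)) {
		bool bail = patched ? t->sigkill_pending : t->group_exit;

		if (!(t->jobctl & MODEL_STOP_DEQUEUED) || bail)
			return false; /* no stop attempted, so no warning */
		/* May silently fail when SIGKILL is pending. */
		model_set_jobctl_pending(t, MODEL_STOP_PENDING);
	}
	/* The participate step expects STOP_PENDING to be set by now. */
	return !(t->jobctl & MODEL_STOP_PENDING);
}
```

With a task that has MODEL_STOP_DEQUEUED set, sigkill_pending true and group_exit false, the unpatched path reports the warning and the patched path returns early. A task with no fatal signal pending ends up with MODEL_STOP_PENDING set and no warning in either mode.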

