Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-24 Thread qianli zhao
>> But,my patch has another purpose,protect some key variables(such
>> as:task->mm,task->nsproxy,etc) to recover init coredump from
>> fulldump,if sub-threads finish do_exit(),

> Yes I know.

> But the purpose of this SIGNAL_GROUP_EXIT check is not clear and not
> documented. That is why I said it should be documented at least in the
> changelog.

Ok.
I will update the changelog as you suggest.

Oleg Nesterov  于2021年3月25日周四 上午2:12写道:
>
> Hi,
>
> On 03/23, qianli zhao wrote:
> >
> > Hi,Oleg
> >
> > > You certainly don't understand me :/
> >
> > > Please read my email you quoted below. I didn't mean the current logic.
> > > I meant the logic after your patch which moves atomic_dec_and_test() and
> > > panic() before exit_signals().
> >
> > Sorry, I think I see what you mean now.
> >
> > You mean that after apply my patch,SIGNAL_GROUP_EXIT no longer needs
> > to be tested or avoid zap_pid_ns_processes()->BUG().
> > Yes,your consideration is correct.
>
> OK, great
>
> > But,my patch has another purpose,protect some key variables(such
> > as:task->mm,task->nsproxy,etc) to recover init coredump from
> > fulldump,if sub-threads finish do_exit(),
>
> Yes I know.
>
> But the purpose of this SIGNAL_GROUP_EXIT check is not clear and not
> documented. That is why I said it should be documented at least in the
> changelog.
>
> Oleg.
>


Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-24 Thread Oleg Nesterov
Hi,

On 03/23, qianli zhao wrote:
>
> Hi,Oleg
>
> > You certainly don't understand me :/
>
> > Please read my email you quoted below. I didn't mean the current logic.
> > I meant the logic after your patch which moves atomic_dec_and_test() and
> > panic() before exit_signals().
>
> Sorry, I think I see what you mean now.
>
> You mean that after apply my patch,SIGNAL_GROUP_EXIT no longer needs
> to be tested or avoid zap_pid_ns_processes()->BUG().
> Yes,your consideration is correct.

OK, great

> But,my patch has another purpose,protect some key variables(such
> as:task->mm,task->nsproxy,etc) to recover init coredump from
> fulldump,if sub-threads finish do_exit(),

Yes I know.

But the purpose of this SIGNAL_GROUP_EXIT check is not clear and not
documented. That is why I said it should be documented at least in the
changelog.

Oleg.



Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-23 Thread qianli zhao
Hi,Oleg

> You certainly don't understand me :/

> Please read my email you quoted below. I didn't mean the current logic.
> I meant the logic after your patch which moves atomic_dec_and_test() and
> panic() before exit_signals().

Sorry, I think I see what you mean now.

You mean that after apply my patch,SIGNAL_GROUP_EXIT no longer needs
to be tested or avoid zap_pid_ns_processes()->BUG().
Yes,your consideration is correct.
But,my patch has another purpose,protect some key variables(such
as:task->mm,task->nsproxy,etc) to recover init coredump from
fulldump,if sub-threads finish do_exit(),these variables of sub-task
will be lost,and we cannot parse the coredump of the init process
through the tool normally such as "gcore".


Oleg Nesterov  于2021年3月23日周二 下午5:00写道:
>
> On 03/23, qianli zhao wrote:
> >
> > Hi,Oleg
> >
> > > No, there is at least one alive init thread. If they all have exited, we 
> > > have
> > > the thread which calls panic() above.
> >
> > By current logic, setting PF_EXITING(exit_signals()) is before the
> > panic(),
>
> You certainly don't understand me :/
>
> Please read my email you quoted below. I didn't mean the current logic.
> I meant the logic after your patch which moves atomic_dec_and_test() and
> panic() before exit_signals().
>
> Oleg.
>
> > find_alive_thread() determines the PF_EXITING of all child
> > threads, the panic thread's PF_EXITING has been set before panic(),so
> > find_alive_thread() thinks this thread also dead, resulting in
> > find_alive_thread returning NULL.It is possible to trigger a
> > zap_pid_ns_processes()->BUG() in this case.
> > 
> > exit_signals(tsk);  /* sets PF_EXITING */
> > ...
> > group_dead = atomic_dec_and_test(>signal->live);
> > if (group_dead) {
> > if (unlikely(is_global_init(tsk)))
> > panic("Attempted to kill init!
> > exitcode=0x%08x\n",>//PF_EXITING has been set
> > tsk->signal->group_exit_code ?: (int)code);
> >
> > ===
> >
> > > Why do you think so? It can affect _any_ code which runs under
> > > "if (group_dead)". Again, I don't see anything wrong, but I didn't even
> > > try to audit these code paths.
> >
> > Yes,all places where checked the "signal->live" may be affected,but
> > even before my changes, each program that checks "signal->live" may
> > get different state(group_dead or not), depending on the timing of the
> > caller,this situation will not change after my change.
> > After my patch,"signal->live--" and other variable are set in a
> > different order(such as signal->live and PF_EXITING),this can cause
> > abnormalities in the logic associated with these two variables,that is
> > my thinking.
> > Of course, check all the "signal->live--" path is definitely
> > necessary,it's just the case above that we need more attention.
> >
> > Thanks
> >
> > Oleg Nesterov  于2021年3月23日周二 上午12:37写道:
> > >
> > > Hi,
> > >
> > > It seems that we don't understand each other.
> > >
> > > If we move atomic_dec_and_test(signal->live) and do
> > >
> > > if (group_dead && is_global_init)
> > > panic(...);
> > >
> > >
> > > before setting PF_EXITING like your patch does, then 
> > > zap_pid_ns_processes()
> > > simply won't be called.
> > >
> > > Because:
> > >
> > > On 03/21, qianli zhao wrote:
> > > >
> > > > Hi,Oleg
> > > >
> > > > > How? Perhaps I missed something again, but I don't think this is 
> > > > > possible.
> > > >
> > > > > zap_pid_ns_processes() simply won't be called, find_child_reaper() 
> > > > > will
> > > > > see the !PF_EXITING thread which calls panic().
> > > >
> > > > > So I think this should be documented somehow, at least in the 
> > > > > changelog.
> > > >
> > > > This problem occurs when both two init threads enter the do_exit,
> > > > One of the init thread is syscall sys_exit_group,and set 
> > > > SIGNAL_GROUP_EXIT
> > > > The other init thread perform ret_to_user()->get_signal() and found
> > > > SIGNAL_GROUP_EXIT is set,then do_group_exit()->do_exit(),since there
> > > > are no alive init threads it finally goes to
> > > > zap_pid_ns_processes()
> > >
> > > No, there is at least one alive init thread. If they all have exited, we 
> > > have
> > > the thread which calls panic() above.
> > >
> > > > and BUG().
> > >
> > > so we don't need the SIGNAL_GROUP_EXIT check to avoid this BUG().
> > >
> > > What have I missed?
> > >
> > > Oleg.
> > >
> >
>


Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-23 Thread Oleg Nesterov
On 03/23, qianli zhao wrote:
>
> Hi,Oleg
>
> > No, there is at least one alive init thread. If they all have exited, we 
> > have
> > the thread which calls panic() above.
>
> By current logic, setting PF_EXITING(exit_signals()) is before the
> panic(),

You certainly don't understand me :/

Please read my email you quoted below. I didn't mean the current logic.
I meant the logic after your patch which moves atomic_dec_and_test() and
panic() before exit_signals().

Oleg.

> find_alive_thread() determines the PF_EXITING of all child
> threads, the panic thread's PF_EXITING has been set before panic(),so
> find_alive_thread() thinks this thread also dead, resulting in
> find_alive_thread returning NULL.It is possible to trigger a
> zap_pid_ns_processes()->BUG() in this case.
> 
> exit_signals(tsk);  /* sets PF_EXITING */
> ...
> group_dead = atomic_dec_and_test(>signal->live);
> if (group_dead) {
> if (unlikely(is_global_init(tsk)))
> panic("Attempted to kill init!
> exitcode=0x%08x\n",>//PF_EXITING has been set
> tsk->signal->group_exit_code ?: (int)code);
> 
> ===
> 
> > Why do you think so? It can affect _any_ code which runs under
> > "if (group_dead)". Again, I don't see anything wrong, but I didn't even
> > try to audit these code paths.
> 
> Yes,all places where checked the "signal->live" may be affected,but
> even before my changes, each program that checks "signal->live" may
> get different state(group_dead or not), depending on the timing of the
> caller,this situation will not change after my change.
> After my patch,"signal->live--" and other variable are set in a
> different order(such as signal->live and PF_EXITING),this can cause
> abnormalities in the logic associated with these two variables,that is
> my thinking.
> Of course, check all the "signal->live--" path is definitely
> necessary,it's just the case above that we need more attention.
> 
> Thanks
> 
> Oleg Nesterov  于2021年3月23日周二 上午12:37写道:
> >
> > Hi,
> >
> > It seems that we don't understand each other.
> >
> > If we move atomic_dec_and_test(signal->live) and do
> >
> > if (group_dead && is_global_init)
> > panic(...);
> >
> >
> > before setting PF_EXITING like your patch does, then zap_pid_ns_processes()
> > simply won't be called.
> >
> > Because:
> >
> > On 03/21, qianli zhao wrote:
> > >
> > > Hi,Oleg
> > >
> > > > How? Perhaps I missed something again, but I don't think this is 
> > > > possible.
> > >
> > > > zap_pid_ns_processes() simply won't be called, find_child_reaper() will
> > > > see the !PF_EXITING thread which calls panic().
> > >
> > > > So I think this should be documented somehow, at least in the changelog.
> > >
> > > This problem occurs when both two init threads enter the do_exit,
> > > One of the init thread is syscall sys_exit_group,and set SIGNAL_GROUP_EXIT
> > > The other init thread perform ret_to_user()->get_signal() and found
> > > SIGNAL_GROUP_EXIT is set,then do_group_exit()->do_exit(),since there
> > > are no alive init threads it finally goes to
> > > zap_pid_ns_processes()
> >
> > No, there is at least one alive init thread. If they all have exited, we 
> > have
> > the thread which calls panic() above.
> >
> > > and BUG().
> >
> > so we don't need the SIGNAL_GROUP_EXIT check to avoid this BUG().
> >
> > What have I missed?
> >
> > Oleg.
> >
> 



Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-22 Thread qianli zhao
Hi,Oleg

> No, there is at least one alive init thread. If they all have exited, we have
> the thread which calls panic() above.

By current logic, setting PF_EXITING(exit_signals()) is before the
panic(),find_alive_thread() determines the PF_EXITING of all child
threads, the panic thread's PF_EXITING has been set before panic(),so
find_alive_thread() thinks this thread also dead, resulting in
find_alive_thread returning NULL.It is possible to trigger a
zap_pid_ns_processes()->BUG() in this case.

exit_signals(tsk);  /* sets PF_EXITING */
...
group_dead = atomic_dec_and_test(>signal->live);
if (group_dead) {
if (unlikely(is_global_init(tsk)))
panic("Attempted to kill init!
exitcode=0x%08x\n",>//PF_EXITING has been set
tsk->signal->group_exit_code ?: (int)code);

===

> Why do you think so? It can affect _any_ code which runs under
> "if (group_dead)". Again, I don't see anything wrong, but I didn't even
> try to audit these code paths.

Yes,all places where checked the "signal->live" may be affected,but
even before my changes, each program that checks "signal->live" may
get different state(group_dead or not), depending on the timing of the
caller,this situation will not change after my change.
After my patch,"signal->live--" and other variable are set in a
different order(such as signal->live and PF_EXITING),this can cause
abnormalities in the logic associated with these two variables,that is
my thinking.
Of course, check all the "signal->live--" path is definitely
necessary,it's just the case above that we need more attention.

Thanks

Oleg Nesterov  于2021年3月23日周二 上午12:37写道:
>
> Hi,
>
> It seems that we don't understand each other.
>
> If we move atomic_dec_and_test(signal->live) and do
>
> if (group_dead && is_global_init)
> panic(...);
>
>
> before setting PF_EXITING like your patch does, then zap_pid_ns_processes()
> simply won't be called.
>
> Because:
>
> On 03/21, qianli zhao wrote:
> >
> > Hi,Oleg
> >
> > > How? Perhaps I missed something again, but I don't think this is possible.
> >
> > > zap_pid_ns_processes() simply won't be called, find_child_reaper() will
> > > see the !PF_EXITING thread which calls panic().
> >
> > > So I think this should be documented somehow, at least in the changelog.
> >
> > This problem occurs when both two init threads enter the do_exit,
> > One of the init thread is syscall sys_exit_group,and set SIGNAL_GROUP_EXIT
> > The other init thread perform ret_to_user()->get_signal() and found
> > SIGNAL_GROUP_EXIT is set,then do_group_exit()->do_exit(),since there
> > are no alive init threads it finally goes to
> > zap_pid_ns_processes()
>
> No, there is at least one alive init thread. If they all have exited, we have
> the thread which calls panic() above.
>
> > and BUG().
>
> so we don't need the SIGNAL_GROUP_EXIT check to avoid this BUG().
>
> What have I missed?
>
> Oleg.
>


Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-22 Thread Oleg Nesterov
On 03/22, Oleg Nesterov wrote:
>
> On 03/22, qianli zhao wrote:
> >
> > Moving the decrement position should only affect between new and old
> > code position of movement of the decrement of
> > signal->live.
>
> Why do you think so? It can affect _any_ code which runs under
> "if (group_dead)". Again, I don't see anything wrong, but I didn't even
> try to audit these code paths.

forgot to mention...

and any code outside of do_exit() which checks signal->live.

Oleg.



Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-22 Thread Oleg Nesterov
On 03/22, qianli zhao wrote:
>
> Moving the decrement position should only affect between new and old
> code position of movement of the decrement of
> signal->live.

Why do you think so? It can affect _any_ code which runs under
"if (group_dead)". Again, I don't see anything wrong, but I didn't even
try to audit these code paths.

Oleg.



Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-22 Thread Oleg Nesterov
Hi,

It seems that we don't understand each other.

If we move atomic_dec_and_test(signal->live) and do

if (group_dead && is_global_init)
panic(...);


before setting PF_EXITING like your patch does, then zap_pid_ns_processes()
simply won't be called.

Because:

On 03/21, qianli zhao wrote:
>
> Hi,Oleg
>
> > How? Perhaps I missed something again, but I don't think this is possible.
>
> > zap_pid_ns_processes() simply won't be called, find_child_reaper() will
> > see the !PF_EXITING thread which calls panic().
>
> > So I think this should be documented somehow, at least in the changelog.
>
> This problem occurs when both two init threads enter the do_exit,
> One of the init thread is syscall sys_exit_group,and set SIGNAL_GROUP_EXIT
> The other init thread perform ret_to_user()->get_signal() and found
> SIGNAL_GROUP_EXIT is set,then do_group_exit()->do_exit(),since there
> are no alive init threads it finally goes to
> zap_pid_ns_processes()

No, there is at least one alive init thread. If they all have exited, we have
the thread which calls panic() above.

> and BUG().

so we don't need the SIGNAL_GROUP_EXIT check to avoid this BUG().

What have I missed?

Oleg.



Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-21 Thread qianli zhao
Hi,Eric,Oleg

> It is a bit tricky to tell if the movement of the decrement of
> signal->live is safe

It is hard to say whether it is safe or not,below are just a few of my thoughts

> That affects current_is_single threaded
> which is used by unshare, setns of the time namespace, and setting
> the selinux part of creds.

I think is ok in current_is_single_threaded,in check_unshare_flags()
and selinux_setprocattr()
change "signal->live--" position won't change the result,There is no
other dependency change to this patch and these function.

> The usage in kernel/cgroup/cgroup.c:css_task_iter_advance seems safe.
> Hmm, Maybe not.  Today cgroup_thread_change_begin is held around
> setting PF_EXITING before signal->live is decremented.  So there seem to
> be some subtle cgroup dependencies.

Moving the decrement position should only affect between new and old
code position of movement of the decrement of
signal->live.In this range,i think
acct_update_integrals(),sync_mm_rss() mainly updated some data,only
exit_signals() and sched_exit() need
attention.
cgroup_threadgroup_change_begin() is called in exit_signals(),and
css_task_iter_advance used "signal->live",it seems like it might be a
little related.
cgroup_threadgroup_change_begin() just give stable threadgroup for
user,and css_task_iter_advance only check if group is dead, decrement
of
signal->live and sets PF_EXITING seems like safe.

> I think if we are going to move the decrement of signal->live that
> should be it's own patch and be accompanied with a good description of
> why it is safe instead of having the decrement of signal->live be there
> as a side effect of another change.

I'm not sure how to describe whether this move is safe or not,from my
analysis, no side effects have been found.
Would you like tell me how to prove that or give me some suggestion?



Thanks

Eric W. Biederman  于2021年3月19日周五 上午3:09写道:
>
> Oleg Nesterov  writes:
>
> > On 03/18, qianli zhao wrote:
> >>
> >> Hi,Oleg
> >>
> >> Thank you for your reply.
> >>
> >> >> When init sub-threads running on different CPUs exit at the same time,
> >> >> zap_pid_ns_processe()->BUG() may be happened.
> >>
> >> > and why do you think your patch can't prevent this?
> >>
> >> > Sorry, I must have missed something. But it seems to me that you are 
> >> > trying
> >> > to fix the wrong problem. Yes, zap_pid_ns_processes() must not be called 
> >> > in
> >> > the root namespace, and this has nothing to do with CONFIG_PID_NS.
> >>
> >> Yes, i try to fix this exception by test SIGNAL_GROUP_EXIT and call
> >> panic before setting PF_EXITING to prevent zap_pid_ns_processes()
> >> being called when init do_exit().
> >
> > Ah, I didn't notice your patch does atomic_dec_and_test(signal->live)
> > before exit_signals() which sets PF_EXITING. Thanks for correcting me.
> >
> > So yes, I was wrong, your patch can prevent this. Although I'd like to
> > recheck if every do-something-if-group-dead action is correct in the
> > case we have a non-PF_EXITING thread...
> >
> > But then I don't understand the SIGNAL_GROUP_EXIT check added by your
> > patch. Do we really need it if we want to avoid zap_pid_ns_processes()
> > when the global init exits?
> >
> >> In addition, the patch also protects the init process state to
> >> successfully get usable init coredump.
> >
> > Could you spell please?
> >
> > Does this connect to SIGNAL_GROUP_EXIT check? Do you mean that you want
> > to panic earlier, before other init's sub-threads exit?
>
> That is my understanding.
>
> As I understand it this patch has two purposes:
> 1. Avoid the BUG_ON in zap_pid_ns_processes when !CONFIG_PID_NS
> 2. panic as early as possible so exiting threads don't removing
>interesting debugging state.
>
>
> It is a bit tricky to tell if the movement of the decrement of
> signal->live is safe.  That affects current_is_single threaded
> which is used by unshare, setns of the time namespace, and setting
> the selinux part of creds.
>
> The usage in kernel/cgroup/cgroup.c:css_task_iter_advance seems safe.
> Hmm, Maybe not.  Today cgroup_thread_change_begin is held around
> setting PF_EXITING before signal->live is decremented.  So there seem to
> be some subtle cgroup dependencies.
>
> The usages of group_dead in do_exit seem safe, as except for the new
> one everything is the same.
>
> We could definitely take advantage of knowing group_dead in exit_signals
> to simplify it's optimization to not rerouting signals to living
> threads.
>
>
> I think if we are going to move the decrement of signal->live that
> should be it's own patch and be accompanied with a good description of
> why it is safe instead of having the decrement of signal->live be there
> as a side effect of another change.
>
> Eric


Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-21 Thread qianli zhao
Hi,Oleg

> How? Perhaps I missed something again, but I don't think this is possible.

> zap_pid_ns_processes() simply won't be called, find_child_reaper() will
> see the !PF_EXITING thread which calls panic().

> So I think this should be documented somehow, at least in the changelog.

This problem occurs when both two init threads enter the do_exit,
One of the init thread is syscall sys_exit_group,and set SIGNAL_GROUP_EXIT
The other init thread perform ret_to_user()->get_signal() and found
SIGNAL_GROUP_EXIT is set,then do_group_exit()->do_exit(),since there
are no alive init threads it finally goes to
zap_pid_ns_processes() and BUG().

Please refer to the sequence diagram below(that has been written into
the changelog),and callstack is also below:
kernel log:
[   24.705376] Kernel panic - not syncing: Attempted to kill init!
exitcode=0x7f00
[   24.705382] CPU: 4 PID: 552 Comm: init Tainted: G S O
4.14.180-perf-g4483caa8ae80-dirty #1
[   24.705390] kernel BUG at include/linux/pid_namespace.h:98!

PID: 1  TASK: ffc973126900  CPU: 7   COMMAND: "init"
 #0 [ff800805ba60] perf_trace_kernel_panic_late at ff99ac0bcbcc
 #1 [ff800805bac0] die at ff99ac08dc64
 #2 [ff800805bb10] bug_handler at ff99ac08e398
 #3 [ff800805bbc0] brk_handler at ff99ac08529c
 #4 [ff800805bc80] do_debug_exception at ff99ac0814e4
->exception

/home/work/courbet-r-stable-build/kernel/msm-4.14/include/linux/pid_namespace.h:
98
96static inline void zap_pid_ns_processes(struct pid_namespace *ns)
97{
98 BUG();
99}
 #5 [ff800805bdf0] el1_dbg at ff99ac083298
 #6 [ff800805be20] do_exit at ff99ac0c22e8
 #7 [ff800805be80] do_group_exit at ff99ac0c2658
 #8 [ff800805beb0] sys_exit_group at ff99ac0c266c
 #9 [ff800805bff0] el0_svc_naked at ff99ac083cfc

PID: 552TASK: ffc9613c8f00  CPU: 4   COMMAND: "init"
 #0 [ff801455b870] __delay at ff99ad32cc14
 #1 [ff801455b8b0] __const_udelay at ff99ad32cd10
 #2 [ff801455b8c0] msm_trigger_wdog_bite at ff99ac5d5be0
 #3 [ff801455b980] do_msm_restart at ff99a3f8
 #4 [ff801455b9b0] machine_restart at ff99ac085dd0
 #5 [ff801455b9d0] emergency_restart at ff99ac0eb6dc
 #6 [ff801455baf0] panic at ff99ac0bd008
   > /home/work/courbet-r-stable-build/kernel/msm-4.14/kernel/exit.c:
842
836 if (group_dead) {
837 /*
838 * If the last thread of global init has exited, panic
839 * immediately to get a useable coredump.
840 */
841 if (unlikely(is_global_init(tsk)))
842 panic("Attempted to kill init! exitcode=0x%08x\n",
843 tsk->signal->group_exit_code ?: (int)code);

 #7 [ff801455bb70] do_exit at ff99ac0c257c
#8 [ff801455bbd0] do_group_exit at ff99ac0c2644
#9 [ff801455bcc0] get_signal at ff99ac0d1384
#10 [ff801455be60] do_notify_resume at ff99ac08b2a8
#11 [ff801455bff0] work_pending at ff99ac083b8c

core4  core7
... sys_exit_group()
do_group_exit()
  - sig->flags = SIGNAL_GROUP_EXIT
  - zap_other_threads()
   do_exit() //PF_EXITING is set
ret_to_user()
do_notify_resume()
get_signal()
- signal_group_exit
- goto fatal;
do_group_exit()
do_exit() //PF_EXITING is set
- panic("Attempted to kill init! exitcode=0x%08x\n")
exit_notify()
find_alive_thread() //no alive
sub-threads
   zap_pid_ns_processes()
//CONFIG_PID_NS is not set
   BUG()

Oleg Nesterov  于2021年3月20日周六 上午12:32写道:
>
> On 03/19, qianli zhao wrote:
> >
> > > But then I don't understand the SIGNAL_GROUP_EXIT check added by your
> > > patch. Do we really need it if we want to avoid zap_pid_ns_processes()
> > > when the global init exits?
> >
> > I think check SIGNAL_GROUP_EXIT is necessary,or panic() will happen
> > after all init sub-threads do_exit(),so the following two situations
> > will happen:
> > 1.According to the timing in the changelog,
> > zap_pid_ns_processes()->BUG() maybe happened.
>
> How? Perhaps I missed something again, but I don't think this is possible.
>
> zap_pid_ns_processes() simply won't be called, find_child_reaper() will
> see the !PF_EXITING thread which calls panic().
>
> So I think this should be documented somehow, at least in the changelog.
>
> Oleg.
>


Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-19 Thread Oleg Nesterov
On 03/19, qianli zhao wrote:
>
> I will think about the risks of movement of the decrement of
> signal->live before exit_signal().
> If is difficult to judge movement of the decrement of signal->live is
> safe,how about only test 'signal->live==1' not use group_dead?
>
> Such as:
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 04029e3..87f3595 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -767,6 +767,17 @@ void __noreturn do_exit(long code)
> validate_creds_for_do_exit(tsk);
>
> /*
> +* If global init has exited,
> +* panic immediately to get a useable coredump.
> +*/
> +   if (unlikely(is_global_init(tsk) &&
> +   ((atomic_read(>signal->live) == 1) ||/*current is
> last init thread*/

Just suppose signal->live == 2 and both init's sub-threads exit at the
same time. They both can see signal->live == 2, panic() won't be called.

Oleg.



Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-19 Thread Oleg Nesterov
On 03/19, qianli zhao wrote:
>
> > But then I don't understand the SIGNAL_GROUP_EXIT check added by your
> > patch. Do we really need it if we want to avoid zap_pid_ns_processes()
> > when the global init exits?
>
> I think check SIGNAL_GROUP_EXIT is necessary,or panic() will happen
> after all init sub-threads do_exit(),so the following two situations
> will happen:
> 1.According to the timing in the changelog,
> zap_pid_ns_processes()->BUG() maybe happened.

How? Perhaps I missed something again, but I don't think this is possible.

zap_pid_ns_processes() simply won't be called, find_child_reaper() will
see the !PF_EXITING thread which calls panic().

So I think this should be documented somehow, at least in the changelog.

Oleg.



Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-19 Thread Oleg Nesterov
On 03/18, Eric W. Biederman wrote:
>
> Oleg Nesterov  writes:
>
> > On 03/18, qianli zhao wrote:
> >>
> >> In addition, the patch also protects the init process state to
> >> successfully get usable init coredump.
> >
> > Could you spell please?
> >
> > Does this connect to SIGNAL_GROUP_EXIT check? Do you mean that you want
> > to panic earlier, before other init's sub-threads exit?
>
> That is my understanding.
>
> As I understand it this patch has two purposes:
> 1. Avoid the BUG_ON in zap_pid_ns_processes when !CONFIG_PID_NS
> 2. panic as early as possible so exiting threads don't removing
>interesting debugging state.

Yes, this was my understanding too, but the changelog didn't look
clear to me.

And I'd say that it is not that we want to avoid BUG_ON() in
zap_pid_ns_processes() when !CONFIG_PID_NS, we want to avoid
zap_pid_ns_processes() in the root namespace, regardless of
CONFIG_PID_NS.

> It is a bit tricky to tell if the movement of the decrement of
> signal->live is safe.

Agreed, this was my concern. I see nothing wrong at first glance,
but I can easily miss something.

Oleg.



Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-19 Thread qianli zhao
Hi,Eric

> As I understand it this patch has two purposes:
> 1. Avoid the BUG_ON in zap_pid_ns_processes when !CONFIG_PID_NS
> 2. panic as early as possible so exiting threads don't removing
>   interesting debugging state.

Your understanding is very correct,this is what my patch wants to do

> I think if we are going to move the decrement of signal->live that
> should be it's own patch and be accompanied with a good description of
> why it is safe instead of having the decrement of signal->live be there
> as a side effect of another change.

I will think about the risks of movement of the decrement of
signal->live before exit_signal().
If is difficult to judge movement of the decrement of signal->live is
safe,how about only test 'signal->live==1' not use group_dead?

Such as:
diff --git a/kernel/exit.c b/kernel/exit.c
index 04029e3..87f3595 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -767,6 +767,17 @@ void __noreturn do_exit(long code)
validate_creds_for_do_exit(tsk);

/*
+* If global init has exited,
+* panic immediately to get a useable coredump.
+*/
+   if (unlikely(is_global_init(tsk) &&
+   ((atomic_read(>signal->live) == 1) ||/*current is
last init thread*/
+(tsk->signal->flags & SIGNAL_GROUP_EXIT {
+   panic("Attempted to kill init! exitcode=0x%08x\n",
+   tsk->signal->group_exit_code ?: (int)code);
+   }
+
+   /*
 * We're taking recursive faults here in do_exit. Safest is to just
 * leave this task alone and wait for reboot.
 */
@@ -784,16 +795,9 @@ void __noreturn do_exit(long code)
if (tsk->mm)
sync_mm_rss(tsk->mm);
acct_update_integrals(tsk);
+
group_dead = atomic_dec_and_test(>signal->live);
if (group_dead) {
-   /*
-* If the last thread of global init has exited, panic
-* immediately to get a useable coredump.
-*/
-   if (unlikely(is_global_init(tsk)))
-   panic("Attempted to kill init! exitcode=0x%08x\n",
-   tsk->signal->group_exit_code ?: (int)code);
-

Eric W. Biederman  于2021年3月19日周五 上午3:09写道:
>
> Oleg Nesterov  writes:
>
> > On 03/18, qianli zhao wrote:
> >>
> >> Hi,Oleg
> >>
> >> Thank you for your reply.
> >>
> >> >> When init sub-threads running on different CPUs exit at the same time,
> >> >> zap_pid_ns_processe()->BUG() may be happened.
> >>
> >> > and why do you think your patch can't prevent this?
> >>
> >> > Sorry, I must have missed something. But it seems to me that you are 
> >> > trying
> >> > to fix the wrong problem. Yes, zap_pid_ns_processes() must not be called 
> >> > in
> >> > the root namespace, and this has nothing to do with CONFIG_PID_NS.
> >>
> >> Yes, i try to fix this exception by test SIGNAL_GROUP_EXIT and call
> >> panic before setting PF_EXITING to prevent zap_pid_ns_processes()
> >> being called when init do_exit().
> >
> > Ah, I didn't notice your patch does atomic_dec_and_test(signal->live)
> > before exit_signals() which sets PF_EXITING. Thanks for correcting me.
> >
> > So yes, I was wrong, your patch can prevent this. Although I'd like to
> > recheck if every do-something-if-group-dead action is correct in the
> > case we have a non-PF_EXITING thread...
> >
> > But then I don't understand the SIGNAL_GROUP_EXIT check added by your
> > patch. Do we really need it if we want to avoid zap_pid_ns_processes()
> > when the global init exits?
> >
> >> In addition, the patch also protects the init process state to
> >> successfully get usable init coredump.
> >
> > Could you spell please?
> >
> > Does this connect to SIGNAL_GROUP_EXIT check? Do you mean that you want
> > to panic earlier, before other init's sub-threads exit?
>
> That is my understanding.
>
> As I understand it this patch has two purposes:
> 1. Avoid the BUG_ON in zap_pid_ns_processes when !CONFIG_PID_NS
> 2. panic as early as possible so exiting threads don't removing
>interesting debugging state.
>
>
> It is a bit tricky to tell if the movement of the decrement of
> signal->live is safe.  That affects current_is_single threaded
> which is used by unshare, setns of the time namespace, and setting
> the selinux part of creds.
>
> The usage in kernel/cgroup/cgroup.c:css_task_iter_advance seems safe.
> Hmm, Maybe not.  Today cgroup_thread_change_begin is held around
> setting PF_EXITING before signal->live is decremented.  So there seem to
> be some subtle cgroup dependencies.
>
> The usages of group_dead in do_exit seem safe, as except for the new
> one everything is the same.
>
> We could definitely take advantage of knowing group_dead in exit_signals
> to simplify it's optimization to not rerouting signals to living
> threads.
>
>
> I think if we are going to move the decrement of signal->live that
> should be it's own patch and be accompanied with a good description 

Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-18 Thread qianli zhao
Hi,Oleg

> But then I don't understand the SIGNAL_GROUP_EXIT check added by your
> patch. Do we really need it if we want to avoid zap_pid_ns_processes()
> when the global init exits?

I think check SIGNAL_GROUP_EXIT is necessary,or panic() will happen
after all init sub-threads do_exit(),so the following two situations
will happen:
1.According to the timing in the changelog,
zap_pid_ns_processes()->BUG() maybe happened.
2.The key variables of each init sub-threads will be in the exit
state(such task->mm=NULL,task->flags=PF_EXITING,task->nsproxy=NULL),resulting
in the failure to parse coredump from fulldump.

So i think check SIGNAL_GROUP_EXIT is a simple and effective way to
prevent these

> Does this connect to SIGNAL_GROUP_EXIT check? Do you mean that you want
> to panic earlier, before other init's sub-threads exit?

Yes, my patch just want panic earlier before other init's sub-threads exit

Oleg Nesterov  于2021年3月19日周五 上午2:05写道:
>
> On 03/18, qianli zhao wrote:
> >
> > Hi,Oleg
> >
> > Thank you for your reply.
> >
> > >> When init sub-threads running on different CPUs exit at the same time,
> > >> zap_pid_ns_processe()->BUG() may be happened.
> >
> > > and why do you think your patch can't prevent this?
> >
> > > Sorry, I must have missed something. But it seems to me that you are 
> > > trying
> > > to fix the wrong problem. Yes, zap_pid_ns_processes() must not be called 
> > > in
> > > the root namespace, and this has nothing to do with CONFIG_PID_NS.
> >
> > Yes, i try to fix this exception by test SIGNAL_GROUP_EXIT and call
> > panic before setting PF_EXITING to prevent zap_pid_ns_processes()
> > being called when init do_exit().
>
> Ah, I didn't notice your patch does atomic_dec_and_test(signal->live)
> before exit_signals() which sets PF_EXITING. Thanks for correcting me.
>
> So yes, I was wrong, your patch can prevent this. Although I'd like to
> recheck if every do-something-if-group-dead action is correct in the
> case we have a non-PF_EXITING thread...
>
> But then I don't understand the SIGNAL_GROUP_EXIT check added by your
> patch. Do we really need it if we want to avoid zap_pid_ns_processes()
> when the global init exits?
>
> > In addition, the patch also protects the init process state to
> > successfully get usable init coredump.
>
> Could you spell please?
>
> Does this connect to SIGNAL_GROUP_EXIT check? Do you mean that you want
> to panic earlier, before other init's sub-threads exit?
>
> Thanks,
>
> Oleg.
>


Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-18 Thread Eric W. Biederman
Oleg Nesterov  writes:

> On 03/18, qianli zhao wrote:
>>
>> Hi,Oleg
>>
>> Thank you for your reply.
>>
>> >> When init sub-threads running on different CPUs exit at the same time,
>> >> zap_pid_ns_processe()->BUG() may be happened.
>>
>> > and why do you think your patch can't prevent this?
>>
>> > Sorry, I must have missed something. But it seems to me that you are trying
>> > to fix the wrong problem. Yes, zap_pid_ns_processes() must not be called in
>> > the root namespace, and this has nothing to do with CONFIG_PID_NS.
>>
>> Yes, i try to fix this exception by test SIGNAL_GROUP_EXIT and call
>> panic before setting PF_EXITING to prevent zap_pid_ns_processes()
>> being called when init do_exit().
>
> Ah, I didn't notice your patch does atomic_dec_and_test(signal->live)
> before exit_signals() which sets PF_EXITING. Thanks for correcting me.
>
> So yes, I was wrong, your patch can prevent this. Although I'd like to
> recheck if every do-something-if-group-dead action is correct in the
> case we have a non-PF_EXITING thread...
>
> But then I don't understand the SIGNAL_GROUP_EXIT check added by your
> patch. Do we really need it if we want to avoid zap_pid_ns_processes()
> when the global init exits?
>
>> In addition, the patch also protects the init process state to
>> successfully get usable init coredump.
>
> Could you spell please?
>
> Does this connect to SIGNAL_GROUP_EXIT check? Do you mean that you want
> to panic earlier, before other init's sub-threads exit?

That is my understanding.

As I understand it this patch has two purposes:
1. Avoid the BUG_ON in zap_pid_ns_processes when !CONFIG_PID_NS
2. panic as early as possible so exiting threads don't removing
   interesting debugging state.


It is a bit tricky to tell if the movement of the decrement of
signal->live is safe.  That affects current_is_single threaded
which is used by unshare, setns of the time namespace, and setting
the selinux part of creds.

The usage in kernel/cgroup/cgroup.c:css_task_iter_advance seems safe.
Hmm, Maybe not.  Today cgroup_thread_change_begin is held around
setting PF_EXITING before signal->live is decremented.  So there seem to
be some subtle cgroup dependencies.

The usages of group_dead in do_exit seem safe, as except for the new
one everything is the same.

We could definitely take advantage of knowing group_dead in exit_signals
to simplify it's optimization to not rerouting signals to living
threads.


I think if we are going to move the decrement of signal->live that
should be it's own patch and be accompanied with a good description of
why it is safe instead of having the decrement of signal->live be there
as a side effect of another change.

Eric


Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-18 Thread Oleg Nesterov
On 03/18, qianli zhao wrote:
>
> Hi,Oleg
>
> Thank you for your reply.
>
> >> When init sub-threads running on different CPUs exit at the same time,
> >> zap_pid_ns_processe()->BUG() may be happened.
>
> > and why do you think your patch can't prevent this?
>
> > Sorry, I must have missed something. But it seems to me that you are trying
> > to fix the wrong problem. Yes, zap_pid_ns_processes() must not be called in
> > the root namespace, and this has nothing to do with CONFIG_PID_NS.
>
> Yes, i try to fix this exception by test SIGNAL_GROUP_EXIT and call
> panic before setting PF_EXITING to prevent zap_pid_ns_processes()
> being called when init do_exit().

Ah, I didn't notice your patch does atomic_dec_and_test(signal->live)
before exit_signals() which sets PF_EXITING. Thanks for correcting me.

So yes, I was wrong, your patch can prevent this. Although I'd like to
recheck if every do-something-if-group-dead action is correct in the
case we have a non-PF_EXITING thread...

But then I don't understand the SIGNAL_GROUP_EXIT check added by your
patch. Do we really need it if we want to avoid zap_pid_ns_processes()
when the global init exits?

> In addition, the patch also protects the init process state to
> successfully get usable init coredump.

Could you spell please?

Does this connect to SIGNAL_GROUP_EXIT check? Do you mean that you want
to panic earlier, before other init's sub-threads exit?

Thanks,

Oleg.



Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-17 Thread qianli zhao
Hi,Oleg

Thank you for your reply.

>> When init sub-threads running on different CPUs exit at the same time,
>> zap_pid_ns_processe()->BUG() may be happened.

> and why do you think your patch can't prevent this?

> Sorry, I must have missed something. But it seems to me that you are trying
> to fix the wrong problem. Yes, zap_pid_ns_processes() must not be called in
> the root namespace, and this has nothing to do with CONFIG_PID_NS.

Yes, i try to fix this exception by test SIGNAL_GROUP_EXIT and call
panic before setting PF_EXITING to prevent zap_pid_ns_processes()
being called when init do_exit().
In addition, the patch also protects the init process state to
successfully get usable init coredump.

In my test,this patch works.

Oleg Nesterov  于2021年3月17日周三 下午10:38写道:
>
> On 03/17, Qianli Zhao wrote:
> >
> > From: Qianli Zhao 
> >
> > When init sub-threads running on different CPUs exit at the same time,
> > zap_pid_ns_processe()->BUG() may be happened.
>
> and why do you think your patch can't prevent this?
>
> Sorry, I must have missed something. But it seems to me that you are trying
> to fix the wrong problem. Yes, zap_pid_ns_processes() must not be called in
> the root namespace, and this has nothing to do with CONFIG_PID_NS.
>
> > And every thread status is abnormal after exit(PF_EXITING set,task->mm=NULL 
> > etc),
> > which makes it difficult to parse coredump from fulldump normally.
> > In order to fix the above problem, when any one init has been set to 
> > SIGNAL_GROUP_EXIT,
> > trigger panic immediately, and prevent other init threads from continuing 
> > to exit
> >
> > [   24.705376] Kernel panic - not syncing: Attempted to kill init! 
> > exitcode=0x7f00
> > [   24.705382] CPU: 4 PID: 552 Comm: init Tainted: G S O
> > 4.14.180-perf-g4483caa8ae80-dirty #1
> > [   24.705390] kernel BUG at include/linux/pid_namespace.h:98!
> >
> > PID: 552   CPU: 4   COMMAND: "init"
> > PID: 1 CPU: 7   COMMAND: "init"
> > core4   core7
> > ... sys_exit_group()
> > do_group_exit()
> >- sig->flags = SIGNAL_GROUP_EXIT
> >- zap_other_threads()
> > do_exit() //PF_EXITING is set
> > ret_to_user()
> > do_notify_resume()
> > get_signal()
> > - signal_group_exit
> > - goto fatal;
> > do_group_exit()
> > do_exit() //PF_EXITING is set
> > - panic("Attempted to kill init! exitcode=0x%08x\n")
> > exit_notify()
> > find_alive_thread() //no alive sub-threads
> > zap_pid_ns_processes()//CONFIG_PID_NS is 
> > not set
> > BUG()
> >
> > Signed-off-by: Qianli Zhao 
> > ---
> > V3:
> > - Use group_dead instead of thread_group_empty() to test single init exit.
> >
> > V2:
> > - Changelog update
> > - Remove wrong useage of SIGNAL_UNKILLABLE.
> > - Add thread_group_empty() test to handle single init thread exit
> > ---
> >  kernel/exit.c | 20 +++-
> >  1 file changed, 11 insertions(+), 9 deletions(-)
> >
> > diff --git a/kernel/exit.c b/kernel/exit.c
> > index 04029e3..32b74e4 100644
> > --- a/kernel/exit.c
> > +++ b/kernel/exit.c
> > @@ -766,6 +766,17 @@ void __noreturn do_exit(long code)
> >
> >   validate_creds_for_do_exit(tsk);
> >
> > + group_dead = atomic_dec_and_test(>signal->live);
> > + /*
> > +  * If global init has exited,
> > +  * panic immediately to get a useable coredump.
> > +  */
> > + if (unlikely(is_global_init(tsk) &&
> > + (group_dead || (tsk->signal->flags & SIGNAL_GROUP_EXIT {
> > + panic("Attempted to kill init! exitcode=0x%08x\n",
> > + tsk->signal->group_exit_code ?: (int)code);
> > + }
> > +
> >   /*
> >* We're taking recursive faults here in do_exit. Safest is to just
> >* leave this task alone and wait for reboot.
> > @@ -784,16 +795,7 @@ void __noreturn do_exit(long code)
> >   if (tsk->mm)
> >   sync_mm_rss(tsk->mm);
> >   acct_update_integrals(tsk);
> > - group_dead = atomic_dec_and_test(>signal->live);
> >   if (group_dead) {
> > - /*
> > -  * If the last thread of global init has exited, panic
> > -  * immediately to get a useable coredump.
> > -  */
> > - if (unlikely(is_global_init(tsk)))
> > - panic("Attempted to kill init! exitcode=0x%08x\n",
> > - tsk->signal->group_exit_code ?: (int)code);
> > -
> >  #ifdef CONFIG_POSIX_TIMERS
> >   hrtimer_cancel(>signal->real_timer);
> >   exit_itimers(tsk->signal);
> > --
> > 1.9.1
> >
>


Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-17 Thread Oleg Nesterov
On 03/17, Qianli Zhao wrote:
>
> From: Qianli Zhao 
>
> When init sub-threads running on different CPUs exit at the same time,
> zap_pid_ns_processe()->BUG() may be happened.

and why do you think your patch can't prevent this?

Sorry, I must have missed something. But it seems to me that you are trying
to fix the wrong problem. Yes, zap_pid_ns_processes() must not be called in
the root namespace, and this has nothing to do with CONFIG_PID_NS.

> And every thread status is abnormal after exit(PF_EXITING set,task->mm=NULL 
> etc),
> which makes it difficult to parse coredump from fulldump normally.
> In order to fix the above problem, when any one init has been set to 
> SIGNAL_GROUP_EXIT,
> trigger panic immediately, and prevent other init threads from continuing to 
> exit
>
> [   24.705376] Kernel panic - not syncing: Attempted to kill init! 
> exitcode=0x7f00
> [   24.705382] CPU: 4 PID: 552 Comm: init Tainted: G S O
> 4.14.180-perf-g4483caa8ae80-dirty #1
> [   24.705390] kernel BUG at include/linux/pid_namespace.h:98!
>
> PID: 552   CPU: 4   COMMAND: "init"
> PID: 1 CPU: 7   COMMAND: "init"
> core4   core7
> ... sys_exit_group()
> do_group_exit()
>- sig->flags = SIGNAL_GROUP_EXIT
>- zap_other_threads()
> do_exit() //PF_EXITING is set
> ret_to_user()
> do_notify_resume()
> get_signal()
> - signal_group_exit
> - goto fatal;
> do_group_exit()
> do_exit() //PF_EXITING is set
> - panic("Attempted to kill init! exitcode=0x%08x\n")
> exit_notify()
> find_alive_thread() //no alive sub-threads
> zap_pid_ns_processes()//CONFIG_PID_NS is not 
> set
> BUG()
>
> Signed-off-by: Qianli Zhao 
> ---
> V3:
> - Use group_dead instead of thread_group_empty() to test single init exit.
>
> V2:
> - Changelog update
> - Remove wrong useage of SIGNAL_UNKILLABLE.
> - Add thread_group_empty() test to handle single init thread exit
> ---
>  kernel/exit.c | 20 +++-
>  1 file changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 04029e3..32b74e4 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -766,6 +766,17 @@ void __noreturn do_exit(long code)
>
>   validate_creds_for_do_exit(tsk);
>
> + group_dead = atomic_dec_and_test(>signal->live);
> + /*
> +  * If global init has exited,
> +  * panic immediately to get a useable coredump.
> +  */
> + if (unlikely(is_global_init(tsk) &&
> + (group_dead || (tsk->signal->flags & SIGNAL_GROUP_EXIT {
> + panic("Attempted to kill init! exitcode=0x%08x\n",
> + tsk->signal->group_exit_code ?: (int)code);
> + }
> +
>   /*
>* We're taking recursive faults here in do_exit. Safest is to just
>* leave this task alone and wait for reboot.
> @@ -784,16 +795,7 @@ void __noreturn do_exit(long code)
>   if (tsk->mm)
>   sync_mm_rss(tsk->mm);
>   acct_update_integrals(tsk);
> - group_dead = atomic_dec_and_test(>signal->live);
>   if (group_dead) {
> - /*
> -  * If the last thread of global init has exited, panic
> -  * immediately to get a useable coredump.
> -  */
> - if (unlikely(is_global_init(tsk)))
> - panic("Attempted to kill init! exitcode=0x%08x\n",
> - tsk->signal->group_exit_code ?: (int)code);
> -
>  #ifdef CONFIG_POSIX_TIMERS
>   hrtimer_cancel(>signal->real_timer);
>   exit_itimers(tsk->signal);
> --
> 1.9.1
>



[PATCH V3] exit: trigger panic when global init has exited

2021-03-17 Thread Qianli Zhao
From: Qianli Zhao 

When init sub-threads running on different CPUs exit at the same time,
zap_pid_ns_processe()->BUG() may be happened.
And every thread status is abnormal after exit(PF_EXITING set,task->mm=NULL 
etc),
which makes it difficult to parse coredump from fulldump normally.
In order to fix the above problem, when any one init has been set to 
SIGNAL_GROUP_EXIT,
trigger panic immediately, and prevent other init threads from continuing to 
exit

[   24.705376] Kernel panic - not syncing: Attempted to kill init! 
exitcode=0x7f00
[   24.705382] CPU: 4 PID: 552 Comm: init Tainted: G S O
4.14.180-perf-g4483caa8ae80-dirty #1
[   24.705390] kernel BUG at include/linux/pid_namespace.h:98!

PID: 552   CPU: 4   COMMAND: "init"
PID: 1 CPU: 7   COMMAND: "init"
core4   core7
... sys_exit_group()
do_group_exit()
   - sig->flags = SIGNAL_GROUP_EXIT
   - zap_other_threads()
do_exit() //PF_EXITING is set
ret_to_user()
do_notify_resume()
get_signal()
- signal_group_exit
- goto fatal;
do_group_exit()
do_exit() //PF_EXITING is set
- panic("Attempted to kill init! exitcode=0x%08x\n")
exit_notify()
find_alive_thread() //no alive sub-threads
zap_pid_ns_processes()//CONFIG_PID_NS is not set
BUG()

Signed-off-by: Qianli Zhao 
---
V3:
- Use group_dead instead of thread_group_empty() to test single init exit.

V2:
- Changelog update
- Remove wrong useage of SIGNAL_UNKILLABLE. 
- Add thread_group_empty() test to handle single init thread exit
---
 kernel/exit.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 04029e3..32b74e4 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -766,6 +766,17 @@ void __noreturn do_exit(long code)
 
validate_creds_for_do_exit(tsk);
 
+   group_dead = atomic_dec_and_test(>signal->live);
+   /*
+* If global init has exited,
+* panic immediately to get a useable coredump.
+*/
+   if (unlikely(is_global_init(tsk) &&
+   (group_dead || (tsk->signal->flags & SIGNAL_GROUP_EXIT {
+   panic("Attempted to kill init! exitcode=0x%08x\n",
+   tsk->signal->group_exit_code ?: (int)code);
+   }
+
/*
 * We're taking recursive faults here in do_exit. Safest is to just
 * leave this task alone and wait for reboot.
@@ -784,16 +795,7 @@ void __noreturn do_exit(long code)
if (tsk->mm)
sync_mm_rss(tsk->mm);
acct_update_integrals(tsk);
-   group_dead = atomic_dec_and_test(>signal->live);
if (group_dead) {
-   /*
-* If the last thread of global init has exited, panic
-* immediately to get a useable coredump.
-*/
-   if (unlikely(is_global_init(tsk)))
-   panic("Attempted to kill init! exitcode=0x%08x\n",
-   tsk->signal->group_exit_code ?: (int)code);
-
 #ifdef CONFIG_POSIX_TIMERS
hrtimer_cancel(>signal->real_timer);
exit_itimers(tsk->signal);
-- 
1.9.1