On Mon, Jul 22, 2019 at 09:27:08AM -0700, Linus Torvalds wrote: > On Mon, Jul 22, 2019 at 7:26 AM Christian Brauner <[email protected]> > wrote: > > > > This contains a fix for pidfd polling. It ensures that the task's exit > > state is visible to all waiters: > > Hmm. > > I've pulled this, but the exit_state thing has been very fragile > before, and I'm not entirely happy with how this just changes where it > is set. I guess the movement here is all inside the tasklist_lock, so > it's not that big of a deal, but still.. > > I would *really* like Oleg to take a look.
Oh, sorry. Oleg did take a look. See: https://lore.kernel.org/lkml/[email protected]/ https://lore.kernel.org/lkml/[email protected]/ > > Also, and the primary reason I write this email is that this basically > makes the "EXIT_ZOMBIE / EXIT_DEAD" state handling look all kinds of > crazy. You set it to EXIT_ZOMBIE potentially _twice_. Whaa? > > So if we set EXIT_ZOMBIE early, then I think we should change the > EXIT_DEAD case too. IOW, do something like this on top: > > --- a/kernel/exit.c > +++ b/kernel/exit.c > @@ -734,9 +734,10 @@ static void exit_notify(struct task_struct > *tsk, int group_dead) > autoreap = true; > } > > - tsk->exit_state = autoreap ? EXIT_DEAD : EXIT_ZOMBIE; > - if (tsk->exit_state == EXIT_DEAD) > + if (autoreap) { > + tsk->exit_state = EXIT_DEAD; > list_add(&tsk->ptrace_entry, &dead); > + } > > /* mt-exec, de_thread() is waiting for group leader */ > if (unlikely(tsk->signal->notify_count < 0)) > > where now the logic becomes "ok, we turned into a zombie above, and if > we autoreap this thread then we turn the zombie into a fully dead > thread". > > Because currently we end up having "first turn it into a zombie", then > "set it to zombie or dead depending on autoreap" and then "if we > turned it into dead, move it to the dead list". > > That just feels confused to me. Comments? Agreed. But that codepath is so core-kernel that I really felt more comfortable just doing the absolut minimal thing so that when things bite us we see it right away. There's no harm in sending a cleanup for this later I think, when we haven't hit any weirdness with the current change. Does that sound ok? Christian

