On 01/22, Pavel Tikhomirov wrote: > > > > >Hmm. could you explain how this change helps CRIU? I mean, why > >restorer can't do prctl(CHILD_SUBREAPER) before the first fork? > > Imagine we have these tree in pidns: > > 1: has_child_subreaper == 0 && is_child_subreaper == 0 > |-2: has_child_subreaper == 0 && is_child_subreaper == 1 > | |-3: has_child_subreaper == 0 && is_child_subreaper == 0 > | | |-5: has_child_subreaper == 0 && is_child_subreaper == 0 > | |-4: has_child_subreaper == 1 && is_child_subreaper == 0 > | | |-6: has_child_subreaper == 1 && is_child_subreaper == 0 > > before c/r: If 4 dies 6 will reparent to 2, if 3 dies 5 will reparent to 1. > after c/r: (where restorer had is_child_subreaper == 1, everybody in the > tree will have has_child_subreaper == 1) Everybody will reparent to 2.
This is clear, but this can only happen if 2 forks 3 and after that sets is_child_subreaper, right? And if someone actually does this then your patch can break this application, no? IOW. Currently CRIU can't restore the process tree with the same has_child_subreaper bits if some process forks before prctl(PR_SET_CHILD_SUBREAPER). It restores the tree as if prctl() was called before the 1st fork. So you change the semantics of PR_SET_CHILD_SUBREAPER and now CRIU is fine simply because you remove this feature: the sub-reaper can no longer pre-fork the children which should reparent to the previous reaper. I won't really argure, but I am not sure this is good idea... At least I think this should be clearly documented. > >You don't need this new member and descendants_lock. task_struct has > >the ->real_parent pointer so you can work the tree without recursion. > > Sorry I don't get how I can walk down the tree of all descendants with help > of ->real_parent pointer, can you please point on some example or explain a > bit more? (I see task_is_descendant() in security/yama/yama_lsm.c but we > will need to check it for every process, not only descendants, the latter > can be a lot faster.) I'll send a patch, probably a generic helper makes sense. Btw task_is_descendant() looks wrong at first glance. Oleg.

