Move the (tid != 1 && !tmp->child_reaper) check to after the idr
allocation. This way it guarantees that the first process in a pid
namespace has pid 1 not only when clone3(set_tid) requests a wrong pid,
but also when the idr itself returns a wrong pid for some reason.

Before this patch, the latter could happen if the
alloc_pid()->pidfs_add_pid() code path failed while creating the first
process: idr->idr_next would no longer be zero, so the next process
calling alloc_pid() would get 2 as its pid from idr_alloc_cyclic().
However, thanks to the PIDNS_ADDING logic, free_pid() disables further
pid allocation in this case, so this does not lead to any real problem.
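
For illustration only, the scenario above can be sketched as a small
userspace model. This is not kernel code: struct model_ns and every
model_*() helper are hypothetical names made up for this sketch, the id
range is shrunk to 1..7, and the PIDNS_ADDING handling is deliberately
left out so that the effect of the post-allocation check is visible.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_ns {
	int  next_hint;   /* stands in for idr->idr_next */
	bool used[8];     /* ids 1..7, true while allocated */
	bool has_reaper;  /* stands in for ns->child_reaper != NULL */
};

/* Cyclic allocation: search from the hint, wrapping around to 1. */
static int model_alloc_cyclic(struct model_ns *ns)
{
	int start = ns->next_hint ? ns->next_hint : 1;

	for (int k = 0; k < 7; k++) {
		int id = 1 + (start - 1 + k) % 7;

		if (!ns->used[id]) {
			ns->used[id] = true;
			ns->next_hint = id + 1;
			return id;
		}
	}
	return -ENOSPC;
}

/* Releasing an id does not rewind the hint. */
static void model_free(struct model_ns *ns, int id)
{
	ns->used[id] = false;
}

/* Allocate first, then insist that the first id in the namespace is 1. */
static int model_alloc_pid(struct model_ns *ns)
{
	int nr = model_alloc_cyclic(ns);

	if (nr < 0)
		return nr;
	if (!ns->has_reaper && nr != 1) {
		model_free(ns, nr);
		return -EINVAL;
	}
	return nr;
}

/* First creation fails after the id was handed out; the retry sees 2. */
static int model_failed_init_scenario(void)
{
	struct model_ns ns = { 0 };
	int first = model_alloc_cyclic(&ns);

	assert(first == 1);
	model_free(&ns, first);       /* later step failed, id released */
	return model_alloc_pid(&ns);  /* allocator offers 2, check rejects */
}
```

In this model, model_failed_init_scenario() returns -EINVAL, while
model_alloc_pid() on a freshly zeroed namespace returns 1.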

Note: This also prepares for the next patch in the series, which
introduces the ability to create init from a task other than the one
that created the pid namespace. The check is needed to make sure that
init is always created first, even in this new case.

Suggested-by: Oleg Nesterov <[email protected]>
Signed-off-by: Oleg Nesterov <[email protected]>
Signed-off-by: Pavel Tikhomirov <[email protected]>
--
v3: Split from main commit. Merge two checks of ->child_reaper into one.
v4: Update commit message about PIDNS_ADDING.
---
 kernel/pid.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/kernel/pid.c b/kernel/pid.c
index 76c2744493e2..ebf013f35cb3 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -215,12 +215,6 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *arg_set_tid,
                        retval = -EINVAL;
                        if (tid < 1 || tid >= pid_max[ns->level - i])
                                goto out_abort;
-                       /*
-                        * Also fail if a PID != 1 is requested and
-                        * no PID 1 exists.
-                        */
-                       if (tid != 1 && !READ_ONCE(tmp->child_reaper))
-                               goto out_abort;
                        retval = -EPERM;
                        if (!checkpoint_restore_ns_capable(tmp->user_ns))
                                goto out_abort;
@@ -296,9 +290,18 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *arg_set_tid,
 
                pid->numbers[i].nr = nr;
                pid->numbers[i].ns = tmp;
-               tmp = tmp->parent;
                i--;
                retried_preload = false;
+
+               /*
+                * PID 1 (init) must be created first.
+                */
+               if (!READ_ONCE(tmp->child_reaper) && nr != 1) {
+                       retval = -EINVAL;
+                       goto out_free;
+               }
+
+               tmp = tmp->parent;
        }
 
        /*
-- 
2.53.0

