Commit:     3c90e6e99b08f01d5684a3a07cceae6a543e4fa8
Parent:     502d26b524d8980f3ed80d9aec398e85671a8160
Author:     Srivatsa Vaddagiri <[EMAIL PROTECTED]>
AuthorDate: Fri Nov 9 22:39:39 2007 +0100
Committer:  Ingo Molnar <[EMAIL PROTECTED]>
CommitDate: Fri Nov 9 22:39:39 2007 +0100

    sched: fix copy_namespace() <-> sched_fork() dependency in do_fork

    Sukadev Bhattiprolu reported a kernel crash with control groups.
    There are a couple of problems discovered by Suka's test:
    - The test requires the cgroup filesystem to be mounted with
      at least the cpu and ns options (i.e. both the namespace and cpu
      controllers are active in the same hierarchy).
        # mkdir /dev/cpuctl
        # mount -t cgroup -ocpu,ns none cpuctl
        (or simply)
        # mount -t cgroup none cpuctl -> Will activate all controllers
                                         in same hierarchy.
    - The test invokes clone() with CLONE_NEWNS set. This causes a new child
      to be created, and also a new group (do_fork->copy_namespaces->
      ns_cgroup_clone->cgroup_clone), and the child is attached to the new
      group (cgroup_clone->attach_task->sched_move_task). At this point in
      time, the child's scheduler related fields are uninitialized (including
      its on_rq field, which it has inherited from the parent). As a result,
      sched_move_task thinks it is on the runqueue when it isn't.
      As a solution to this problem, I moved the sched_fork() call, which
      initializes scheduler related fields on a new task, before
      copy_namespaces(). I am not sure, though, whether moving it up will
      cause other side effects. Do you see any issue?
    - The second problem exposed by this test is that task_new_fair()
      assumes that parent and child will be part of the same group (which
      need not be the case, as this test shows). As a result, cfs_rq->curr
      can be NULL for the child.
      The solution is to test for the curr pointer being NULL in
      task_new_fair().
    With the patch below, I could run ns_exec() fine w/o a crash.
    Reported-by: Sukadev Bhattiprolu <[EMAIL PROTECTED]>
    Signed-off-by: Srivatsa Vaddagiri <[EMAIL PROTECTED]>
    Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
 kernel/fork.c       |    6 +++---
 kernel/sched_fair.c |    3 ++-
 2 files changed, 5 insertions(+), 4 deletions(-)
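
The ordering problem described in the commit message can be illustrated
outside the kernel with a toy model. This is only a sketch: toy_task,
toy_sched_fork(), toy_move_task() and toy_fork() are hypothetical names
made up for this note, not kernel APIs, and the real sched_move_task()
does considerably more. The one thing the sketch assumes is what the
message states: the child starts out with the parent's on_rq value until
the scheduler setup runs.

```c
#include <string.h>

/* Toy stand-in for the scheduler state of a task (hypothetical). */
struct toy_task {
    int on_rq;  /* 1 if the task is believed to be on a runqueue */
};

/* Mimics the scheduler setup at fork: a new child is not queued yet. */
void toy_sched_fork(struct toy_task *child)
{
    child->on_rq = 0;
}

/*
 * Mimics a group move: it trusts on_rq to decide whether the task must
 * be dequeued and requeued. Returns the on_rq value it acted on.
 */
int toy_move_task(const struct toy_task *t)
{
    return t->on_rq;
}

/*
 * Creates a child as a byte copy of the parent, then runs the two steps
 * in either the old order (move, then init) or the new order (init,
 * then move). Returns what the move step saw in on_rq.
 */
int toy_fork(const struct toy_task *parent, int init_first)
{
    struct toy_task child;
    int seen;

    memcpy(&child, parent, sizeof(child));  /* child inherits on_rq */

    if (init_first)
        toy_sched_fork(&child);
    seen = toy_move_task(&child);
    if (!init_first)
        toy_sched_fork(&child);

    return seen;
}
```

With a running parent (on_rq == 1), toy_fork(&parent, 0) reports that the
move step saw a stale on_rq of 1, while toy_fork(&parent, 1) reports 0,
which is the effect the patch is after when it moves sched_fork() earlier.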

diff --git a/kernel/fork.c b/kernel/fork.c
index 28a7401..8ca1a14 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1123,6 +1123,9 @@ static struct task_struct *copy_process(unsigned long 
        p->blocked_on = NULL; /* not blocked yet */
+       /* Perform scheduler related setup. Assign this task to a CPU. */
+       sched_fork(p, clone_flags);
        if ((retval = security_task_alloc(p)))
                goto bad_fork_cleanup_policy;
        if ((retval = audit_alloc(p)))
@@ -1212,9 +1215,6 @@ static struct task_struct *copy_process(unsigned long 
-       /* Perform scheduler related setup. Assign this task to a CPU. */
-       sched_fork(p, clone_flags);
        /* Now that the task is set up, run cgroup callbacks if
         * necessary. We need to run them before the task is visible
         * on the tasklist. */
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index 6c36147..d3c0307 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1067,8 +1067,9 @@ static void task_new_fair(struct rq *rq, struct task_struct *p)
        place_entity(cfs_rq, se, 1);
+       /* 'curr' will be NULL if the child belongs to a different group */
        if (sysctl_sched_child_runs_first && this_cpu == task_cpu(p) &&
-                       curr->vruntime < se->vruntime) {
+                       curr && curr->vruntime < se->vruntime) {
                /*
                 * Upon rescheduling, sched_class::put_prev_task() will place
                 * 'current' within the tree based on its new key value.
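
The NULL test added to task_new_fair() relies on C's left-to-right,
short-circuit evaluation of &&: once 'curr' evaluates to false, the
comparison that dereferences it is never executed. A minimal standalone
sketch follows; toy_entity and toy_child_preempts() are hypothetical
names for illustration, and the two flag parameters merely stand in for
the sysctl and CPU checks in the real condition.

```c
#include <stddef.h>

/* Toy scheduling entity carrying only the field the check compares. */
struct toy_entity {
    unsigned long long vruntime;
};

/*
 * Returns 1 when the child should run ahead of the current entity.
 * 'curr' may be NULL, e.g. when the child sits in a different group
 * whose queue has no current entity; the 'curr &&' term guards the
 * dereference that follows it.
 */
int toy_child_preempts(int child_runs_first, int same_cpu,
                       const struct toy_entity *curr,
                       const struct toy_entity *se)
{
    return child_runs_first && same_cpu &&
           curr && curr->vruntime < se->vruntime;
}
```

Calling toy_child_preempts(1, 1, NULL, &se) simply returns 0, whereas the
same condition without the 'curr &&' term would dereference a NULL
pointer.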