So if a process is in 'R' (TASK_RUNNING) state, the fake signal should be sent to it and should be handled before the process next returns to user mode. As a result, the process should be able to enter the FROZEN state within about 1ms (the timer interrupt interval, assuming HZ=1000).
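To make that expectation concrete, here is a toy userspace model of the path from freeze_task() to refrigerator(). It is only a sketch, not kernel code: the toy_* names, the struct fields, and the three-state enum are simplified stand-ins I made up for illustration, and locking, kick_process() and the scheduler are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. The names echo the kernel's but are hypothetical stand-ins. */
enum toy_state { TOY_RUNNING, TOY_INTERRUPTIBLE, TOY_UNINTERRUPTIBLE };

struct toy_task {
    enum toy_state state;
    bool freeze_flag;  /* stands in for the FREEZE flag set by freeze_task() */
    bool sigpending;   /* stands in for TIF_SIGPENDING */
    bool frozen;       /* stands in for the FROZEN mark set in refrigerator() */
};

/* Models freeze_task() -> fake_signal_wake_up() -> signal_wake_up():
 * set the freeze flag, raise the fake signal, and wake the task only if
 * it is sleeping interruptibly. */
static void toy_freeze_task(struct toy_task *p)
{
    assert(p != NULL);
    p->freeze_flag = true;
    p->sigpending = true;
    if (p->state == TOY_INTERRUPTIBLE)
        p->state = TOY_RUNNING;
    /* A task that is already running would instead be kicked so that it
     * reaches the return-to-user check soon; that part is not modeled. */
}

/* Models the check made on the way back to user mode:
 * do_signal() -> get_signal_to_deliver() -> try_to_freeze(). */
static void toy_return_to_user(struct toy_task *p)
{
    assert(p != NULL);
    if (p->state != TOY_RUNNING || !p->sigpending)
        return;                        /* not on this path, or nothing pending */
    p->sigpending = false;             /* the fake signal is cleaned up */
    if (p->freeze_flag) {
        p->frozen = true;              /* refrigerator() marks the task FROZEN */
        p->state = TOY_UNINTERRUPTIBLE;
    }
}
```

In this model a task in 'R' (or one woken from an interruptible sleep) freezes the next time toy_return_to_user() runs, while a task that stays in an uninterruptible sleep never reaches the check, which is one way a cgroup could be left in FREEZING.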
From Brian's example, I think that might be related to some race conditions (due to many fork() calls? E.g., a process is added to the cgroup while the cgroup is being frozen). Let me know if you have any findings.

- Jie

On Fri, Sep 21, 2012 at 10:03 PM, Jie Yu wrote:

> Here is the kernel flow when a user echoes FROZEN to freezer.state to
> freeze a cgroup.
>
> Hopefully, this will be useful to you.
>
> (I am looking at the code of linux-2.6.39)
>
> 1) freezer_write(...) --> freezer_change_state(...)
>    --> try_to_freeze_cgroup(...) (kernel/cgroup_freezer.c)
>
> 2) try_to_freeze_cgroup(...) will iterate over all the tasks in the given
> cgroup:
>
>> ...
>> cgroup_iter_start(cgroup, &it);
>> while ((task = cgroup_iter_next(cgroup, &it))) {
>>         if (!freeze_task(task, true))
>>                 continue;
>>         if (frozen(task))
>>                 continue;
>>         if (!freezing(task) && !freezer_should_skip(task))
>>                 num_cant_freeze_now++;
>> }
>> cgroup_iter_end(cgroup, &it);
>>
>> return num_cant_freeze_now ? -EBUSY : 0;
>
> So, for each task in the cgroup, freeze_task(...) will be invoked.
>
> 3) freeze_task(p) (in kernel/freezer.c)
> Basically, this function sets a 'FREEZE' flag in process 'p'
> (set_freeze_flag(p)) and sends a fake signal to process 'p' by invoking
> fake_signal_wake_up(p), which will also try to wake process 'p' up
> (very important!)
>
> 4) fake_signal_wake_up(p) --> signal_wake_up(p, 0)
>
> 5) signal_wake_up(p, 0) (kernel/signal.c)
>
>> set_tsk_thread_flag(p, TIF_SIGPENDING);
>> ...
>> if (!wake_up_state(p, TASK_INTERRUPTIBLE))
>>         kick_process(p);
>
> First, the function sets the flag TIF_SIGPENDING in process p. Then it
> wakes up process 'p' to make sure that p will handle the fake signal when
> p is about to return to user mode (the Linux kernel checks TIF_SIGPENDING
> every time before it returns to user mode, to look for any pending
> signals).
>
> 6) When p sees the faked pending signal, it will call do_signal(...)
> (arch/x86/kernel/signal.c).
> This function will call get_signal_to_deliver(...) (kernel/signal.c).
>
> 7) The first line of get_signal_to_deliver(...) calls
> try_to_freeze(...). If the FREEZE flag is set, the process will enter a
> function called refrigerator(...) (in kernel/freezer.c), which will mark
> the process as FROZEN, set its own state to TASK_UNINTERRUPTIBLE, and call
> schedule() to release the CPU.
>
> - Jie
>
> On Fri, Sep 21, 2012 at 9:29 PM, Jie Yu wrote:
>
>> Ben,
>>
>> The retry does not work? The process remains in 'R' after you echo
>> "FROZEN" to freezer.state?
>>
>>> So I expect that you're correct, and we'll also need to send explicit
>>> SIGKILLs to those processes still in R (in fact, probably just to all
>>> processes still in the cgroup).
>>
>> Will that cause potential problems if there is more than one process in
>> 'R', since the kill is not atomic?
>>
>> - Jie
>>
>> On Fri, Sep 21, 2012 at 9:10 PM, Benjamin Hindman wrote:
>>
>>> > On Sept. 21, 2012, 7 p.m., Vinod Kone wrote:
>>> > > lgtm. i've a feeling we need to also do a force kill. but we can do
>>> > > this after we see how brian's test pans out.
>>>
>>> I tried just setting FREEZING to the cgroup freezer.state manually and
>>> that didn't seem to work. Meanwhile, I sent a SIGKILL to the process in
>>> the cgroup still in R, and that got everything to clean up. So I expect
>>> that you're correct, and we'll also need to send explicit SIGKILLs to
>>> those processes still in R (in fact, probably just to all processes
>>> still in the cgroup). Review incoming.
>>>
>>> - Benjamin
>>>
>>> -----------------------------------------------------------
>>> This is an automatically generated e-mail. To reply, visit:
>>> https://reviews.apache.org/r/7203/#review11794
>>> -----------------------------------------------------------
>>>
>>> On Sept.
>>> 21, 2012, 2:02 a.m., Benjamin Hindman wrote:
>>> >
>>> > -----------------------------------------------------------
>>> > This is an automatically generated e-mail. To reply, visit:
>>> > https://reviews.apache.org/r/7203/
>>> > -----------------------------------------------------------
>>> >
>>> > (Updated Sept. 21, 2012, 2:02 a.m.)
>>> >
>>> > Review request for mesos, Vinod Kone, Brian Wickman, and Jie Yu.
>>> >
>>> > Description
>>> > -------
>>> >
>>> > See summary and
>>> > http://www.kernel.org/doc/Documentation/cgroups/freezer-subsystem.txt:
>>> >
>>> > It's important to note that freezing can be incomplete. In that case
>>> > we return EBUSY. This means that some tasks in the cgroup are busy
>>> > doing something that prevents us from completely freezing the cgroup
>>> > at this time. After EBUSY, the cgroup will remain partially frozen --
>>> > reflected by freezer.state reporting "FREEZING" when read. The state
>>> > will remain "FREEZING" until one of these things happens:
>>> >
>>> >   1) Userspace cancels the freezing operation by writing "THAWED" to
>>> >      the freezer.state file
>>> >   2) Userspace retries the freezing operation by writing "FROZEN" to
>>> >      the freezer.state file (writing "FREEZING" is not legal
>>> >      and returns EINVAL)
>>> >   3) The tasks that blocked the cgroup from entering the "FROZEN"
>>> >      state disappear from the cgroup's set of tasks.
>>> >
>>> > Diffs
>>> > -----
>>> >
>>> >   src/linux/cgroups.cpp 4efd06e
>>> >
>>> > Diff: https://reviews.apache.org/r/7203/diff/
>>> >
>>> > Testing
>>> > -------
>>> >
>>> > Thanks,
>>> >
>>> > Benjamin Hindman
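
For reference, option 2 from the freezer documentation quoted above (retry by writing "FROZEN" again) could be sketched in userspace roughly as follows. This is only an illustration under stated assumptions, not what the patch in src/linux/cgroups.cpp does: the helper names, the attempt count, and the delay are all hypothetical, and the caller supplies the path to the cgroup's freezer.state file. As discussed earlier in the thread, retrying alone may not be enough, so a caller would probably still want a fallback (such as SIGKILL) when this gives up.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: write a state string such as "FROZEN" to the file. */
static int write_state(const char *path, const char *state)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int ok = (fputs(state, f) >= 0);
    if (fclose(f) != 0)   /* a buffered write error (e.g. EBUSY) surfaces here */
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: read the first line of the file, without the newline. */
static int read_state(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *line = fgets(buf, (int)len, f);
    fclose(f);
    if (line == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Retry the freeze until the state reads back as "FROZEN".
 * Returns 0 on success, -1 once max_attempts is exhausted. */
static int freeze_with_retry(const char *path, int max_attempts, long delay_ms)
{
    char state[32];
    struct timespec delay = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };

    for (int i = 0; i < max_attempts; i++) {
        /* A failed write is not fatal: EBUSY just means the freeze is
         * incomplete, so fall through and look at the reported state. */
        (void)write_state(path, "FROZEN");
        if (read_state(path, state, sizeof state) == 0 &&
            strcmp(state, "FROZEN") == 0)
            return 0;
        nanosleep(&delay, NULL);
    }
    return -1;
}
```

A caller might use something like freeze_with_retry(path, 10, 100), with path pointing at the cgroup's freezer.state, and treat a -1 return as the signal to fall back to killing the remaining tasks.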
