So if a process is in the 'R' (TASK_RUNNING) state, the fake signal should be
sent to it and then delivered before the process returns to user mode. As a
result, the process should be able to enter the FROZEN state within 1ms (the
timer interrupt interval).

From Brian's example, I think this might be related to a race condition
(perhaps caused by the many fork() calls, e.g. a process being added to the
cgroup while the cgroup is being frozen).
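
If the race is real, one userspace angle is the retry that the freezer
documentation quoted further down describes: writing "FROZEN" again while the
state still reads "FREEZING". A minimal sketch in C, assuming the caller passes
the path to the cgroup's freezer.state file; the function names, attempt count,
and interval are hypothetical, and this is not what the Mesos patch does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Read the first line of a freezer.state file into buf (newline stripped).
 * Returns 0 on success, -1 on failure. */
int read_freezer_state(const char *path, char *buf, size_t len)
{
    assert(path != NULL && buf != NULL && len > 1);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *line = fgets(buf, (int)len, f);
    fclose(f);
    if (line == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Write "FROZEN" and re-check the state, up to max_attempts times.
 * Returns 0 once the state reads FROZEN, -1 otherwise. */
int freeze_with_retry(const char *path, int max_attempts, long interval_ms)
{
    assert(path != NULL && max_attempts > 0 && interval_ms >= 0);
    struct timespec pause = {
        .tv_sec = interval_ms / 1000,
        .tv_nsec = (interval_ms % 1000) * 1000000L,
    };
    char state[32];

    for (int i = 0; i < max_attempts; i++) {
        FILE *f = fopen(path, "w");
        if (f == NULL)
            return -1;
        fputs("FROZEN\n", f);
        /* A write error such as EBUSY is not treated as fatal here;
         * the state read below decides whether to retry. */
        fclose(f);

        if (read_freezer_state(path, state, sizeof state) == 0 &&
            strcmp(state, "FROZEN") == 0)
            return 0;

        nanosleep(&pause, NULL);    /* back off before retrying */
    }
    return -1;
}
```

A caller would pass the freezer.state path of the cgroup in question and treat
a -1 return as "still FREEZING after all attempts", at which point something
stronger (such as the SIGKILL discussed below) may be needed.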

Let me know if you have any findings.

- Jie

On Fri, Sep 21, 2012 at 10:03 PM, Jie Yu <[email protected]> wrote:

> Here is the kernel flow when user echo FROZEN to freezer.state to freeze a
> cgroup.
>
> Hopefully, this will be useful to you.
>
> (I am looking at the code of linux-2.6.39)
>
> 1) freezer_write(...) --> freezer_change_state(...)
> --> try_to_freeze_cgroup(...)  (kernel/cgroup_freezer.c)
>
> 2) try_to_freeze_cgroup(...) will iterate over all the tasks in the given
> cgroup:
>
>> ...
>> cgroup_iter_start(cgroup, &it);
>> while ((task = cgroup_iter_next(cgroup, &it))) {
>>     if (!freeze_task(task, true))
>>         continue;
>>     if (frozen(task))
>>         continue;
>>     if (!freezing(task) && !freezer_should_skip(task))
>>         num_cant_freeze_now++;
>> }
>> cgroup_iter_end(cgroup, &it);
>
>> return num_cant_freeze_now ? -EBUSY : 0;
>
>
> So, for each task in the cgroup, freeze_task(...) will be invoked.
>
> 3) freeze_task(p) (in kernel/freezer.c)
> This function sets a 'FREEZE' flag in process 'p' (set_freeze_flag(p)) and
> sends a fake signal to 'p' by invoking fake_signal_wake_up(p), which also
> tries to wake 'p' up (very important!).
>
> 4) fake_signal_wake_up(p) --> signal_wake_up(p, 0)
>
> 5) signal_wake_up(p, 0)  (kernel/signal.c)
>
>> set_tsk_thread_flag(p, TIF_SIGPENDING);
>> ...
>> if (!wake_up_state(p, TASK_INTERRUPTIBLE))
>>     kick_process(p);
>
>
> First, the function sets the TIF_SIGPENDING flag in process 'p'. Then it
> wakes 'p' up to make sure that 'p' will handle the fake signal when it is
> about to return to user mode (the Linux kernel checks TIF_SIGPENDING for
> pending signals every time before it returns to user mode).
>
> 6) When p sees the fake pending signal, it will call do_signal(...)
>  (arch/x86/kernel/signal.c).
> This function will call get_signal_to_deliver(...) (kernel/signal.c).
>
> 7) The first line of get_signal_to_deliver(...) calls try_to_freeze(...).
> If the FREEZE flag is set, the process enters a function called
> refrigerator(...) (in kernel/freezer.c), which marks the process as FROZEN,
> sets its state to TASK_UNINTERRUPTIBLE, and calls schedule() to release
> the CPU.
>
> - Jie
>
> On Fri, Sep 21, 2012 at 9:29 PM, Jie Yu <[email protected]> wrote:
>
>> Ben,
>>
>> Does the retry not work? Does the process remain in 'R' after you echo
>> "FROZEN" to freezer.state?
>>
>>> So I expect that you're correct, and we'll also need to send explicit
>>> SIGKILLs to those processes still in R (in fact, probably just to all
>>> processes still in the cgroup).
>>
>>
>> Will that cause problems if there is more than one process in
>> 'R', given that the kill is not atomic?
>>
>> - Jie
>>
>> On Fri, Sep 21, 2012 at 9:10 PM, Benjamin Hindman <[email protected]> wrote:
>>
>>>
>>>
>>> > On Sept. 21, 2012, 7 p.m., Vinod Kone wrote:
>>> > > lgtm. i've a feeling we need to also do a force kill. but we can do
>>> > > this after we see how brian's test pans out.
>>>
>>> I tried just setting FREEZING to the cgroup freezer.state manually and
>>> that didn't seem to work. Meanwhile, I sent a SIGKILL to the process in the
>>> cgroup still in R, and that got everything to clean up. So I expect that
>>> you're correct, and we'll also need to send explicit SIGKILLs to those
>>> processes still in R (in fact, probably just to all processes still in the
>>> cgroup). Review incoming.
>>>
>>>
>>> - Benjamin
>>>
>>>
>>> -----------------------------------------------------------
>>> This is an automatically generated e-mail. To reply, visit:
>>> https://reviews.apache.org/r/7203/#review11794
>>> -----------------------------------------------------------
>>>
>>>
>>> On Sept. 21, 2012, 2:02 a.m., Benjamin Hindman wrote:
>>> >
>>> > -----------------------------------------------------------
>>> > This is an automatically generated e-mail. To reply, visit:
>>> > https://reviews.apache.org/r/7203/
>>> > -----------------------------------------------------------
>>> >
>>> > (Updated Sept. 21, 2012, 2:02 a.m.)
>>> >
>>> >
>>> > Review request for mesos, Vinod Kone, Brian Wickman, and Jie Yu.
>>> >
>>> >
>>> > Description
>>> > -------
>>> >
>>> > See summary and
>>> > http://www.kernel.org/doc/Documentation/cgroups/freezer-subsystem.txt:
>>> >
>>> > It's important to note that freezing can be incomplete. In that case we return
>>> > EBUSY. This means that some tasks in the cgroup are busy doing something that
>>> > prevents us from completely freezing the cgroup at this time. After EBUSY,
>>> > the cgroup will remain partially frozen -- reflected by freezer.state reporting
>>> > "FREEZING" when read. The state will remain "FREEZING" until one of these
>>> > things happens:
>>> >
>>> >       1) Userspace cancels the freezing operation by writing "THAWED" to
>>> >               the freezer.state file
>>> >       2) Userspace retries the freezing operation by writing "FROZEN" to
>>> >               the freezer.state file (writing "FREEZING" is not legal
>>> >               and returns EINVAL)
>>> >       3) The tasks that blocked the cgroup from entering the "FROZEN"
>>> >               state disappear from the cgroup's set of tasks.
>>> >
>>> >
>>> > Diffs
>>> > -----
>>> >
>>> >   src/linux/cgroups.cpp 4efd06e
>>> >
>>> > Diff: https://reviews.apache.org/r/7203/diff/
>>> >
>>> >
>>> > Testing
>>> > -------
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > Benjamin Hindman
>>> >
>>> >
>>>
>>>
>>
>
