Re: Review Request: Updated cgroup freezer to retry after failed attempts (rather than just waiting indefinitely).

Jie Yu Sat, 22 Sep 2012 10:09:32 -0700

>
> Also, I don't understand what you mean here? Could you elaborate?



If you have two running processes to kill, you cannot send a SIGKILL to both
of them atomically. As a result, one process will (most likely) be killed
first while the other keeps making progress, even if only for a very short
interval. That window may cause unpredictable errors.
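
To make the concern concrete, here is a minimal sketch of one way to avoid
that window: retry the freeze until freezer.state reads FROZEN, signal every
task while nothing can run, then thaw. This is not the code under review. All
function names, the retry count, and the 100 ms delay are hypothetical, and it
assumes a cgroup v1 freezer directory containing "freezer.state" and "tasks".
As far as I understand, a SIGKILL sent to a frozen task only takes effect once
the cgroup is thawed.

```cpp
#include <chrono>
#include <fstream>
#include <string>
#include <thread>
#include <vector>

#include <signal.h>
#include <sys/types.h>

// NOTE: every name below is hypothetical; this is not the code under review.

// Returns the first word in <cgroup>/freezer.state, or "" if unreadable.
std::string readState(const std::string& cgroup)
{
  std::ifstream in(cgroup + "/freezer.state");
  std::string state;
  in >> state;
  return state;
}

// Writes `state` to <cgroup>/freezer.state; returns false if the write fails
// (for example, when the kernel reports EBUSY for an incomplete freeze).
bool writeState(const std::string& cgroup, const std::string& state)
{
  std::ofstream out(cgroup + "/freezer.state");
  out << state << std::endl;
  return out.good();
}

// Retries the freeze by rewriting "FROZEN" until the state reads back as
// "FROZEN" or `attempts` runs out (assumed policy: 100 ms between attempts).
bool freeze(const std::string& cgroup, int attempts)
{
  for (int i = 0; i < attempts; i++) {
    writeState(cgroup, "FROZEN");  // Result ignored; the read below decides.
    if (readState(cgroup) == "FROZEN") {
      return true;
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  }
  return false;
}

// Reads the task IDs listed in <cgroup>/tasks, skipping non-positive values
// so that kill() is never called with 0 or -1.
std::vector<pid_t> readPids(const std::string& cgroup)
{
  std::ifstream in(cgroup + "/tasks");
  std::vector<pid_t> pids;
  pid_t pid;
  while (in >> pid) {
    if (pid > 0) {
      pids.push_back(pid);
    }
  }
  return pids;
}

// Freezes the cgroup, sends SIGKILL to every listed task while none of them
// can run, then thaws. Returns false if the freeze or the thaw fails.
bool killFrozen(const std::string& cgroup, int attempts)
{
  if (!freeze(cgroup, attempts)) {
    return false;
  }
  for (pid_t pid : readPids(cgroup)) {
    ::kill(pid, SIGKILL);
  }
  return writeState(cgroup, "THAWED");
}
```

A caller would pass the cgroup's freezer directory and a retry budget, for
example killFrozen(path, 10). If the freeze never completes, nothing is
signaled, so the caller still has to decide what to do with tasks stuck in R.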

- Jie

On Sat, Sep 22, 2012 at 2:03 AM, Vinod Kone <[email protected]> wrote:

> Thanks for digging up the kernel code Jie! It's fascinating.
>
>
>> Will that cause problems if more than one process is in 'R', since the
>> kill is not atomic?
>>
>>
> Also, I don't understand what you mean here? Could you elaborate?
>
>
> Vinod
>
>
>> - Jie
>>
>> On Fri, Sep 21, 2012 at 9:10 PM, Benjamin Hindman <[email protected]> wrote:
>>
>>>
>>>
>>> > On Sept. 21, 2012, 7 p.m., Vinod Kone wrote:
>>> > > lgtm. i've a feeling we need to also do a force kill. but we can do
>>> > > this after we see how brian's test pans out.
>>>
>>> I tried just writing FREEZING to the cgroup's freezer.state manually and
>>> that didn't seem to work. Meanwhile, I sent a SIGKILL to the process in the
>>> cgroup still in R, and that got everything to clean up. So I expect that
>>> you're correct, and we'll also need to send explicit SIGKILLs to those
>>> processes still in R (in fact, probably just to all processes still in the
>>> cgroup). Review incoming.
>>>
>>>
>>> - Benjamin
>>>
>>>
>>> -----------------------------------------------------------
>>>
>>> This is an automatically generated e-mail. To reply, visit:
>>> https://reviews.apache.org/r/7203/#review11794
>>>
>>> -----------------------------------------------------------
>>>
>>>
>>> On Sept. 21, 2012, 2:02 a.m., Benjamin Hindman wrote:
>>> >
>>> > -----------------------------------------------------------
>>> > This is an automatically generated e-mail. To reply, visit:
>>> > https://reviews.apache.org/r/7203/
>>> > -----------------------------------------------------------
>>> >
>>> > (Updated Sept. 21, 2012, 2:02 a.m.)
>>> >
>>> >
>>> > Review request for mesos, Vinod Kone, Brian Wickman, and Jie Yu.
>>> >
>>> >
>>> > Description
>>> > -------
>>> >
>>> > See summary and
>>> > http://www.kernel.org/doc/Documentation/cgroups/freezer-subsystem.txt:
>>> >
>>> > It's important to note that freezing can be incomplete. In that case we
>>> > return EBUSY. This means that some tasks in the cgroup are busy doing
>>> > something that prevents us from completely freezing the cgroup at this
>>> > time. After EBUSY, the cgroup will remain partially frozen -- reflected
>>> > by freezer.state reporting "FREEZING" when read. The state will remain
>>> > "FREEZING" until one of these things happens:
>>> >
>>> >       1) Userspace cancels the freezing operation by writing "THAWED" to
>>> >               the freezer.state file
>>> >       2) Userspace retries the freezing operation by writing "FROZEN" to
>>> >               the freezer.state file (writing "FREEZING" is not legal
>>> >               and returns EINVAL)
>>> >       3) The tasks that blocked the cgroup from entering the "FROZEN"
>>> >               state disappear from the cgroup's set of tasks.
>>> >
>>> >
>>> > Diffs
>>> > -----
>>> >
>>> >   src/linux/cgroups.cpp 4efd06e
>>> >
>>> > Diff: https://reviews.apache.org/r/7203/diff/
>>> >
>>> >
>>> > Testing
>>> > -------
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > Benjamin Hindman
>>> >
>>> >
>>>
>>>
>>
>
