sgtm @vinodkone
On Wed, Oct 3, 2012 at 9:31 AM, Benjamin Hindman <[email protected]> wrote:

> I think the best we can do here is _try_ and send a SIGKILL to all
> processes (in R or T or S or whatever) after we write FROZEN to
> freezer.state and we find out everything is still in FREEZING. We'll
> continue to write FROZEN to freezer.state after the interval AND we'll
> continue to try sending SIGKILL. Hopefully these two mechanisms will
> _eventually_ get everything cleaned up.
>
> How does that sound?
>
> On Sat, Sep 22, 2012 at 10:09 AM, Jie Yu <[email protected]> wrote:
>
>> Also, I don't understand what you mean here? Could you elaborate?
>>
>> If you have two running processes to kill, you cannot send a SIGKILL to
>> them atomically. As a result, one process will be killed first (likely), and
>> the other process is still making progress (though in a very short
>> interval). That may cause unpredictable errors.
>>
>> - Jie
>>
>> On Sat, Sep 22, 2012 at 2:03 AM, Vinod Kone <[email protected]> wrote:
>>
>>> Thanks for digging up the kernel code Jie! It's fascinating.
>>>
>>>> Will that cause potential problems if there are more than 1 process in
>>>> 'R' because the kill is not atomic.
>>>
>>> Also, I don't understand what you mean here? Could you elaborate?
>>>
>>> Vinod
>>>
>>>> - Jie
>>>>
>>>> On Fri, Sep 21, 2012 at 9:10 PM, Benjamin Hindman <[email protected]> wrote:
>>>>
>>>>> > On Sept. 21, 2012, 7 p.m., Vinod Kone wrote:
>>>>> > > lgtm. i've a feeling we need to also do a force kill. but we can
>>>>> > > do this after we see how brian's test pans out.
>>>>>
>>>>> I tried just setting FREEZING to the cgroup freezer.state manually and
>>>>> that didn't seem to work. Meanwhile, I sent a SIGKILL to the process in
>>>>> the cgroup still in R, and that got everything to clean up. So I expect
>>>>> that you're correct, and we'll also need to send explicit SIGKILLs to
>>>>> those processes still in R (in fact, probably just to all processes
>>>>> still in the cgroup). Review incoming.
>>>>>
>>>>> - Benjamin
>>>>>
>>>>> -----------------------------------------------------------
>>>>> This is an automatically generated e-mail. To reply, visit:
>>>>> https://reviews.apache.org/r/7203/#review11794
>>>>> -----------------------------------------------------------
>>>>>
>>>>> On Sept. 21, 2012, 2:02 a.m., Benjamin Hindman wrote:
>>>>> >
>>>>> > -----------------------------------------------------------
>>>>> > This is an automatically generated e-mail. To reply, visit:
>>>>> > https://reviews.apache.org/r/7203/
>>>>> > -----------------------------------------------------------
>>>>> >
>>>>> > (Updated Sept. 21, 2012, 2:02 a.m.)
>>>>> >
>>>>> > Review request for mesos, Vinod Kone, Brian Wickman, and Jie Yu.
>>>>> >
>>>>> > Description
>>>>> > -------
>>>>> >
>>>>> > See summary and
>>>>> > http://www.kernel.org/doc/Documentation/cgroups/freezer-subsystem.txt:
>>>>> >
>>>>> > It's important to note that freezing can be incomplete. In that case we return
>>>>> > EBUSY. This means that some tasks in the cgroup are busy doing something that
>>>>> > prevents us from completely freezing the cgroup at this time. After EBUSY,
>>>>> > the cgroup will remain partially frozen -- reflected by freezer.state reporting
>>>>> > "FREEZING" when read. The state will remain "FREEZING" until one of these
>>>>> > things happens:
>>>>> >
>>>>> >     1) Userspace cancels the freezing operation by writing "THAWED" to
>>>>> >        the freezer.state file
>>>>> >     2) Userspace retries the freezing operation by writing "FROZEN" to
>>>>> >        the freezer.state file (writing "FREEZING" is not legal
>>>>> >        and returns EINVAL)
>>>>> >     3) The tasks that blocked the cgroup from entering the "FROZEN"
>>>>> >        state disappear from the cgroup's set of tasks.
>>>>> >
>>>>> > Diffs
>>>>> > -----
>>>>> >
>>>>> > src/linux/cgroups.cpp 4efd06e
>>>>> >
>>>>> > Diff: https://reviews.apache.org/r/7203/diff/
>>>>> >
>>>>> > Testing
>>>>> > -------
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > Benjamin Hindman
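
For anyone reading this thread later, the retry-and-kill loop proposed in the Oct 3 message above could be sketched roughly as follows. This is only an illustration in Python, not the change under review in src/linux/cgroups.cpp. It assumes a cgroup v1 freezer hierarchy with freezer.state and cgroup.procs files; every function name, the interval, and the attempt limit are hypothetical.

```python
import os
import signal
import time
from pathlib import Path


def read_state(cgroup):
    """Return the freezer state: "THAWED", "FREEZING", or "FROZEN"."""
    return (Path(cgroup) / "freezer.state").read_text().strip()


def request_freeze(cgroup):
    """(Re)try the freeze. Writing "FREEZING" is not legal, so write "FROZEN"."""
    (Path(cgroup) / "freezer.state").write_text("FROZEN")


def list_pids(cgroup):
    """Return the PIDs currently listed in the cgroup's cgroup.procs file."""
    return [int(p) for p in (Path(cgroup) / "cgroup.procs").read_text().split()]


def kill_all(cgroup):
    """Try to SIGKILL every process in the cgroup, whatever state it is in."""
    for pid in list_pids(cgroup):
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # The process exited between listing and killing.


def freeze_with_kill(cgroup, interval=0.1, max_attempts=50,
                     read=read_state, freeze=request_freeze,
                     kill=kill_all, sleep=time.sleep):
    """Write FROZEN, and while the state is not FROZEN, SIGKILL and retry.

    Returns True once the cgroup reports FROZEN, or False if it never does
    within max_attempts. The helpers are parameters so they can be swapped
    out, for example in a test.
    """
    for _ in range(max_attempts):
        freeze(cgroup)
        if read(cgroup) == "FROZEN":
            return True
        kill(cgroup)      # Still FREEZING: try to remove whatever blocks it.
        sleep(interval)   # Wait, then write FROZEN again.
    return False
```

A caller would pass the cgroup's directory under the freezer mount point and check the boolean result. Note that the kills here are still issued one process at a time, so the atomicity concern Jie raises applies to this sketch as well.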
