Seems ok, I'm surprised a FROZEN loop doesn't work.

It would be interesting to have some introspection on how many iterations
this takes in practice. I guess this could be done with some unix-fu on the
logs.
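
For reference, here is a rough sketch of the retry scheme proposed in the
quoted message below (write FROZEN, check whether we are still FREEZING,
SIGKILL whatever is left, repeat), returning the iteration count so it can be
logged directly. This is not the code in src/linux/cgroups.cpp. The function
names, the defaults, and the use of cgroup.procs to enumerate processes are
all assumptions made for illustration, and the read_state/kill/sleep hooks
exist only so the loop can be dry-run without a real cgroup.

```python
import errno
import os
import signal
import time
from pathlib import Path


def read_state(cgroup):
    """Return the freezer state of `cgroup`, e.g. 'FROZEN' or 'FREEZING'."""
    return (Path(cgroup) / "freezer.state").read_text().strip()


def freeze_with_kill(cgroup, interval=0.1, max_attempts=50,
                     read_state=read_state, kill=os.kill, sleep=time.sleep):
    """Retry freezing `cgroup`, SIGKILLing its processes between attempts.

    Returns the number of attempts needed to reach FROZEN and raises
    TimeoutError if the cgroup is not FROZEN after `max_attempts`.
    All names and defaults here are hypothetical.
    """
    cgroup = Path(cgroup)
    for attempt in range(1, max_attempts + 1):
        try:
            # Writing FROZEN again is the documented way to retry a freeze.
            (cgroup / "freezer.state").write_text("FROZEN\n")
        except OSError as e:
            # The kernel doc says an incomplete freeze reports EBUSY.
            if e.errno != errno.EBUSY:
                raise
        if read_state(cgroup) == "FROZEN":
            return attempt

        # Still FREEZING: best-effort SIGKILL to everything in the cgroup.
        # Assumption: cgroup.procs lists one PID per line.
        for pid in (cgroup / "cgroup.procs").read_text().split():
            try:
                kill(int(pid), signal.SIGKILL)
            except ProcessLookupError:
                pass  # the process exited between the read and the kill
        sleep(interval)

    raise TimeoutError(f"{cgroup} not FROZEN after {max_attempts} attempts")
```

Logging the return value on every call would give the per-freeze iteration
count without having to reconstruct it from the logs afterwards.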

On Wed, Oct 3, 2012 at 9:31 AM, Benjamin Hindman <[email protected]> wrote:

> I think the best we can do here is _try_ and send a SIGKILL to all
> processes (in R or T or S or whatever) after we write FROZEN to
> freezer.state and we find out everything is still in FREEZING. We'll
> continue to write FROZEN to freezer.state after the interval AND we'll
> continue to try sending SIGKILL. Hopefully these two mechanisms will
> _eventually_ get everything cleaned up.
>
> How does that sound?
>
>
>
>
> On Sat, Sep 22, 2012 at 10:09 AM, Jie Yu <[email protected]> wrote:
>
> > Also, I don't understand what you mean here? Could you elaborate?
> >
> >
> > If you have two running processes to kill, you cannot send a SIGKILL to
> > them atomically. As a result, one process will (likely) be killed first
> > while the other is still making progress (though only for a very short
> > interval). That may cause unpredictable errors.
> >
> > - Jie
> >
> > On Sat, Sep 22, 2012 at 2:03 AM, Vinod Kone <[email protected]> wrote:
> >
> >> Thanks for digging up the kernel code Jie! It's fascinating.
> >>
> >>
> >>> Could that cause problems if there is more than one process in 'R',
> >>> since the kill is not atomic?
> >>>
> >>>
> >> Also, I don't understand what you mean here? Could you elaborate?
> >>
> >>
> >> Vinod
> >>
> >>
> >>> - Jie
> >>>
> >>> On Fri, Sep 21, 2012 at 9:10 PM, Benjamin Hindman <[email protected]> wrote:
> >>>
> >>>>
> >>>>
> >>>> > On Sept. 21, 2012, 7 p.m., Vinod Kone wrote:
> >>>> > > lgtm. i've a feeling we need to also do a force kill. but we can do
> >>>> > > this after we see how brian's test pans out.
> >>>>
> >>>> I tried just writing FREEZING to the cgroup's freezer.state manually and
> >>>> that didn't seem to work. Meanwhile, I sent a SIGKILL to the process in
> >>>> the cgroup still in R, and that got everything to clean up. So I expect
> >>>> that you're correct, and we'll also need to send explicit SIGKILLs to
> >>>> those processes still in R (in fact, probably just to all processes
> >>>> still in the cgroup). Review incoming.
> >>>>
> >>>>
> >>>> - Benjamin
> >>>>
> >>>>
> >>>> -----------------------------------------------------------
> >>>>
> >>>> This is an automatically generated e-mail. To reply, visit:
> >>>> https://reviews.apache.org/r/7203/#review11794
> >>>>
> >>>> -----------------------------------------------------------
> >>>>
> >>>>
> >>>> On Sept. 21, 2012, 2:02 a.m., Benjamin Hindman wrote:
> >>>> >
> >>>> > -----------------------------------------------------------
> >>>> > This is an automatically generated e-mail. To reply, visit:
> >>>> > https://reviews.apache.org/r/7203/
> >>>> > -----------------------------------------------------------
> >>>> >
> >>>> > (Updated Sept. 21, 2012, 2:02 a.m.)
> >>>>
> >>>> >
> >>>> >
> >>>> > Review request for mesos, Vinod Kone, Brian Wickman, and Jie Yu.
> >>>> >
> >>>> >
> >>>> > Description
> >>>> > -------
> >>>>
> >>>> >
> >>>> > See summary and
> >>>> > http://www.kernel.org/doc/Documentation/cgroups/freezer-subsystem.txt:
> >>>> >
> >>>> > It's important to note that freezing can be incomplete. In that case we
> >>>> > return EBUSY. This means that some tasks in the cgroup are busy doing
> >>>> > something that prevents us from completely freezing the cgroup at this
> >>>> > time. After EBUSY, the cgroup will remain partially frozen -- reflected
> >>>> > by freezer.state reporting "FREEZING" when read. The state will remain
> >>>> > "FREEZING" until one of these things happens:
> >>>> >
> >>>> >       1) Userspace cancels the freezing operation by writing "THAWED" to
> >>>> >               the freezer.state file
> >>>> >       2) Userspace retries the freezing operation by writing "FROZEN" to
> >>>> >               the freezer.state file (writing "FREEZING" is not legal
> >>>> >               and returns EINVAL)
> >>>> >       3) The tasks that blocked the cgroup from entering the "FROZEN"
> >>>> >               state disappear from the cgroup's set of tasks.
> >>>> >
> >>>> >
> >>>> > Diffs
> >>>> > -----
> >>>> >
> >>>> >   src/linux/cgroups.cpp 4efd06e
> >>>> >
> >>>> > Diff: https://reviews.apache.org/r/7203/diff/
> >>>> >
> >>>> >
> >>>> > Testing
> >>>> > -------
> >>>> >
> >>>> >
> >>>> > Thanks,
> >>>> >
> >>>> > Benjamin Hindman
> >>>> >
> >>>> >
> >>>>
> >>>>
> >>>
> >>
> >
>
