That's what led me into reading the code - neither memory.limit_in_bytes nor memory.memsw.limit_in_bytes is ever lowered from the (insanely high) defaults. I know the second conditional is false, so the first must be too, right?
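For reference, here's a rough standalone paraphrase of the guard I mean (the two tests at mem.cpp#L382). It uses std::optional in place of Mesos's Option<T> and invented names, so treat it as a sketch of the logic rather than the real code:

    #include <cstdint>
    #include <iostream>
    #include <optional>
    #include <sys/types.h>

    // Sketch of the two tests guarding the hard-limit write in
    // MemIsolatorProcess::update() -- names and types are illustrative only.
    bool wouldSetHardLimit(const std::optional<pid_t>& executorPid,
                           const std::optional<uint64_t>& currentLimit,
                           uint64_t requestedLimit)
    {
      // Write the hard limit only during initial setup (no executor pid yet)
      // or when raising it; never lower it on a running container, since
      // anonymous memory can't easily be reclaimed.
      return !executorPid.has_value() ||
             (currentLimit.has_value() && requestedLimit > *currentLimit);
    }

    int main()
    {
      // Our situation, as far as I can tell: executor already running and
      // memory.limit_in_bytes still at the kernel's huge default, so a
      // request of e.g. 512 MiB (example figure) trips neither test and
      // nothing gets written.
      std::optional<pid_t> pid = 12345;
      std::optional<uint64_t> current = UINT64_MAX;
      uint64_t requested = 512ULL * 1024 * 1024;

      std::cout << std::boolalpha
                << wouldSetHardLimit(pid, current, requested)  // prints false
                << std::endl;
      return 0;
    }

If I've paraphrased that right, only the pid test can ever let the limit go down, which is why I'm fixated on info->pid.isNone().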
It's likely I'm reading the wrong branch; we're running the 0.21.0 release, but I don't see any commits that would change this ordering. Just to confirm: we are using the default containerizer (not Docker or anything else) - that shouldn't make any difference though, should it?

I'm offsite until morning now (UK time), but I'll post the full slave logs when I can get to them.

On 28 April 2015 at 18:18, Ian Downes <[email protected]> wrote:
> The control flow in the Mesos containerizer to launch a container is:
>
> 1. Call prepare() on each isolator
> 2. Then fork the executor
> 3. Then isolate(executor_pid) on each isolator
>
> The last part of (1) will also call Isolator::update() to set the initial
> memory limits (see line 288). This is done *before* the executor is in the
> cgroup, i.e., info->pid.isNone() will be true and that block of code should
> *always* be executed when a container starts. The LOG(INFO) line at 393
> should be present in your logs. Can you verify this? It should be shortly
> after the LOG(INFO) on line 358.
>
> Ian
>
>
> On Tue, Apr 28, 2015 at 9:54 AM, Dick Davies <[email protected]> wrote:
>>
>> Thanks Ian.
>>
>> Digging around the cgroup, there are 3 processes in there:
>>
>> * the mesos-executor
>> * the shell script Marathon starts the app with
>> * the actual command to run the task (a Perl app in this case)
>>
>> The line of code you mention is never run in our case, because it's
>> wrapped in the conditional I'm talking about!
>>
>> All I see is cpu.shares being set and then memory.soft_limit_in_bytes.
>>
>>
>> On 28 April 2015 at 17:47, Ian Downes <[email protected]> wrote:
>> > The line of code you cite is there so the hard limit is not decreased
>> > on a running container, because we can't (easily) reclaim anonymous
>> > memory from running processes. See the comment above the code.
>> >
>> > The info->pid.isNone() is for when the cgroup is being configured (see
>> > the update() call at the end of MemIsolatorProcess::prepare()), i.e.,
>> > before any processes are added to the cgroup.
>> >
>> > The limit > currentLimit.get() ensures the limit is only increased.
>> >
>> > The memory limit defaults to the maximum for the data type; I guess
>> > that's the ridiculous 8 EB. It should be set to what the initial memory
>> > allocation was for the container, so this is not expected. Can you look
>> > in the slave logs, from when the container was created, for the log
>> > line at:
>> >
>> > https://github.com/apache/mesos/blob/master/src/slave/containerizer/isolators/cgroups/mem.cpp#L393
>> >
>> > Ian
>> >
>> > On Tue, Apr 28, 2015 at 7:42 AM, Dick Davies <[email protected]>
>> > wrote:
>> >>
>> >> Been banging my head against this for a while now.
>> >>
>> >> Mesos 0.21.0, Marathon 0.7.5, CentOS 6 servers.
>> >>
>> >> When I enable cgroups (flags are: --cgroups_limit_swap
>> >> --isolation=cgroups/cpu,cgroups/mem), the memory limits I'm setting
>> >> are reflected in memory.soft_limit_in_bytes but not in
>> >> memory.limit_in_bytes or memory.memsw.limit_in_bytes.
>> >>
>> >> The upshot is our runaway task eats all RAM and swap on the server
>> >> until the OOM killer steps in and starts firing into the crowd.
>> >>
>> >> This line of code seems to never lower a hard limit:
>> >>
>> >> https://github.com/apache/mesos/blob/master/src/slave/containerizer/isolators/cgroups/mem.cpp#L382
>> >>
>> >> which means both of those tests must be true, right?
>> >>
>> >> The current limit is insanely high (8192 PB if I'm reading it right) -
>> >> how would I make info->pid.isNone() be true?
>> >>
>> >> Have tried restarting the slave, scaling the Marathon apps to 0 tasks
>> >> then back. Bit stumped.
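To check my understanding of the ordering Ian describes above (prepare(), then fork, then isolate()), here's a heavily simplified, made-up sketch - not Mesos code, everything in it is invented for illustration - of why the initial update() during prepare() should hit the info->pid.isNone() branch and set the hard limit before the executor ever joins the cgroup:

    #include <cstdint>
    #include <iostream>
    #include <optional>

    // Toy stand-in for the memory isolator; it only mimics the ordering,
    // not the real implementation.
    struct ToyMemIsolator
    {
      std::optional<int> pid;           // executor pid; empty until isolate()
      uint64_t hardLimit = UINT64_MAX;  // stands in for memory.limit_in_bytes

      void update(uint64_t requested)
      {
        // Roughly the same guard as the earlier sketch: set the hard limit
        // during initial setup or when raising it; never lower it afterwards.
        if (!pid.has_value() || requested > hardLimit) {
          hardLimit = requested;
          std::cout << "hard limit set to " << requested << "\n";
        }
        std::cout << "soft limit set to " << requested << "\n";
      }

      void prepare(uint64_t requested)
      {
        // Step 1: prepare() runs before the executor is forked, so pid is
        // still empty and this initial update() takes the hard-limit branch.
        update(requested);
      }

      void isolate(int executorPid)
      {
        // Step 3: the forked executor is placed in the cgroup.
        pid = executorPid;
      }
    };

    int main()
    {
      ToyMemIsolator isolator;
      isolator.prepare(512ULL * 1024 * 1024);  // 1. prepare() (calls update())
      int executorPid = 12345;                 // 2. fork the executor (faked)
      isolator.isolate(executorPid);           // 3. isolate(executor_pid)

      // A later update() that tries to lower the limit on the now-running
      // container leaves the hard limit alone and only touches the soft
      // limit, which matches what we're seeing on our slaves.
      isolator.update(256ULL * 1024 * 1024);
      return 0;
    }

If the real ordering is as above, the hard limit should have been written during prepare(), which is why I'll be hunting for the line-393 LOG(INFO) in the slave logs tomorrow.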

