Re: Short peaks in container memory usage

2016-08-17 Thread Karthik Kambatla
Jan,

As part of YARN-1011 (the oversubscription work), we are looking at better
(faster) ways of monitoring and enforcement, and we are considering putting
all YARN containers under a cgroup with a hard limit, so that YARN as a whole
does not go over a limit but individual containers may run over. The details
are not clear yet, but hopefully that will help you.
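
For anyone who wants to experiment with that layout before YARN-1011 lands,
here is a minimal sketch of the idea, assuming the cgroup v1 memory controller
is mounted at /sys/fs/cgroup/memory and using a hypothetical "hadoop-yarn"
hierarchy; it is not what the NodeManager does today:

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    /**
     * Sketch only: one parent cgroup carries a hard memory limit for all of
     * YARN, while per-container child cgroups stay unlimited. Paths and the
     * "hadoop-yarn" hierarchy name are assumptions, not NodeManager behavior.
     */
    public class YarnPoolCgroupSketch {
      private static final Path POOL = Paths.get("/sys/fs/cgroup/memory/hadoop-yarn");

      static void write(Path file, String value) throws IOException {
        Files.write(file, value.getBytes(StandardCharsets.UTF_8));
      }

      public static void main(String[] args) throws IOException {
        long poolLimitBytes = 96L * 1024 * 1024 * 1024;  // hard limit for YARN as a whole

        // Parent cgroup: the whole pool may never exceed poolLimitBytes
        // (assumes memory.use_hierarchy is enabled, the default on recent kernels).
        Files.createDirectories(POOL);
        write(POOL.resolve("memory.limit_in_bytes"), Long.toString(poolLimitBytes));

        // Per-container child: "-1" means unlimited in cgroup v1, so a single
        // container may spike as long as the pool as a whole still fits.
        Path container = POOL.resolve("container_e01_000001");
        Files.createDirectories(container);
        write(container.resolve("memory.limit_in_bytes"), "-1");

        // Container processes would then be attached by appending their PIDs
        // to <container>/cgroup.procs (omitted here).
      }
    }

The appeal of this layout for the spike problem discussed below is that
enforcement moves to the pool: an individual container can briefly run over
its share without being killed, while the node as a whole stays protected.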

On Tue, Aug 16, 2016 at 12:53 AM, Jan Lukavský  wrote:

> Hi Ravi,
>
> sorry for the late answer. :) We are on Hadoop 2.6-cdh5.7.
>
> Cheers,
>  Jan
>
> On 12.8.2016 01:57, Ravi Prakash wrote:
>
>> Hi Jan!
>>
>> Yes! Makes sense. I'm sure there were bigger changes for the
>> ResourceHandler. Which version are you on?
>>
>> Cheers
>> Ravi
>>
>> On Thu, Aug 11, 2016 at 7:48 AM, Jan Lukavský <jan.lukav...@firma.seznam.cz>
>> wrote:
>>
>> Hi Ravi,
>>
>> I don't think cgroups will help us, because we don't want to
>> impose a hard limit on the memory usage; we just want to allow
>> short time periods when a container can consume more memory than
>> its limit. We don't want to put the limit too high, because that
>> causes underutilization of our cluster, but setting it to a
>> "reasonable" value causes applications to fail (because of random
>> containers being killed because of spikes). That's why we created
>> the time-window averaging resource calculator, and I was trying to
>> find out whether anybody else is having similar kinds of issues.
>> If so, I could contribute our extension (and therefore we will not
>> have to maintain it ourselves in a separate repository :)). The
>> resource calculator is for Hadoop 2.6, and I suppose there might
>> be larger changes around this in later versions?
>>
>> Cheers,
>>  Jan
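
For readers who have not seen the patch, a rough, standalone sketch of what
such a time-window averaging check might look like (this is not the actual
patch; the class name and API are made up):

    import java.util.ArrayDeque;
    import java.util.Deque;

    /**
     * Illustrative sketch: a container is flagged only when its average usage
     * over the last N samples exceeds the limit, so short spikes are tolerated.
     * Not the patch discussed in this thread.
     */
    public class WindowAveragedMemoryCheck {
      private final int windowSize;                  // N = 1 reduces to the usual instantaneous check
      private final Deque<Long> samples = new ArrayDeque<>();
      private long sum;

      public WindowAveragedMemoryCheck(int windowSize) {
        this.windowSize = windowSize;
      }

      /** Record the latest usage sample and report whether the container should be killed. */
      public boolean shouldKill(long usageBytes, long limitBytes) {
        samples.addLast(usageBytes);
        sum += usageBytes;
        if (samples.size() > windowSize) {
          sum -= samples.removeFirst();              // slide the window
        }
        double average = (double) sum / samples.size();
        return average > limitBytes;
      }
    }

With the NodeManager's default monitoring interval of roughly 3 seconds, a
5-minute window corresponds to about 100 samples.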
>>
>> On 10.8.2016 19:23, Ravi Prakash wrote:
>>
>>> Hi Jan!
>>>
>>> Thanks for your explanation. I'm glad that works for you! :-)
>>> https://issues.apache.org/jira/browse/YARN-5202 is something
>>> that Yahoo! talked about at the Hadoop Summit (and it seems the
>>> community may be going in a similar direction, although not
>>> exactly the same). There's also
>>> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsHandler.java
>>> . Ideally, at my company we'd like memory limits to also be
>>> imposed by cgroups, because we have had the OOM-killer wreak
>>> havoc a couple of times, but from what I know, that is not an
>>> option yet.
>>>
>>> Cheers
>>> Ravi
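
As a side note, even without enforcement the cgroup files are a cheap place to
read per-container usage from, compared to walking /proc/<pid>/smaps; a small
sketch, assuming cgroup v1 file names and a "hadoop-yarn" hierarchy (this is
not the CGroupsHandler API):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    /**
     * Sketch: read a container's current and peak memory charge from the
     * cgroup v1 memory controller instead of aggregating procfs. The paths
     * and hierarchy name are assumptions.
     */
    public class CgroupMemoryReader {
      private final Path containerCgroup;

      public CgroupMemoryReader(String containerId) {
        this.containerCgroup = Paths.get("/sys/fs/cgroup/memory/hadoop-yarn", containerId);
      }

      private long readLong(String file) throws IOException {
        return Long.parseLong(Files.readAllLines(containerCgroup.resolve(file)).get(0).trim());
      }

      public long currentUsageBytes() throws IOException {
        return readLong("memory.usage_in_bytes");      // current charge
      }

      public long peakUsageBytes() throws IOException {
        return readLong("memory.max_usage_in_bytes");  // high-water mark
      }
    }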
>>>
>>> On Wed, Aug 10, 2016 at 1:54 AM, Jan Lukavský wrote:
>>>
>>> Hi Ravi,
>>>
>>> we don't run into the situation where memory used > RAM, because
>>> the memory configured to be used by all containers on a node is
>>> less than the total amount of memory (by a factor of, say,
>>> 10%). The spikes of container memory usage that are
>>> tolerated due to the averaging don't happen on all containers
>>> at once, but are more random in nature, so mostly
>>> only a single running container "spikes" at a time, which
>>> doesn't cause any issues. To fully answer your question, we
>>> have overcommit enabled and therefore, if we ran out of
>>> memory, bad things would happen. :) We are aware of that. The
>>> risk of running into the OOM-killer can be controlled by the
>>> averaging window length - as the length grows, more and
>>> more spikes are tolerated. Setting the averaging window
>>> length to 1 switches this feature off, turning it back into
>>> the "standard" behavior, which is why I see it as an extension
>>> of the current approach, which could be interesting to other
>>> people as well.
>>>
>>>   Jan
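
To make the effect of the window length concrete: if a container normally sits
at a baseline b below its limit L, a single-sample spike s is tolerated by a
window of N samples as long as ((N-1)*b + s)/N <= L, i.e.
s <= L + (N-1)*(L - b). A tiny sketch of that arithmetic, with made-up numbers:

    /** Made-up numbers showing how the window length bounds the tolerated spike. */
    public class SpikeToleranceExample {
      public static void main(String[] args) {
        long limit = 4L << 30;      // L = 4 GiB container limit
        long baseline = 3L << 30;   // b = 3 GiB typical usage
        for (int n : new int[] {1, 10, 100}) {
          // Largest single-sample spike whose window average still stays <= L.
          long maxSpike = limit + (long) (n - 1) * (limit - baseline);
          System.out.printf("window=%3d samples -> single-sample spike tolerated up to %d GiB%n",
              n, maxSpike >> 30);
        }
      }
    }

A window of 1 tolerates nothing above the limit (the standard behavior); a
window of 100 samples would let this container spike to roughly 103 GiB for one
sample, which is why the window length is the knob that trades spike tolerance
against OOM risk.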
>>>
>>>
>>> On 10.8.2016 02:48, Ravi Prakash wrote:
>>>
 Hi Jan!

 Thanks for your contribution. In your approach, what happens
 when a few containers on a node are using "excessive" memory
 (so that total memory used > RAM available on the machine)?
 Do you have overcommit enabled?

 Thanks
 Ravi

 On Tue, Aug 9, 2016 at 1:31 AM, Jan Lukavský wrote:

 Hello 

Re: Short peaks in container memory usage

2016-08-16 Thread Jan Lukavský

Hi Ravi,

sorry for the late answer. :) We are on Hadoop 2.6-cdh5.7.

Cheers,
 Jan

On 12.8.2016 01:57, Ravi Prakash wrote:

Hi Jan!

Yes! Makes sense. I'm sure there were bigger changes for the 
ResourceHandler. Which version are you on?


Cheers
Ravi

On Thu, Aug 11, 2016 at 7:48 AM, Jan Lukavský wrote:


Hi Ravi,

I don't think cgroups will help us, because we don't want to impose a
hard limit on the memory usage; we just want to allow short time
periods when a container can consume more memory than its limit. We
don't want to put the limit too high, because that causes
underutilization of our cluster, but setting it to a "reasonable" value
causes applications to fail (because of random containers being killed
because of spikes). That's why we created the time-window averaging
resource calculator, and I was trying to find out whether anybody else
is having similar kinds of issues. If so, I could contribute our
extension (and therefore we will not have to maintain it ourselves in a
separate repository :)). The resource calculator is for Hadoop 2.6, and
I suppose there might be larger changes around this in later versions?

Cheers,
 Jan

On 10.8.2016 19:23, Ravi Prakash wrote:

Hi Jan!

Thanks for your explanation. I'm glad that works for you! :-)
https://issues.apache.org/jira/browse/YARN-5202 is something that
Yahoo! talked about at the Hadoop Summit (and it seems the community
may be going in a similar direction, although not exactly the same).
There's also
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsHandler.java
. Ideally, at my company we'd like memory limits to also be imposed by
cgroups, because we have had the OOM-killer wreak havoc a couple of
times, but from what I know, that is not an option yet.

Cheers
Ravi

On Wed, Aug 10, 2016 at 1:54 AM, Jan Lukavský wrote:

Hi Ravi,

we don't run into the situation where memory used > RAM, because the
memory configured to be used by all containers on a node is less than
the total amount of memory (by a factor of, say, 10%). The spikes of
container memory usage that are tolerated due to the averaging don't
happen on all containers at once, but are more random in nature, so
mostly only a single running container "spikes" at a time, which
doesn't cause any issues. To fully answer your question, we have
overcommit enabled and therefore, if we ran out of memory, bad things
would happen. :) We are aware of that. The risk of running into the
OOM-killer can be controlled by the averaging window length - as the
length grows, more and more spikes are tolerated. Setting the averaging
window length to 1 switches this feature off, turning it back into the
"standard" behavior, which is why I see it as an extension of the
current approach, which could be interesting to other people as well.

  Jan


On 10.8.2016 02:48, Ravi Prakash wrote:

Hi Jan!

Thanks for your contribution. In your approach, what happens
when a few containers on a node are using "excessive" memory
(so that total memory used > RAM available on the machine)?
Do you have overcommit enabled?

Thanks
Ravi

On Tue, Aug 9, 2016 at 1:31 AM, Jan Lukavský wrote:

Hello community,

I have a question about container resource calculation in the
nodemanager. Some time ago I filed the JIRA
https://issues.apache.org/jira/browse/YARN-4681, which I thought
might address our problems with containers being killed because of
read-only mmapping of memory blocks. The JIRA has not been resolved
yet, but it turned out for us that the patch doesn't solve the
problem. Some applications (namely Apache Spark) tend to allocate
really large memory blocks outside the JVM heap (using mmap, but
with MAP_PRIVATE), but only for short time periods. We solved this
by creating a smoothing resource calculator, which

Re: Short peaks in container memory usage

2016-08-12 Thread Ravi Prakash
Hi Jan!

Yes! Makes sense. I'm sure there were bigger changes for the
ResourceHandler. Which version are you on?

Cheers
Ravi

On Thu, Aug 11, 2016 at 7:48 AM, Jan Lukavský 
wrote:

> Hi Ravi,
>
> I don't think cgroups will help us, because we don't want to impose a
> hard limit on the memory usage; we just want to allow short time periods
> when a container can consume more memory than its limit. We don't want to
> put the limit too high, because that causes underutilization of our
> cluster, but setting it to a "reasonable" value causes applications to fail
> (because of random containers being killed because of spikes). That's why we
> created the time-window averaging resource calculator, and I was trying to
> find out whether anybody else is having similar kinds of issues. If so, I
> could contribute our extension (and therefore we will not have to maintain
> it ourselves in a separate repository :)). The resource calculator is for
> Hadoop 2.6, and I suppose there might be larger changes around this in
> later versions?
>
> Cheers,
>  Jan
>
> On 10.8.2016 19:23, Ravi Prakash wrote:
>
> Hi Jan!
>
> Thanks for your explanation. I'm glad that works for you! :-)
> https://issues.apache.org/jira/browse/YARN-5202 is something that Yahoo!
> talked about at the Hadoop Summit (and it seems the community may be going
> in a similar direction, although not exactly the same). There's also
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsHandler.java .
> Ideally, at my company we'd like memory limits to also be imposed by
> cgroups, because we have had the OOM-killer wreak havoc a couple of times,
> but from what I know, that is not an option yet.
>
> Cheers
> Ravi
>
> On Wed, Aug 10, 2016 at 1:54 AM, Jan Lukavský <
> jan.lukav...@firma.seznam.cz> wrote:
>
>> Hi Ravi,
>>
>> we don't run into the situation where memory used > RAM, because the memory
>> configured to be used by all containers on a node is less than the total
>> amount of memory (by a factor of, say, 10%). The spikes of container memory
>> usage that are tolerated due to the averaging don't happen on all
>> containers at once, but are more random in nature, so mostly only a single
>> running container "spikes" at a time, which doesn't cause any
>> issues. To fully answer your question, we have overcommit enabled and
>> therefore, if we ran out of memory, bad things would happen. :) We
>> are aware of that. The risk of running into the OOM-killer can be controlled
>> by the averaging window length - as the length grows, more and more spikes
>> are tolerated. Setting the averaging window length to 1 switches this
>> feature off, turning it back into the "standard" behavior, which is why I
>> see it as an extension of the current approach, which could be interesting
>> to other people as well.
>>
>>   Jan
>>
>>
>> On 10.8.2016 02:48, Ravi Prakash wrote:
>>
>> Hi Jan!
>>
>> Thanks for your contribution. In your approach, what happens when a few
>> containers on a node are using "excessive" memory (so that total memory
>> used > RAM available on the machine)? Do you have overcommit enabled?
>>
>> Thanks
>> Ravi
>>
>> On Tue, Aug 9, 2016 at 1:31 AM, Jan Lukavský <
>> jan.lukav...@firma.seznam.cz> wrote:
>>
>>> Hello community,
>>>
>>> I have a question about container resource calculation in the
>>> nodemanager. Some time ago I filed the JIRA
>>> https://issues.apache.org/jira/browse/YARN-4681, which I thought might
>>> address our problems with containers being killed because of read-only
>>> mmapping of memory blocks. The JIRA has not been resolved yet, but it
>>> turned out for us that the patch doesn't solve the problem. Some
>>> applications (namely Apache Spark) tend to allocate really large memory
>>> blocks outside the JVM heap (using mmap, but with MAP_PRIVATE), but only
>>> for short time periods. We solved this by creating a smoothing resource
>>> calculator, which averages the memory usage of a container over some time
>>> period (say 5 minutes). This eliminates the problem of a container being
>>> killed for a short memory consumption peak, but at the same time preserves
>>> the ability to kill a container that *really* consumes an excessive
>>> amount of memory.
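
A self-contained snippet that reproduces the kind of short-lived peak described
above may help make this concrete; it briefly touches a large copy-on-write
(MAP_PRIVATE) mapping and then drops it (the size, temp-file location and sleep
are arbitrary):

    import java.io.File;
    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    /**
     * Reproduces a short memory peak: a large MAP_PRIVATE (copy-on-write)
     * mapping is touched for a few seconds and then dropped. An instantaneous
     * check may sample the process during the peak and flag it; a 5-minute
     * averaged check would not, unless the peak persists.
     */
    public class ShortMmapSpike {
      public static void main(String[] args) throws Exception {
        int size = 1 << 30;                           // 1 GiB (map() caps one region at 2 GiB)
        File f = File.createTempFile("mmap-spike", ".bin");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw");
             FileChannel ch = raf.getChannel()) {
          raf.setLength(size);
          // PRIVATE mapping: written pages are copy-on-write and charged to this process.
          MappedByteBuffer buf = ch.map(FileChannel.MapMode.PRIVATE, 0, size);
          for (int pos = 0; pos < size; pos += 4096) {
            buf.put(pos, (byte) 1);                   // touch one byte per page
          }
          Thread.sleep(10_000);                       // the "short peak" a monitor may sample
        }
        // Once the buffer is unreachable the mapping is eventually released and
        // the resident usage drops back down.
      }
    }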
>>>
>>> My question is: does this seem like a systematic approach to you, and
>>> should I post our patch to the community, or am I thinking in the wrong
>>> direction from the beginning? :)
>>>
>>>
>>> Thanks for reactions,
>>>
>>>  Jan
>>>
>>>
>>> -
>>> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
>>> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>>>
>>>
>>
>>
>
>
> --
>
> Jan Lukavský
> Development Team Lead
> Seznam.cz, a.s.
> Radlická 3294/10
> 15000, Praha 5
> jan.lukavsky@firma.seznam.cz
> http://www.seznam.cz
>
>


Re: Short peaks in container memory usage

2016-08-11 Thread Jan Lukavský

Hi Ravi,

I don't think cgroups will help us, because we don't want to impose a
hard limit on the memory usage; we just want to allow short time
periods when a container can consume more memory than its limit. We don't
want to put the limit too high, because that causes underutilization of
our cluster, but setting it to a "reasonable" value causes applications to
fail (because of random containers being killed because of spikes). That's
why we created the time-window averaging resource calculator, and I was
trying to find out whether anybody else is having similar kinds of issues.
If so, I could contribute our extension (and therefore we will not have to
maintain it ourselves in a separate repository :)). The resource
calculator is for Hadoop 2.6, and I suppose there might be larger
changes around this in later versions?


Cheers,
 Jan

On 10.8.2016 19:23, Ravi Prakash wrote:

Hi Jan!

Thanks for your explanation. I'm glad that works for you! :-)
https://issues.apache.org/jira/browse/YARN-5202 is something that
Yahoo! talked about at the Hadoop Summit (and it seems the community
may be going in a similar direction, although not exactly the same).
There's also
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsHandler.java
. Ideally, at my company we'd like memory limits to also be imposed by
cgroups, because we have had the OOM-killer wreak havoc a couple of
times, but from what I know, that is not an option yet.


Cheers
Ravi

On Wed, Aug 10, 2016 at 1:54 AM, Jan Lukavský wrote:


Hi Ravi,

we don't run into the situation where memory used > RAM, because
the memory configured to be used by all containers on a node is less
than the total amount of memory (by a factor of, say, 10%). The
spikes of container memory usage that are tolerated due to the
averaging don't happen on all containers at once, but are more random
in nature, so mostly only a single running container "spikes" at a
time, which doesn't cause any issues. To fully answer your question,
we have overcommit enabled and therefore, if we ran out of memory,
bad things would happen. :) We are aware of that. The risk of running
into the OOM-killer can be controlled by the averaging window length -
as the length grows, more and more spikes are tolerated. Setting the
averaging window length to 1 switches this feature off, turning it back
into the "standard" behavior, which is why I see it as an extension of
the current approach, which could be interesting to other people
as well.

  Jan


On 10.8.2016 02:48, Ravi Prakash wrote:

Hi Jan!

Thanks for your contribution. In your approach, what happens when
a few containers on a node are using "excessive" memory (so that
total memory used > RAM available on the machine)? Do you have
overcommit enabled?

Thanks
Ravi

On Tue, Aug 9, 2016 at 1:31 AM, Jan Lukavský wrote:

Hello community,

I have a question about container resource calculation in
the nodemanager. Some time ago I filed the JIRA
https://issues.apache.org/jira/browse/YARN-4681, which I
thought might address our problems with containers being killed
because of read-only mmapping of memory blocks. The JIRA has not
been resolved yet, but it turned out for us that the patch
doesn't solve the problem. Some applications (namely Apache
Spark) tend to allocate really large memory blocks outside
the JVM heap (using mmap, but with MAP_PRIVATE), but only for
short time periods. We solved this by creating a smoothing
resource calculator, which averages the memory usage of a
container over some time period (say 5 minutes). This
eliminates the problem of a container being killed for a short
memory consumption peak, but at the same time preserves the
ability to kill a container that *really* consumes an excessive
amount of memory.

My question is: does this seem like a systematic approach to you,
and should I post our patch to the community, or am I thinking
in the wrong direction from the beginning? :)


Thanks for reactions,

 Jan


-
To unsubscribe, e-mail:
yarn-dev-unsubscr...@hadoop.apache.org

For additional commands, e-mail:
yarn-dev-h...@hadoop.apache.org









--

Jan Lukavský
Development Team Lead
Seznam.cz, a.s.

Re: Short peaks in container memory usage

2016-08-10 Thread Ravi Prakash
Hi Jan!

Thanks for your explanation. I'm glad that works for you! :-)
https://issues.apache.org/jira/browse/YARN-5202 is something that Yahoo!
talked about at the Hadoop Summit (and it seems the community may be going
in a similar direction, although not exactly the same). There's also
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsHandler.java
. Ideally, at my company we'd like memory limits to also be imposed by
cgroups, because we have had the OOM-killer wreak havoc a couple of times,
but from what I know, that is not an option yet.

Cheers
Ravi

On Wed, Aug 10, 2016 at 1:54 AM, Jan Lukavský 
wrote:

> Hi Ravi,
>
> we don't run into the situation where memory used > RAM, because the memory
> configured to be used by all containers on a node is less than the total
> amount of memory (by a factor of, say, 10%). The spikes of container memory
> usage that are tolerated due to the averaging don't happen on all
> containers at once, but are more random in nature, so mostly only a single
> running container "spikes" at a time, which doesn't cause any
> issues. To fully answer your question, we have overcommit enabled and
> therefore, if we ran out of memory, bad things would happen. :) We
> are aware of that. The risk of running into the OOM-killer can be controlled
> by the averaging window length - as the length grows, more and more spikes
> are tolerated. Setting the averaging window length to 1 switches this
> feature off, turning it back into the "standard" behavior, which is why I
> see it as an extension of the current approach, which could be interesting
> to other people as well.
>
>   Jan
>
>
> On 10.8.2016 02:48, Ravi Prakash wrote:
>
> Hi Jan!
>
> Thanks for your contribution. In your approach, what happens when a few
> containers on a node are using "excessive" memory (so that total memory
> used > RAM available on the machine)? Do you have overcommit enabled?
>
> Thanks
> Ravi
>
> On Tue, Aug 9, 2016 at 1:31 AM, Jan Lukavský wrote:
>
>> Hello community,
>>
>> I have a question about container resource calculation in the nodemanager.
>> Some time ago I filed the JIRA
>> https://issues.apache.org/jira/browse/YARN-4681, which I thought might
>> address our problems with containers being killed because of read-only
>> mmapping of memory blocks. The JIRA has not been resolved yet, but it turned
>> out for us that the patch doesn't solve the problem. Some applications
>> (namely Apache Spark) tend to allocate really large memory blocks outside
>> the JVM heap (using mmap, but with MAP_PRIVATE), but only for short time
>> periods. We solved this by creating a smoothing resource calculator, which
>> averages the memory usage of a container over some time period (say 5
>> minutes). This eliminates the problem of a container being killed for a
>> short memory consumption peak, but at the same time preserves the ability
>> to kill a container that *really* consumes an excessive amount of memory.
>>
>> My question is: does this seem like a systematic approach to you, and should
>> I post our patch to the community, or am I thinking in the wrong direction
>> from the beginning? :)
>>
>>
>> Thanks for reactions,
>>
>>  Jan
>>
>>
>> -
>> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>>
>>
>
>


Re: Short peaks in container memory usage

2016-08-10 Thread Jan Lukavský

Hi Ravi,

we don't run into the situation where memory used > RAM, because the memory
configured to be used by all containers on a node is less than the total
amount of memory (by a factor of, say, 10%). The spikes of container
memory usage that are tolerated due to the averaging don't happen on
all containers at once, but are more random in nature, so
mostly only a single running container "spikes" at a time, which doesn't
cause any issues. To fully answer your question, we have overcommit
enabled and therefore, if we ran out of memory, bad things would
happen. :) We are aware of that. The risk of running into the OOM-killer
can be controlled by the averaging window length - as the length grows,
more and more spikes are tolerated. Setting the averaging window length
to 1 switches this feature off, turning it back into the "standard"
behavior, which is why I see it as an extension of the current approach,
which could be interesting to other people as well.


  Jan

On 10.8.2016 02:48, Ravi Prakash wrote:

Hi Jan!

Thanks for your contribution. In your approach, what happens when a few
containers on a node are using "excessive" memory (so that total
memory used > RAM available on the machine)? Do you have overcommit
enabled?


Thanks
Ravi

On Tue, Aug 9, 2016 at 1:31 AM, Jan Lukavský wrote:


Hello community,

I have a question about container resource calculation in
the nodemanager. Some time ago I filed the JIRA
https://issues.apache.org/jira/browse/YARN-4681, which I thought
might address our problems with containers being killed because of
read-only mmapping of memory blocks. The JIRA has not been resolved
yet, but it turned out for us that the patch doesn't solve the
problem. Some applications (namely Apache Spark) tend to allocate
really large memory blocks outside the JVM heap (using mmap, but with
MAP_PRIVATE), but only for short time periods. We solved this by
creating a smoothing resource calculator, which averages the
memory usage of a container over some time period (say 5 minutes).
This eliminates the problem of a container being killed for a short
memory consumption peak, but at the same time preserves the
ability to kill a container that *really* consumes an excessive amount
of memory.

My question is: does this seem like a systematic approach to you and
should I post our patch to the community, or am I thinking in the wrong
direction from the beginning? :)


Thanks for reactions,

 Jan


-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org

For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org







Re: Short peaks in container memory usage

2016-08-09 Thread Ravi Prakash
Hi Jan!

Thanks for your contribution. In your approach, what happens when a few
containers on a node are using "excessive" memory (so that total memory
used > RAM available on the machine)? Do you have overcommit enabled?

Thanks
Ravi

On Tue, Aug 9, 2016 at 1:31 AM, Jan Lukavský 
wrote:

> Hello community,
>
> I have a question about container resource calculation in the nodemanager.
> Some time ago I filed the JIRA https://issues.apache.org/jira/browse/YARN-4681,
> which I thought might address our problems with containers being killed
> because of read-only mmapping of memory blocks. The JIRA has not been resolved
> yet, but it turned out for us that the patch doesn't solve the problem.
> Some applications (namely Apache Spark) tend to allocate really large
> memory blocks outside the JVM heap (using mmap, but with MAP_PRIVATE), but
> only for short time periods. We solved this by creating a smoothing resource
> calculator, which averages the memory usage of a container over some time
> period (say 5 minutes). This eliminates the problem of a container being
> killed for a short memory consumption peak, but at the same time preserves
> the ability to kill a container that *really* consumes an excessive amount
> of memory.
>
> My question is: does this seem like a systematic approach to you, and should
> I post our patch to the community, or am I thinking in the wrong direction
> from the beginning? :)
>
>
> Thanks for reactions,
>
>  Jan
>
>
> -
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>
>