1. An easy way to make a container exceed its configured physical memory
limit is to set the container's Heap Size (500MB)
above the Container Size (100MB).

yarn-site.xml:   yarn.scheduler.minimum-allocation-mb   100
mapred-site.xml: yarn.app.mapreduce.am.resource.mb      100
                 yarn.app.mapreduce.am.command-opts     -Xmx500m

Note: This is only for testing purposes. Normally the Heap Size should be
about 80% of the Container Size.
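
For reference, a sketch of how those settings might look as XML properties
(same values as above; the files are assumed to be your cluster's standard
yarn-site.xml and mapred-site.xml):

yarn-site.xml:

    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>100</value>
    </property>

mapred-site.xml:

    <property>
        <name>yarn.app.mapreduce.am.resource.mb</name>
        <value>100</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.command-opts</name>
        <value>-Xmx500m</value>
    </property>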

2. There is no job setting that by itself increases the memory usage of a
container; that depends on the application code.
Try adding memory-intensive code inside the MapReduce application (see the
links and the sketch below).

https://alvinalexander.com/blog/post/java/java-program-consume-all-memory-ram-on-computer
https://github.com/apache/hadoop/blob/a55d6bba71c81c1c4e9d8cd11f55c78f10a548b0/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/QuasiMonteCarlo.java
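
For example, a minimal sketch of a memory-hungry Mapper, along the lines of
the first link above (the class and field names are made up for illustration
and are not part of the Hadoop examples; run it over any small input file):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper that keeps allocating memory on every input record.
public class MemoryHogMapper
    extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

  // Keep strong references so the allocations are never garbage collected.
  private final List<byte[]> hog = new ArrayList<>();

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Allocate 10 MB per record until the JVM heap or the container's
    // pmem limit (whichever comes first) is exhausted.
    hog.add(new byte[10 * 1024 * 1024]);
    context.progress(); // report progress so the task is not killed as hung
  }
}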

Running the pi job with a very large number of samples will also require a
lot of memory.

yarn jar
/opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.1.jar pi 1
1000000000

There is a chance that the JVM crashes with an OutOfMemoryError before YARN
kills the container for exceeding its memory limit.


On Thu, Aug 15, 2019 at 10:51 PM . . <writeme...@googlemail.com> wrote:

>
> Prabhu,
>
> I reformulate my question:
>
> I successfully run following job: yarn jar
> /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.1.jar pi 3
> 10
>
> and noticed that the highest node physical memory usage was always <512MB
> for the duration of the job, and the job completed (see details below)
>
> quote....
> Every 2.0s: yarn node -status hadoop-1.mydomain.local:44718
>
> 19/08/15 19:10:54 INFO client.RMProxy: Connecting to ResourceManager at
> hadoop-1.mydomain.local/192.168.100.11:8032
> Node Report :
>         Node-Id : hadoop-1.mydomain.local:44718
>         Rack : /default-rack
>         Node-State : RUNNING
>         Node-Http-Address : hadoop-1.mydomain.local:8042
>         Last-Health-Update : Thu 15/Aug/19 07:10:22:75CEST
>         Health-Report :
>         Containers : 2
>         Memory-Used : 2048MB
>         Memory-Capacity : 5120MB
>         CPU-Used : 2 vcores
>         CPU-Capacity : 6 vcores
>         Node-Labels :
>         Resource Utilization by Node : *PMem:471 MB*, VMem:1413 MB,
> VCores:0.80463576
>         Resource Utilization by Containers : PMem:110 MB, VMem:4014 MB,
> VCores:0.97300005
> ...unquote
>
> My question is: which job setting can I use to force node physical
> memory usage >512MB and force a job kill due to (or thanks to) the pmem check?
> Hope the above explains my question better ;)
>
> thanks/Guido
>
>
> On Thu, Aug 15, 2019 at 5:09 PM Prabhu Josephraj
> <pjos...@cloudera.com.invalid> wrote:
>
>> Jeff, the available node size for YARN is the value of
>> yarn.nodemanager.resource.memory-mb, which is set to ten times the 512MB of
>> physical memory.
>>
>> Guido, I did not get the below question; can you explain it?
>>
>>        Are you aware of any job syntax to tune the 'container physical
>> memory usage' to 'force' job kill/log?
>>
>>
>> On Thu, Aug 15, 2019 at 7:20 PM . . <writeme...@googlemail.com.invalid>
>> wrote:
>>
>>> Hi Prabhu,
>>>
>>> thanks for your explanation. It makes sense, but I am surprised that YARN
>>> allows you to define 'yarn.nodemanager.resource.memory-mb' higher than the
>>> node physical memory without logging any entry in the resourcemanager log.
>>>
>>> Are you aware of any job syntax to tune the 'container physical memory
>>> usage' to 'force' job kill/log?
>>>
>>> thanks/Guido
>>>
>>>
>>>
>>> On Thu, Aug 15, 2019 at 1:50 PM Prabhu Josephraj
>>> <pjos...@cloudera.com.invalid> wrote:
>>>
>>>> YARN allocates based on the configuration
>>>> (yarn.nodemanager.resource.memory-mb) the user has configured. It allocated
>>>> the AM Container of size 1536MB because it fits within the 5120MB Available
>>>> Node Size.
>>>>
>>>> yarn.nodemanager.pmem-check-enabled will kill the container if the
>>>> physical memory usage of the container process goes above
>>>> 1536MB. The MR ApplicationMaster for a pi job is lightweight; it does not
>>>> require that much memory, so it did not get killed.
>>>>
>>>>
>>>>
>>>> On Thu, Aug 15, 2019 at 4:02 PM . . <writeme...@googlemail.com.invalid>
>>>> wrote:
>>>>
>>>>> Correct: I set 'yarn.nodemanager.resource.memory-mb' to ten times the
>>>>> node physical memory (512MB) and I was able to successfully execute a 'pi
>>>>> 1 10' mapreduce job.
>>>>>
>>>>> Since the default 'yarn.app.mapreduce.am.resource.mb' value is 1536MB, I
>>>>> expected the job to never start / never be allocated, and I have no valid
>>>>> explanation.
>>>>>
>>>>>
>>>>>> On Wed, Aug 14, 2019 at 8:31 PM Jeff Hubbs <jhubbsl...@att.net>
>>>>>> wrote:
>>>>>>
>>>>>>> To make sure I understand...you've allocated *ten times* your
>>>>>>> physical RAM for containers? If so, I think that's your issue.
>>>>>>>
>>>>>>> For reference, under Hadoop 3.x I didn't have a cluster that would
>>>>>>> really do anything until its worker nodes had at least 8GiB.
>>>>>>>
>>>>>>> On 8/14/19 12:10 PM, . . wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I installed a basic 3-node Hadoop 2.9.1 cluster and am playing with
>>>>>>> YARN settings.
>>>>>>> The 3 nodes have the following configuration:
>>>>>>> 1 CPU / 1 core / 512MB RAM
>>>>>>>
>>>>>>> I am surprised that I was able to configure yarn-site.xml with the
>>>>>>> following settings (higher than the node's physical limits) and still
>>>>>>> successfully run a mapreduce 'pi 1 10' job
>>>>>>>
>>>>>>> quote...
>>>>>>>     <property>
>>>>>>>         <name>yarn.resourcemanager.scheduler.class</name>
>>>>>>>         <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
>>>>>>>     </property>
>>>>>>>
>>>>>>>     <property>
>>>>>>>         <name>yarn.nodemanager.resource.memory-mb</name>
>>>>>>>         <value>5120</value>
>>>>>>>         <description>Amount of physical memory, in MB, that can be
>>>>>>>         allocated for containers. If set to -1 and
>>>>>>>         yarn.nodemanager.resource.detect-hardware-capabilities is true,
>>>>>>>         it is automatically calculated. In other cases, the default is
>>>>>>>         8192MB</description>
>>>>>>>     </property>
>>>>>>>
>>>>>>>     <property>
>>>>>>>         <name>yarn.nodemanager.resource.cpu-vcores</name>
>>>>>>>         <value>6</value>
>>>>>>>         <description>Number of CPU cores that can be allocated for
>>>>>>>         containers.</description>
>>>>>>>     </property>
>>>>>>> ...unquote
>>>>>>>
>>>>>>> Can anyone provide an explanation please?
>>>>>>>
>>>>>>> Shouldn't the 'yarn.nodemanager.vmem-check-enabled' and
>>>>>>> 'yarn.nodemanager.pmem-check-enabled' properties (set to 'true' by
>>>>>>> default) detect that my YARN settings are higher than the physical
>>>>>>> limits?
>>>>>>>
>>>>>>> Which mapreduce 'pi' job settings can I use to 'force' containers
>>>>>>> to use more than the node's physical resources?
>>>>>>>
>>>>>>> Many thanks in advance!
>>>>>>> Guido
>>>>>>>
>>>>>>>
>>>>>>>
