Hello Ilya,

Looks like the Vmem usage is going above the 2.1 x Pmem limit, and that is why the container is getting killed:

    1.0 GB of 1 GB physical memory used; *3.4 GB of 2.1 GB virtual memory used*

By default the Vmem limit is set to 2.1 times the Pmem. Looks like your job is taking 3.4 GB! You can change the ratio by setting yarn.nodemanager.vmem-pmem-ratio in yarn-site.xml, or you can optionally disable this check altogether by setting yarn.nodemanager.vmem-check-enabled to false.
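For reference, a minimal yarn-site.xml sketch with those two properties (the property names are the standard YARN ones; 2.1 is the default ratio, and the values shown here are only illustrative, not a recommendation):

    <!-- yarn-site.xml (illustrative values) -->
    <property>
      <!-- Virtual memory allowed per container, as a multiple of its physical memory -->
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>3.5</value>
    </property>
    <property>
      <!-- Or switch the virtual memory check off entirely -->
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>false</value>
    </property>

Keep in mind that disabling the check only hides the symptom; raising the ratio, or finding out what is holding the extra virtual memory, is usually the safer route.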
Thanks,
-Manoj

On Wed, Sep 23, 2015 at 12:36 AM, Ilya Karpov <[email protected]> wrote:

> Great thanks for your reply!
>
> > 1. Which version of Hadoop/YARN?
> Hadoop (command: hadoop version):
> Hadoop 2.6.0-cdh5.4.5
> Subversion http://github.com/cloudera/hadoop -r ab14c89fe25e9fb3f9de4fb852c21365b7c5608b
> Compiled by jenkins on 2015-08-12T21:11Z
> Compiled with protoc 2.5.0
> From source with checksum d31cb7e46b8602edaf68d335b785ab
> This command was run using /opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/jars/hadoop-common-2.6.0-cdh5.4.5.jar
> Yarn (command: yarn version) prints exactly the same.
>
> > 2. From the logs is it getting killed due to over usage of Vmem or physical memory?
> Because of over usage of physical memory. Last seconds of life:
>
> 2015-09-21 22:50:34,017 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 13982 for container-id container_1442402147223_0165_01_000001: 1.0 GB of 1 GB physical memory used; 3.4 GB of 2.1 GB virtual memory used
> 2015-09-21 22:50:34,017 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Process tree for container: container_1442402147223_0165_01_000001 has processes older than 1 iteration running over the configured limit. Limit=1073741824, current usage = 1074352128
> 2015-09-21 22:50:34,018 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Container [pid=13982,containerID=container_1442402147223_0165_01_000001] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 3.4 GB of 2.1 GB virtual memory used. Killing container.
> Dump of the process-tree for container_1442402147223_0165_01_000001:
> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> |- 13994 13982 13982 13982 (java) 4285 714 3602911232 261607 /opt/jdk1.8.0_60/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1442402147223_0165/container_1442402147223_0165_01_000001 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Djava.net.preferIPv4Stack=true -Xmx825955249 org.apache.hadoop.mapreduce.v2.app.MRAppMaster
> |- 13982 13980 13982 13982 (bash) 0 0 14020608 686 /bin/bash -c /opt/jdk1.8.0_60/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1442402147223_0165/container_1442402147223_0165_01_000001 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Djava.net.preferIPv4Stack=true -Xmx825955249 org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/var/log/hadoop-yarn/container/application_1442402147223_0165/container_1442402147223_0165_01_000001/stdout 2>/var/log/hadoop-yarn/container/application_1442402147223_0165/container_1442402147223_0165_01_000001/stderr
>
> 2015-09-21 22:50:34,018 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Removed ProcessTree with root 13982
> 2015-09-21 22:50:34,025 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1442402147223_0165_01_000001 transitioned from RUNNING to KILLING
> 2015-09-21 22:50:34,025 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1442402147223_0165_01_000001
> 2015-09-21 22:50:34,075 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1442402147223_0165_01_000001 is : 143
>
> > 3. Can you run "jmap -histo -F <PID of AM container>" and share the heap dump result?
> I'll try to do it asap.
>
> > 4. If possible can you pastebin the AM logs?
> Yes:
> https://drive.google.com/file/d/0B1DPTV7TbcO0cEEwSDZyUnBWUEk/view?usp=sharing
>
> > On 23 Sept 2015, at 7:21, Naganarasimha G R (Naga) <[email protected]> wrote:
> >
> > Hi Ilya,
> > In a normal case the AM memory requirement should not be more than the default for small jobs, but something seems to be erroneous in your case. I would like to have more information:
> > 1. Which version of Hadoop/YARN?
> > 2. From the logs is it getting killed due to over usage of Vmem or physical memory?
> > 3. Can you run "jmap -histo -F <PID of AM container>" and share the heap dump result?
> > 4. If possible can you pastebin the AM logs?
> >
> > + Naga
> > ________________________________________
> > From: Ilya Karpov [[email protected]]
> > Sent: Tuesday, September 22, 2015 21:06
> > To: [email protected]
> > Subject: Why would ApplicationMaster request more RAM than the default 1GB?
> >
> > Hi all,
> > I can't figure out the subject question.
> > On my Hadoop cluster I have an issue where the ApplicationMaster (AM) is killed by the NodeManager because the AM tries to allocate more than the default 1GB. The MR application the AM is in charge of is a mapper-only job (1(!) mapper, no reducers) that downloads data from a remote source. At the moment the AM is killed, the MR job itself is fine (using about 70% of its RAM limit).
> > The MR job doesn't have any custom counters, distributed cache files etc.; it just downloads data (in portions) via a custom input format. To fix this issue, I raised the memory limit for the AM, but I want to know why a trivial job like mine eats 1GB (!).
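PS: Ilya, since you mention having raised the AM memory limit, for anyone else hitting this, the usual knobs are the two standard MRv2 properties below. This is only a sketch: the 2048/1638 values are illustrative, not a recommendation; the idea is to keep the AM heap at roughly 80% of the container size so there is headroom for off-heap usage:

    <!-- mapred-site.xml (illustrative values) -->
    <property>
      <!-- Container size, in MB, that YARN allocates for the MRAppMaster -->
      <name>yarn.app.mapreduce.am.resource.mb</name>
      <value>2048</value>
    </property>
    <property>
      <!-- JVM options for the AM; an -Xmx around 80% of resource.mb leaves headroom -->
      <name>yarn.app.mapreduce.am.command-opts</name>
      <value>-Xmx1638m</value>
    </property>

In your process-tree dump the AM JVM was started with -Xmx825955249 (~788 MB, about 77% of the 1 GB container), which follows the same leave-some-headroom pattern.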
--
--Manoj Kumar M