I could not get MALLOC_ARENA_MAX to have any effect. Neither .bashrc nor
any of the env scripts seemed to have any impact.
I got the jobs working again by setting
yarn.nodemanager.vmem-pmem-ratio=10. Now they probably run with an
obscene and unnecessary vmem allocation (which, from what I read, does
not come for free with the new malloc). What a crappy situation (and
change) :-(
Thanks,
Henning
On 10/25/2012 11:47 AM, Henning Blohm wrote:
I recently installed data nodes on Ubuntu 12.04 and observed M/R jobs
failing with errors like this:
Diagnostics report from attempt_1351154628597_0002_m_000000_0:
Container
[pid=14529,containerID=container_1351154628597_0002_01_000002] is
running beyond virtual memory limits. Current usage: 124.4mb of 1.0gb
physical memory used; 2.1gb of 2.1gb virtual memory used. Killing
container.
Dump of the process-tree for container_1351154628597_0002_01_000002 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 14529 13550 14529 14529 (java) 678 18 2265411584 31856
/home/gd/gd/jdk1.6.0_35/bin/java -Djava.net.preferIPv4Stack=true
-Dhadoop.metrics.log.level=WARN -Xmx1000M -XX:MaxPermSize=512M
-Djava.io.tmpdir=/home/gd/gd/gi-de-nosql.cdh4-base/data/yarn/usercache/gd/appcache/application_1351154628597_0002/container_1351154628597_0002_01_000002/tmp
-Dlog4j.configuration=container-log4j.properties
-Dyarn.app.mapreduce.container.log.dir=/home/gd/gd/gi-de-nosql.cdh4-base/logs/application_1351154628597_0002/container_1351154628597_0002_01_000002
-Dyarn.app.mapreduce.container.log.filesize=0
-Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild
192.168.178.25 36183 attempt_1351154628597_0002_m_000000_0 2
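
The numbers in this report can be checked by hand. Below is a minimal sketch, assuming the NodeManager computes the virtual memory limit as the container's physical memory allocation times yarn.nodemanager.vmem-pmem-ratio (default 2.1), and assuming a 4 KiB page size, which is typical for x86-64 Linux but is not stated in the report. The function names are made up for illustration.

```python
MIB = 1024 * 1024

def vmem_limit_bytes(pmem_limit_mib, vmem_pmem_ratio):
    """Virtual memory limit in bytes, assuming limit = pmem * ratio."""
    return int(pmem_limit_mib * MIB * vmem_pmem_ratio)

def to_mib(n_bytes):
    """Convert a byte count to MiB."""
    return n_bytes / MIB

# Values copied from the process-tree dump above.
vmem_usage = 2265411584   # VMEM_USAGE(BYTES)
rss_pages = 31856         # RSSMEM_USAGE(PAGES)
page_size = 4096          # assumption: 4 KiB pages

default_limit = vmem_limit_bytes(1024, 2.1)  # 1.0gb container, ratio 2.1
raised_limit = vmem_limit_bytes(1024, 10)    # same container, ratio 10

print("rss        : %.1f MiB" % to_mib(rss_pages * page_size))
print("vmem used  : %.1f MiB" % to_mib(vmem_usage))
print("limit @2.1 : %.1f MiB" % to_mib(default_limit))
print("limit @10  : %.1f MiB" % to_mib(raised_limit))
print("over the 2.1 limit:", vmem_usage > default_limit)
```

Under these assumptions the task overshoots the 2.1 limit by roughly 10 MiB while its resident set stays near 124 MiB, so the kill is triggered by reserved address space rather than by memory actually in use.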
I am using CDH4.0.1 (Hadoop 2.0.0) with the YARN M/R implementation on
64-bit Ubuntu 12.04.
According to HADOOP-7154, making sure MALLOC_ARENA_MAX=1 (or 4) is
exported should fix the issue.
I tried the following:
Exporting the environment variable MALLOC_ARENA_MAX with value 1 in
all Hadoop shell scripts (e.g. yarn-env.sh). Checking the
launch_container.sh script that YARN creates, I can tell that it indeed
contains the line
export MALLOC_ARENA_MAX="1"
But still I am getting the error above.
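
One way to narrow this down is to look at the environment the container JVM actually received, rather than at the script that was supposed to set it. On Linux, /proc/<pid>/environ holds the environment a process was started with, as NUL-separated KEY=VALUE entries. A minimal sketch follows; the helper names are hypothetical, the container has to be still running, and reading the file requires permission to inspect that process.

```python
def parse_environ(raw):
    """Turn the NUL-separated bytes of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env

def read_process_env(pid):
    """Read the start-up environment of a running process (Linux only)."""
    with open("/proc/%d/environ" % pid, "rb") as f:
        return parse_environ(f.read())

# Usage, with the pid of a running YarnChild JVM:
#   read_process_env(pid).get("MALLOC_ARENA_MAX", "<not set>")
```

If the variable shows up here with the expected value, the setting did reach the JVM and the remaining virtual memory is coming from somewhere else.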
In addition I tried adding
<property>
<name>mapred.child.env</name>
<value>MALLOC_ARENA_MAX=1</value>
</property>
to mapred-site.xml. But that didn't seem to fix it either.
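
To rule out a typo, or an edit to a copy of the file that the job does not read, the value can be read back from the XML. This is a small sketch using only the Python standard library; it assumes the usual Hadoop layout of property elements under a configuration root, and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

def hadoop_property(xml_text, name):
    """Return the value of the named property, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            return prop.findtext("value", "").strip()
    return None

# Usage, with the path to the mapred-site.xml the job actually uses:
#   with open(path) as f:
#       print(hadoop_property(f.read(), "mapred.child.env"))
```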
Is there anything special that I need to configure on the server to
make the setting effective?
Any idea would be great!!
Thanks,
Henning