Hi Hitesh,

will check the memory situation!
Java version is:

java version "1.7.0_76"
Java(TM) SE Runtime Environment (build 1.7.0_76-tdc1-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.76-b04, mixed mode)

I don't think it's necessarily the same Java version that was used to compile (Hadoop, Tez, Datameer?)

Johannes

> On 15 Apr 2015, at 18:59, Hitesh Shah <[email protected]> wrote:
>
> Hi Johannes
>
> Not sure if anyone has seen this earlier. Do you know if the machines have
> enough memory to run the no. of tasks/containers that you are launching?
> Also, I am assuming that you are compiling and running against the same jdk
> version?
>
> Would you mind sharing the details on which java version you are running?
>
> -- Hitesh
>
> On Apr 14, 2015, at 1:19 AM, Johannes Zillmann <[email protected]>
> wrote:
>
>> Hey guys,
>>
>> in a customer environment certain Tez jobs fail to start
>>
>> On the client side it looks like:
>> --------------------------
>> INFO [2015-04-08 15:19:30.213] [MrPlanRunnerV2] (YarnClientImpl.java:204) -
>> Submitted application application_1428177121154_0065
>> INFO [2015-04-08 15:19:30.214] [MrPlanRunnerV2] (TezClient.java:357) - The
>> url to track the Tez Session:
>> http://master:8088/proxy/application_1428177121154_0065/
>> INFO [2015-04-08 15:19:33.219] [MrPlanRunnerV2] (TezClient.java:556) - App
>> did not succeed. Diagnostics: Application application_1428177121154_0065
>> failed 2 times due to AM Container for appattempt_1428177121154_0065_000002
>> exited with exitCode: 134 due to: Exception from container-launch:
>> org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 1: 10818
>> Aborted (core dumped) /opt/teradata/jvm64/jdk7/bin/java
>> -Xmx819m -server -Djava.net.preferIPv4Stack=true
>> -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc
>> -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC
>> -Dapple.awt.UIElement=true -Djava.awt.headless=true
>> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator
>> -Dlog4j.configuration=tez-container-log4j.properties
>> -Dyarn.app.container.log.dir=/data3/hadoop/yarn/log/application_1428177121154_0065/container_1428177121154_0065_02_000001
>> -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel=''
>> org.apache.tez.dag.app.DAGAppMaster --session >
>> /data3/hadoop/yarn/log/application_1428177121154_0065/container_1428177121154_0065_02_000001/stdout
>> 2>
>> /data3/hadoop/yarn/log/application_1428177121154_0065/container_1428177121154_0065_02_000001/stderr
>>
>> org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 1: 10818
>> Aborted (core dumped) /opt/teradata/jvm64/jdk7/bin/java
>> -Xmx819m -server -Djava.net.preferIPv4Stack=true
>> -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc
>> -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC
>> -Dapple.awt.UIElement=true -Djava.awt.headless=true
>> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator
>> -Dlog4j.configuration=tez-container-log4j.properties
>> -Dyarn.app.container.log.dir=/data3/hadoop/yarn/log/application_1428177121154_0065/container_1428177121154_0065_02_000001
>> -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel=''
>> org.apache.tez.dag.app.DAGAppMaster --session >
>> /data3/hadoop/yarn/log/application_1428177121154_0065/container_1428177121154_0065_02_000001/stdout
>> 2>
>> /data3/hadoop/yarn/log/application_1428177121154_0065/container_1428177121154_0065_02_000001/stderr
>>
>>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
>>   at org.apache.hadoop.util.Shell.run(Shell.java:418)
>>   at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
>>   at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
>>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>   at java.lang.Thread.run(Thread.java:745)
>>
>>
>> Container exited with a non-zero exit code 134
>> .Failing this attempt.. Failing the application.
>> --------------------------
>>
>>
>> Then you have that for the task:
>> --------------------------
>> Log Type: stderr
>> Log Length: 429
>> java: malloc.c:3090: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char
>> *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk,
>> fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned
>> long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 *
>> (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) &&
>> ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
>>
>> Log Type: stdout
>> Log Length: 0
>> --------------------------
>>
>> Any ideas?
>>
>> Johannes
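
A side note on the exit code in the log above: a shell normally reports a process killed by a signal as 128 plus the signal number, so 134 would correspond to signal 6. On Linux that is SIGABRT, which fits both the "Aborted (core dumped)" message and the failed glibc malloc assertion (a failed assertion calls abort). Below is a minimal sketch of that arithmetic, assuming the container was launched through /bin/bash as the log suggests. The function name describe_exit_code is made up for illustration and is not part of Hadoop, YARN or Tez.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a shell-style exit status (hypothetical helper, not a Hadoop API)."""
    if code > 128:
        # Shell convention: 128 + N means the process was terminated by signal N.
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {code}"


if __name__ == "__main__":
    # 134 is the exit code reported for the AM container in the thread.
    print(describe_exit_code(134))
```

This only decodes the status. It says nothing about why malloc's internal assertion failed, which is the actual open question in the thread.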
