Re: Tez lauche container error when use UseG1GC
Could you also attach the ResourceManager log?

Best Regards,
Jeff Zhang

From: r7raul1...@163.com
Reply-To: user@tez.apache.org
Date: Monday, June 1, 2015 at 9:39 AM
To: user@tez.apache.org
Subject: Re: Re: Tez lauche container error when use UseG1GC

NodeManager log

r7raul1...@163.com

From: r7raul1...@163.com
Date: 2015-06-01 09:23
To: user@tez.apache.org
Subject: Re: Re: Tez lauche container error when use UseG1GC

The log file is attached.

r7raul1...@163.com

From: Jianfeng (Jeff) Zhang (jzh...@hortonworks.com)
Date: 2015-06-01 08:50
To: user@tez.apache.org
Subject: Re: Tez lauche container error when use UseG1GC

From the logs, the failure is due to the container failing to launch. I guess it is caused by some YARN configuration issue. You should check the NodeManager logs. It looks like you haven’t enabled log aggregation, so you can’t get the NodeManager logs with the “yarn logs” command. You need to check each NodeManager machine; by default the logs are located in $HADOOP_HOME/logs. In the NodeManager logs, you should be able to see why the container failed to launch.

BTW, I would suggest enabling log aggregation. See this for details:
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.4/bk_yarn_resource_mgt/content/ref-375ff479-e530-46d8-9f96-8b52dadb5183.1.html

Best Regards,
Jeff Zhang

From: r7raul1...@163.com
Reply-To: user@tez.apache.org
Date: Monday, June 1, 2015 at 8:07 AM
To: user@tez.apache.org
Subject: Re: Re: Tez lauche container error when use UseG1GC

Log is:

Status: Running (Executing on YARN cluster with App id application_1432885077153_0011)

VERTICES     STATUS   TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
Map 1        FAILED       1          0        0        1       4       0
Reducer 2    KILLED       1          0        0        1       0       1
Reducer 3    KILLED       1          0        0        1       0       1
VERTICES: 00/03 [--] 0% ELAPSED TIME: 16.13 s

Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1432885077153_0011_1_00, diagnostics=[Task failed, taskId=task_1432885077153_0011_1_00_00, diagnostics=[TaskAttempt 0 failed, info=[Container container_1432885077153_0011_01_02 finished with diagnostics set to [Container failed. Exception from container-launch.
Container id: container_1432885077153_0011_01_02
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:196)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
]], TaskAttempt 1 failed, info=[Container container_1432885077153_0011_01_03 finished with diagnostics set to [Container failed. Exception from container-launch.
Container id: container_1432885077153_0011_01_03
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:196)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745
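Jeff's suggestion in the message above, to enable log aggregation so that "yarn logs" can fetch container logs, can be checked with a short script. The following is a minimal sketch in Python, assuming the cluster uses the standard Hadoop property yarn.log-aggregation-enable in yarn-site.xml. The sample XML and both function names are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical yarn-site.xml content, for illustration only.
SAMPLE_YARN_SITE = """
<configuration>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
</configuration>
"""


def hadoop_conf_to_dict(xml_text):
    """Parse Hadoop-style <configuration> XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            conf[name.strip()] = (prop.findtext("value") or "").strip()
    return conf


def log_aggregation_enabled(xml_text):
    """Return True only if yarn.log-aggregation-enable is explicitly 'true'.

    A missing property is treated as disabled, which matches YARN's
    documented default of false.
    """
    conf = hadoop_conf_to_dict(xml_text)
    return conf.get("yarn.log-aggregation-enable", "false").lower() == "true"


if __name__ == "__main__":
    print(log_aggregation_enabled(SAMPLE_YARN_SITE))
```

To run this against a real cluster configuration, read the contents of yarn-site.xml into a string and pass it to log_aggregation_enabled. The NodeManagers typically need a restart before a changed value takes effect.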
Re: Re: Tez lauche container error when use UseG1GC
mapreduce.map.java.opts = -Djava.net.preferIPv4Stack=true -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseG1GC

Map Task Maximum Heap Size (mapreduce.map.java.opts.max.heap) = 825955249 B (default value)

r7raul1...@163.com

From: r7raul1...@163.com
Date: 2015-06-01 10:27
To: user
Subject: Re: Re: Tez lauche container error when use UseG1GC

Fair Scheduler Preemption (yarn.scheduler.fair.preemption) = False

r7raul1...@163.com

From: Jianfeng (Jeff) Zhang
Date: 2015-06-01 10:10
To: user
Subject: Re: Tez lauche container error when use UseG1GC

Could you also attach the ResourceManager log? And have you enabled preemption?

Best Regards,
Jeff Zhang
Re: Tez lauche container error when use UseG1GC
Could you also attach the ResourceManager log? And have you enabled preemption?

Best Regards,
Jeff Zhang
Re: Re: Tez lauche container error when use UseG1GC
(ContainerLaunch.java:299)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
]]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1432885077153_0011_1_00 [Map 1] killed/failed due to:null]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1432885077153_0011_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1432885077153_0011_1_01 [Reducer 2] killed/failed due to:null]
Vertex killed, vertexName=Reducer 3, vertexId=vertex_1432885077153_0011_1_02, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1432885077153_0011_1_02 [Reducer 3] killed/failed due to:null]
DAG failed due to vertex failure. failedVertices:1 killedVertices:2
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask

yarn logs -applicationId application_1432885077153_0011
15/06/01 08:07:01 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
Logs not available at /tmp/logs/root/logs/application_1432885077153_0011
Log aggregation has not completed or is not enabled.

r7raul1...@163.com
Re: Tez lauche container error when use UseG1GC
The task container fails to launch. Have you specified any JVM-related property in tez-site.xml, such as TEZ_TASK_LAUNCH_CMD_OPTS?

Conflicting collector combinations in option list; please refer to the release notes for the combinations allowed
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

Best Regards,
Jeff Zhang
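The "Conflicting collector combinations in option list" error quoted in the reply above is what the JVM prints when the final command line selects more than one garbage collector, which can happen when option strings from different configuration properties are concatenated. The following is a minimal sketch in Python of a check for that situation. The grouping of flags into families is a simplified heuristic, not the JVM's actual rule set, and the task option string is hypothetical: whether Tez adds a parallel-collector default in this setup is an assumption that the thread does not confirm.

```python
# Simplified grouping of HotSpot collector-selection flags. This mapping is
# an assumption for illustration; the JVM's real compatibility rules are
# more detailed.
COLLECTOR_FAMILIES = {
    "UseG1GC": "G1",
    "UseParallelGC": "Parallel",
    "UseParallelOldGC": "Parallel",
    "UseConcMarkSweepGC": "CMS",
    "UseParNewGC": "CMS",
    "UseSerialGC": "Serial",
}


def enabled_collectors(*opt_strings):
    """Return the set of collector families switched on by -XX:+ flags.

    Flags that disable a collector (-XX:-...) are ignored in this sketch.
    """
    families = set()
    for opts in opt_strings:
        for token in opts.split():
            if token.startswith("-XX:+"):
                family = COLLECTOR_FAMILIES.get(token[len("-XX:+"):])
                if family:
                    families.add(family)
    return families


def has_collector_conflict(*opt_strings):
    """True if the combined options enable more than one collector family."""
    return len(enabled_collectors(*opt_strings)) > 1


if __name__ == "__main__":
    # Hypothetical task launch options that already pick the parallel collector.
    task_opts = "-XX:+PrintGCDetails -verbose:gc -XX:+UseNUMA -XX:+UseParallelGC"
    # The map options from the original post in this thread.
    map_opts = "-Djava.net.preferIPv4Stack=true -XX:+UseG1GC -Xmx825955249"
    print(sorted(enabled_collectors(task_opts, map_opts)))
    print(has_collector_conflict(task_opts, map_opts))
```

If a check like this reports a conflict, the usual remedy is to make sure only one collector flag survives in the options that actually reach the task JVM.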
Tez lauche container error when use UseG1GC
I changed the value of mapreduce.map.java.opts from

-Djava.net.preferIPv4Stack=true -Xmx825955249

to

-Djava.net.preferIPv4Stack=true -XX:+UseG1GC -Xmx825955249

Then I ran a query with Hive 1.1.0 + Tez 0.5.3 on Hadoop 2.5.0:

set mapreduce.framework.name=yarn-tez;
set hive.execution.engine=tez;
select userid,count(*) from u_data group by userid order by userid;

The query returned an error. I found this error:

2015-05-29 16:02:39,064 WARN [AsyncDispatcher event handler] container.AMContainerImpl: Container container_1432885077153_0004_01_05 finished with diagnostics set to [Container failed. Exception from container-launch.
Container id: container_1432885077153_0004_01_05
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:196)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

But when I tried the MapReduce engine:

hive> set hive.execution.engine=mr;
hive> set mapreduce.framework.name=yarn;
hive> select userid,count(*) from u_data group by userid order by userid limit 1;
Query ID = hdfs_20150529160606_d550bca4-0341-4eb0-aace-a9018bfbb7a9
Total jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1432885077153_0005, Tracking URL = http://localhost:8088/proxy/application_1432885077153_0005/
Kill Command = /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop/bin/hadoop job -kill job_1432885077153_0005
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2015-05-29 16:06:34,863 Stage-1 map = 0%, reduce = 0%
2015-05-29 16:06:40,066 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.72 sec
2015-05-29 16:06:48,366 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.96 sec
MapReduce Total cumulative CPU time: 2 seconds 960 msec
Ended Job = job_1432885077153_0005
Launching Job 2 out of 2
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1432885077153_0006, Tracking URL = http://localhost:8088/proxy/application_1432885077153_0006/
Kill Command = /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop/bin/hadoop job -kill job_1432885077153_0006
Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
2015-05-29 16:07:03,333 Stage-2 map = 0%, reduce = 0%
2015-05-29 16:07:07,485 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 1.2 sec
2015-05-29 16:07:15,739 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2.35 sec
MapReduce Total cumulative CPU time: 2 seconds 350 msec
Ended Job = job_1432885077153_0006
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 2.96 sec HDFS Read: 1985399 HDFS Write: 20068 SUCCESS
Stage-Stage-2: Map: 1 Reduce: 1 Cumulative CPU: 2.35 sec HDFS Read: 24481 HDFS Write: 6 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 310 msec

That works fine.

r7raul1...@163.com
Re: Tez lauche container error when use UseG1GC
To clarify, given that the error is showing up with container_1432885077153_0004_01_05, the AM launched properly. Use “bin/yarn logs -applicationId application_1432885077153_0004” to get the logs, and see whether the logs for container_1432885077153_0004_01_05 contain any errors.

If there are none, you will need to search the AM’s logs for “Assigning container to task” for the above container. That log line shows which host the container belongs to; you should then look at the NodeManager logs on that host and search for the container id.

The above would be a lot simpler if you have the UI set up to work against 0.5.3, but it may still require you to dig through the NodeManager logs.

thanks,
Hitesh

On May 29, 2015, at 3:48 AM, Jianfeng (Jeff) Zhang (jzh...@hortonworks.com) wrote:

Could you check the YARN app logs to see what the error is? If there’s still no useful info, you may refer to the YARN RM/NN logs.

Best Regards,
Jeff Zhang

From: r7raul1...@163.com
Reply-To: user@tez.apache.org
Date: Friday, May 29, 2015 at 4:16 PM
To: user@tez.apache.org
Subject: Re: Tez lauche container error when use UseG1GC

BTW, my tez_site.xml content is:

<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>hdfs:///apps/tez-0.5.3/tez-0.5.3.tar.gz</value>
  </property>
  <property>
    <name>tez.task.generate.counters.per.io</name>
    <value>true</value>
  </property>
  <property>
    <description>Log history using the Timeline Server</description>
    <name>tez.history.logging.service.class</name>
    <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
  </property>
  <property>
    <description>Publish configuration information to Timeline server</description>
    <name>tez.runtime.convert.user-payload.to.history-text</name>
    <value>true</value>
  </property>
  <property>
    <name>tez.am.launch.cmd-opts</name>
    <value>-XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/</value>
  </property>
</configuration>

r7raul1...@163.com
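The lookup Hitesh describes (find the "Assigning container to task" line for a container in the AM log, then search the NodeManager log on the reported host for the same container id) can be sketched in Python. The sample log lines below are hypothetical placeholders, not real Tez output. Only the search phrase and the container id come from the thread, and the function names are made up for this illustration.

```python
ASSIGN_PHRASE = "Assigning container to task"


def find_assignment_lines(lines, container_id):
    """Return AM log lines that record the assignment of container_id."""
    return [
        line.rstrip("\n")
        for line in lines
        if ASSIGN_PHRASE in line and container_id in line
    ]


def find_container_lines(lines, container_id):
    """Return every log line mentioning container_id (e.g. in a NodeManager log)."""
    return [line.rstrip("\n") for line in lines if container_id in line]


if __name__ == "__main__":
    container = "container_1432885077153_0004_01_05"
    # Hypothetical stand-ins for lines read from an AM log file.
    sample_am_log = [
        "hypothetical line: Assigning container to task, container=" + container + "\n",
        "hypothetical line: unrelated message\n",
    ]
    for line in find_assignment_lines(sample_am_log, container):
        print(line)
```

With real logs, pass an open file object instead of the sample list, since iterating over a file yields its lines. The host name then has to be read from the matching line by eye, because this sketch makes no assumption about the exact log layout.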