Hi,

We ran into a problem where a container was killed; the log we get from Kylin is below. Can you please help determine the root cause? The cluster has more than 300 GB of memory, which should be more than enough to process the data set, which is only 9 GB in ORC format.

WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = monjuu-g_20170811163025_0f450a94-a309-4020-bb10-e7fab796f0dd
Total jobs = 10
Stage-1 is selected by condition resolver.
Launching Job 1 out of 10
Number of reduce tasks not specified. Estimated from input data size: 9
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1501889990114_2877, Tracking URL = https://xxxxxxxx:8090/proxy/application_1501889990114_2877/
Kill Command = /opt/mapr/hadoop/hadoop-2.7.0/bin/hadoop job -kill job_1501889990114_2877
Hadoop job information for Stage-1: number of mappers: 11; number of reducers: 9
2017-08-11 16:30:39,130 Stage-1 map = 0%, reduce = 0%
2017-08-11 16:30:58,469 Stage-1 map = 2%, reduce = 0%, Cumulative CPU 266.1 sec
2017-08-11 16:31:01,651 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 250.03 sec
2017-08-11 16:31:03,799 Stage-1 map = 14%, reduce = 0%, Cumulative CPU 201.53 sec
2017-08-11 16:31:08,041 Stage-1 map = 15%, reduce = 0%, Cumulative CPU 242.99 sec
2017-08-11 16:31:10,178 Stage-1 map = 16%, reduce = 0%, Cumulative CPU 280.18 sec
2017-08-11 16:31:16,562 Stage-1 map = 19%, reduce = 0%, Cumulative CPU 379.78 sec
2017-08-11 16:31:17,629 Stage-1 map = 18%, reduce = 0%, Cumulative CPU 344.4 sec
2017-08-11 16:31:18,690 Stage-1 map = 19%, reduce = 0%, Cumulative CPU 346.76 sec
2017-08-11 16:31:23,994 Stage-1 map = 20%, reduce = 0%, Cumulative CPU 342.87 sec
2017-08-11 16:31:29,295 Stage-1 map = 23%, reduce = 0%, Cumulative CPU 351.87 sec
2017-08-11 16:31:31,411 Stage-1 map = 26%, reduce = 0%, Cumulative CPU 365.22 sec
2017-08-11 16:31:34,585 Stage-1 map = 27%, reduce = 0%, Cumulative CPU 398.59 sec
2017-08-11 16:31:39,875 Stage-1 map = 25%, reduce = 0%, Cumulative CPU 334.08 sec
2017-08-11 16:31:45,174 Stage-1 map = 26%, reduce = 0%, Cumulative CPU 378.24 sec
2017-08-11 16:31:47,294 Stage-1 map = 30%, reduce = 0%, Cumulative CPU 412.01 sec
2017-08-11 16:31:48,353 Stage-1 map = 31%, reduce = 0%, Cumulative CPU 427.95 sec
2017-08-11 16:31:49,406 Stage-1 map = 30%, reduce = 0%, Cumulative CPU 421.51 sec
2017-08-11 16:31:50,461 Stage-1 map = 28%, reduce = 0%, Cumulative CPU 371.18 sec
2017-08-11 16:31:55,761 Stage-1 map = 30%, reduce = 0%, Cumulative CPU 420.58 sec
2017-08-11 16:31:56,814 Stage-1 map = 31%, reduce = 0%, Cumulative CPU 426.93 sec
2017-08-11 16:31:57,870 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 82.33 sec
MapReduce Total cumulative CPU time: 1 minutes 22 seconds 330 msec
Ended Job = job_1501889990114_2877 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1501889990114_2877_m_000007 (and more) from job job_1501889990114_2877
Examining task ID: task_1501889990114_2877_m_000001 (and more) from job job_1501889990114_2877
Task with the most failures(4):
-----
Task ID: task_1501889990114_2877_m_000007
-----
Diagnostic Messages for this Task:
Container [pid=4049,containerID=container_e66_1501889990114_2877_01_000031] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 2.9 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_e66_1501889990114_2877_01_000031 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 4054 4049 4049 4049 (java) 3031 114 3038318592 270714 /opt/sunjdk/jdk1.8.0_92/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx900m -Djava.io.tmpdir=/local/0/opt/hadoop-mapr/usercache/monjuu-g/appcache/application_1501889990114_2877/container_e66_1501889990114_2877_01_000031/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1501889990114_2877/container_e66_1501889990114_2877_01_000031 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 10.117.142.5 25857 attempt_1501889990114_2877_m_000007_3 72567767433247
|- 4049 4047 4049 4049 (bash) 0 0 108654592 306 /bin/bash -c /opt/sunjdk/jdk1.8.0_92/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx900m -Djava.io.tmpdir=/local/0/opt/hadoop-mapr/usercache/monjuu-g/appcache/application_1501889990114_2877/container_e66_1501889990114_2877_01_000031/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1501889990114_2877/container_e66_1501889990114_2877_01_000031 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 10.117.142.5 25857 attempt_1501889990114_2877_m_000007_3 72567767433247 1>/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1501889990114_2877/container_e66_1501889990114_2877_01_000031/stdout 2>/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1501889990114_2877/container_e66_1501889990114_2877_01_000031/stderr
Container killed on request.
Exit code is 143
Container exited with a non-zero exit code 143
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 11 Reduce: 9 Cumulative CPU: 82.33 sec MAPRFS Read: 0 MAPRFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 1 minutes 22 seconds 330 msec
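
For reference, the figures in the diagnostic message can be cross-checked with a quick calculation. The sketch below is only an illustration: it sums the VMEM_USAGE(BYTES) and RSSMEM_USAGE(PAGES) columns from the process-tree dump and compares them with the 1 GB container limit and the -Xmx900m heap shown in the command line. The 4096-byte page size is an assumption (it is not stated in the log), and the variable and function names are made up for this example.

```python
# Rough cross-check of the numbers in the YARN diagnostic message.
# ASSUMPTION: 4096-byte memory pages (typical on Linux x86-64, not stated in the log).
PAGE_SIZE = 4096
GIB = 1024 ** 3
MIB = 1024 ** 2

# (VMEM_USAGE in bytes, RSSMEM_USAGE in pages), copied from the process-tree dump.
processes = {
    "java": (3038318592, 270714),
    "bash": (108654592, 306),
}


def container_usage(procs, page_size=PAGE_SIZE):
    """Return (physical_bytes, virtual_bytes) summed over the process tree."""
    vmem = sum(v for v, _ in procs.values())
    rss = sum(pages for _, pages in procs.values()) * page_size
    return rss, vmem


rss_bytes, vmem_bytes = container_usage(processes)
container_limit = 1 * GIB   # "1.0 GB of 1 GB physical memory used"
heap_limit = 900 * MIB      # -Xmx900m from the java command line

print(f"physical: {rss_bytes / GIB:.2f} GiB (limit {container_limit / GIB:.0f} GiB)")
print(f"virtual:  {vmem_bytes / GIB:.2f} GiB")
print(f"room left for non-heap memory: {(container_limit - heap_limit) / MIB:.0f} MiB")
print(f"over physical limit: {rss_bytes > container_limit}")
```

If the page-size assumption holds, the sums agree with the "1.0 GB" and "2.9 GB" figures in the message, and the limit that was hit is the 1 GB of a single map container rather than the 300 GB of the whole cluster. On stock Hadoop the map container size and heap are controlled by mapreduce.map.memory.mb and mapreduce.map.java.opts; whether those are the right settings to change on this MapR cluster is not something the log confirms.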