"Current usage: 1.0 GB of 1 GB physical memory used; 2.9 GB of 2.1 GB virtual memory used. Killing container."
YARN killed the container because its memory usage exceeded the maximum quota. Try updating the YARN/Hive configuration to allocate more memory.

2017-08-11 23:38 GMT+08:00 <jun....@nomura.com>:

> Hi
>
> We encountered a problem where a container got killed; below is the log we
> got from Kylin.
>
> Can you please help to determine what's the root cause?
>
> The cluster has more than 300 GB of memory, which should be more than enough
> to process the data set, which is only 9 GB in ORC format.
>
> WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
> Query ID = monjuu-g_20170811163025_0f450a94-a309-4020-bb10-e7fab796f0dd
> Total jobs = 10
> Stage-1 is selected by condition resolver.
> Launching Job 1 out of 10
> Number of reduce tasks not specified. Estimated from input data size: 9
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Starting Job = job_1501889990114_2877, Tracking URL = https://xxxxxxxx:8090/proxy/application_1501889990114_2877/
> Kill Command = /opt/mapr/hadoop/hadoop-2.7.0/bin/hadoop job -kill job_1501889990114_2877
> Hadoop job information for Stage-1: number of mappers: 11; number of reducers: 9
> 2017-08-11 16:30:39,130 Stage-1 map = 0%, reduce = 0%
> 2017-08-11 16:30:58,469 Stage-1 map = 2%, reduce = 0%, Cumulative CPU 266.1 sec
> 2017-08-11 16:31:01,651 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 250.03 sec
> 2017-08-11 16:31:03,799 Stage-1 map = 14%, reduce = 0%, Cumulative CPU 201.53 sec
> 2017-08-11 16:31:08,041 Stage-1 map = 15%, reduce = 0%, Cumulative CPU 242.99 sec
> 2017-08-11 16:31:10,178 Stage-1 map = 16%, reduce = 0%, Cumulative CPU 280.18 sec
> 2017-08-11 16:31:16,562 Stage-1 map = 19%, reduce = 0%, Cumulative CPU 379.78 sec
> 2017-08-11 16:31:17,629 Stage-1 map = 18%, reduce = 0%, Cumulative CPU 344.4 sec
> 2017-08-11 16:31:18,690 Stage-1 map = 19%, reduce = 0%, Cumulative CPU 346.76 sec
> 2017-08-11 16:31:23,994 Stage-1 map = 20%, reduce = 0%, Cumulative CPU 342.87 sec
> 2017-08-11 16:31:29,295 Stage-1 map = 23%, reduce = 0%, Cumulative CPU 351.87 sec
> 2017-08-11 16:31:31,411 Stage-1 map = 26%, reduce = 0%, Cumulative CPU 365.22 sec
> 2017-08-11 16:31:34,585 Stage-1 map = 27%, reduce = 0%, Cumulative CPU 398.59 sec
> 2017-08-11 16:31:39,875 Stage-1 map = 25%, reduce = 0%, Cumulative CPU 334.08 sec
> 2017-08-11 16:31:45,174 Stage-1 map = 26%, reduce = 0%, Cumulative CPU 378.24 sec
> 2017-08-11 16:31:47,294 Stage-1 map = 30%, reduce = 0%, Cumulative CPU 412.01 sec
> 2017-08-11 16:31:48,353 Stage-1 map = 31%, reduce = 0%, Cumulative CPU 427.95 sec
> 2017-08-11 16:31:49,406 Stage-1 map = 30%, reduce = 0%, Cumulative CPU 421.51 sec
> 2017-08-11 16:31:50,461 Stage-1 map = 28%, reduce = 0%, Cumulative CPU 371.18 sec
> 2017-08-11 16:31:55,761 Stage-1 map = 30%, reduce = 0%, Cumulative CPU 420.58 sec
> 2017-08-11 16:31:56,814 Stage-1 map = 31%, reduce = 0%, Cumulative CPU 426.93 sec
> 2017-08-11 16:31:57,870 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 82.33 sec
> MapReduce Total cumulative CPU time: 1 minutes 22 seconds 330 msec
> Ended Job = job_1501889990114_2877 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_1501889990114_2877_m_000007 (and more) from job job_1501889990114_2877
> Examining task ID: task_1501889990114_2877_m_000001 (and more) from job job_1501889990114_2877
>
> Task with the most failures(4):
> -----
> Task ID:
>   task_1501889990114_2877_m_000007
>
> -----
> Diagnostic Messages for this Task:
> Container [pid=4049,containerID=container_e66_1501889990114_2877_01_000031] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 2.9 GB of 2.1 GB virtual memory used. Killing container.
> Dump of the process-tree for container_e66_1501889990114_2877_01_000031 :
> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> |- 4054 4049 4049 4049 (java) 3031 114 3038318592 270714 /opt/sunjdk/jdk1.8.0_92/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx900m -Djava.io.tmpdir=/local/0/opt/hadoop-mapr/usercache/monjuu-g/appcache/application_1501889990114_2877/container_e66_1501889990114_2877_01_000031/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1501889990114_2877/container_e66_1501889990114_2877_01_000031 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 10.117.142.5 25857 attempt_1501889990114_2877_m_000007_3 72567767433247
> |- 4049 4047 4049 4049 (bash) 0 0 108654592 306 /bin/bash -c /opt/sunjdk/jdk1.8.0_92/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx900m -Djava.io.tmpdir=/local/0/opt/hadoop-mapr/usercache/monjuu-g/appcache/application_1501889990114_2877/container_e66_1501889990114_2877_01_000031/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1501889990114_2877/container_e66_1501889990114_2877_01_000031 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 10.117.142.5 25857 attempt_1501889990114_2877_m_000007_3 72567767433247 1>/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1501889990114_2877/container_e66_1501889990114_2877_01_000031/stdout 2>/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1501889990114_2877/container_e66_1501889990114_2877_01_000031/stderr
>
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
>
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 11 Reduce: 9 Cumulative CPU: 82.33 sec MAPRFS Read: 0 MAPRFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 1 minutes 22 seconds 330 msec

--
Best regards,

Shaofeng Shi 史少锋
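
P.S. A rough sketch of what "allocate more memory" could look like for this job. The limit is enforced per container, so the 300 GB available cluster-wide does not help a single map task that is capped at 1 GB. The script below redoes the arithmetic from the process-tree dump in the log and then prints candidate Hive settings. Everything not in the log is an assumption: the 4 KiB page size, the 2.1 virtual-to-physical ratio (Hadoop's default for `yarn.nodemanager.vmem-pmem-ratio`, which matches the 2.1 GB limit reported), the 2048 MB target, and the 0.8 heap fraction, which is only a common rule of thumb. `suggest_settings` is a hypothetical helper, not part of Hive or Kylin.

```python
PAGE_SIZE = 4096          # assumption: 4 KiB pages, the usual x86-64 Linux default
MIB = 1024 * 1024

# (VMEM_USAGE in bytes, RSSMEM_USAGE in pages), copied from the process-tree dump
processes = {
    "java (pid 4054)": (3038318592, 270714),
    "bash (pid 4049)": (108654592, 306),
}

container_limit_mb = 1024  # "1 GB physical memory" in the diagnostic message
vmem_pmem_ratio = 2.1      # assumption: Hadoop default, consistent with the 2.1 GB limit

# Sum resident and virtual memory over the whole process tree, as YARN does.
rss_mb = sum(pages for _, pages in processes.values()) * PAGE_SIZE / MIB
vmem_mb = sum(vmem for vmem, _ in processes.values()) / MIB

print(f"RSS  {rss_mb:.0f} MiB vs limit {container_limit_mb} MiB")
print(f"VMEM {vmem_mb:.0f} MiB vs limit {container_limit_mb * vmem_pmem_ratio:.0f} MiB")


def suggest_settings(container_mb, heap_fraction=0.8):
    """Build Hive `set` statements for a larger map container.

    heap_fraction is a rule of thumb (not from the log): keep part of the
    container free for non-heap JVM memory such as metaspace and thread stacks.
    """
    heap_mb = int(container_mb * heap_fraction)
    return [
        f"set mapreduce.map.memory.mb={container_mb};",
        f"set mapreduce.map.java.opts=-Xmx{heap_mb}m;",
    ]


for statement in suggest_settings(2048):
    print(statement)
```

The failing task is a map task (`_m_000007`), which is why only the map-side properties are shown; the reducers have their own `mapreduce.reduce.memory.mb` and `mapreduce.reduce.java.opts`. Note that the log shows `-Xmx900m` inside a 1024 MB container, so the JVM had only about 124 MB for everything outside the heap. For a Kylin build, settings like these would probably belong in Kylin's Hive overrides (conf/kylin_hive_conf.xml, if your deployment uses it) rather than in an interactive session.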