"Current usage: 1.0 GB of 1 GB physical memory used; 2.9 GB of 2.1 GB
virtual memory used. Killing container."

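YARN killed the container because its memory usage exceeded the maximum
quota: the failing map task reached its 1 GB physical memory limit. That
limit applies per container, so the total memory of the cluster does not
help here. Try updating the YARN/Hive configuration to allocate more memory
to each container.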
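
As a rough illustration of what "allocate more memory" could look like, the
sketch below prints Hive session settings for a larger map container. The
property names mapreduce.map.memory.mb and mapreduce.map.java.opts are
standard MapReduce properties; the 2048 MB size, the 0.8 heap-to-container
ratio and the helper name map_memory_settings are assumptions made for this
example, not values taken from your cluster. The log above shows -Xmx900m
inside a 1 GB container, which leaves little room for non-heap memory.

```python
from typing import List


def map_memory_settings(container_mb: int, heap_ratio: float = 0.8) -> List[str]:
    """Build Hive `set` statements for a map container of `container_mb` MB.

    heap_ratio is an assumed rule of thumb: the JVM heap (-Xmx) is kept
    below the container size so non-heap memory also fits in the limit.
    """
    if container_mb <= 0:
        raise ValueError("container_mb must be positive")
    if not 0 < heap_ratio < 1:
        raise ValueError("heap_ratio must be strictly between 0 and 1")
    heap_mb = int(container_mb * heap_ratio)  # round down to whole MB
    return [
        f"set mapreduce.map.memory.mb={container_mb};",
        f"set mapreduce.map.java.opts=-Xmx{heap_mb}m;",
    ]


if __name__ == "__main__":
    # 2048 MB is a hypothetical target; pick a value your cluster allows.
    for statement in map_memory_settings(2048):
        print(statement)
```

The requested container size has to stay within what the cluster permits
(for example yarn.scheduler.maximum-allocation-mb). Setting
mapreduce.map.java.opts replaces the whole option string, so any JVM flags
your cluster already puts there would need to be carried over.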

2017-08-11 23:38 GMT+08:00 <jun....@nomura.com>:

> Hi,
> We encountered a problem where a container got killed; below is the log we
> got from Kylin.
> Can you please help determine the root cause?
> The cluster has more than 300 GB of memory, which should be more than
> enough to process the data set, which is only 9 GB in ORC format.
>
> WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in
> the future versions. Consider using a different execution engine (i.e.
> spark, tez) or using Hive 1.X releases.
> Query ID = monjuu-g_20170811163025_0f450a94-a309-4020-bb10-e7fab796f0dd
> Total jobs = 10
> Stage-1 is selected by condition resolver.
> Launching Job 1 out of 10
> Number of reduce tasks not specified. Estimated from input data size: 9
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Starting Job = job_1501889990114_2877, Tracking URL =
> https://xxxxxxxx:8090/proxy/application_1501889990114_2877/
> Kill Command = /opt/mapr/hadoop/hadoop-2.7.0/bin/hadoop job  -kill
> job_1501889990114_2877
> Hadoop job information for Stage-1: number of mappers: 11; number of
> reducers: 9
> 2017-08-11 16:30:39,130 Stage-1 map = 0%,  reduce = 0%
> 2017-08-11 16:30:58,469 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU
> 266.1 sec
> 2017-08-11 16:31:01,651 Stage-1 map = 12%,  reduce = 0%, Cumulative CPU
> 250.03 sec
> 2017-08-11 16:31:03,799 Stage-1 map = 14%,  reduce = 0%, Cumulative CPU
> 201.53 sec
> 2017-08-11 16:31:08,041 Stage-1 map = 15%,  reduce = 0%, Cumulative CPU
> 242.99 sec
> 2017-08-11 16:31:10,178 Stage-1 map = 16%,  reduce = 0%, Cumulative CPU
> 280.18 sec
> 2017-08-11 16:31:16,562 Stage-1 map = 19%,  reduce = 0%, Cumulative CPU
> 379.78 sec
> 2017-08-11 16:31:17,629 Stage-1 map = 18%,  reduce = 0%, Cumulative CPU
> 344.4 sec
> 2017-08-11 16:31:18,690 Stage-1 map = 19%,  reduce = 0%, Cumulative CPU
> 346.76 sec
> 2017-08-11 16:31:23,994 Stage-1 map = 20%,  reduce = 0%, Cumulative CPU
> 342.87 sec
> 2017-08-11 16:31:29,295 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU
> 351.87 sec
> 2017-08-11 16:31:31,411 Stage-1 map = 26%,  reduce = 0%, Cumulative CPU
> 365.22 sec
> 2017-08-11 16:31:34,585 Stage-1 map = 27%,  reduce = 0%, Cumulative CPU
> 398.59 sec
> 2017-08-11 16:31:39,875 Stage-1 map = 25%,  reduce = 0%, Cumulative CPU
> 334.08 sec
> 2017-08-11 16:31:45,174 Stage-1 map = 26%,  reduce = 0%, Cumulative CPU
> 378.24 sec
> 2017-08-11 16:31:47,294 Stage-1 map = 30%,  reduce = 0%, Cumulative CPU
> 412.01 sec
> 2017-08-11 16:31:48,353 Stage-1 map = 31%,  reduce = 0%, Cumulative CPU
> 427.95 sec
> 2017-08-11 16:31:49,406 Stage-1 map = 30%,  reduce = 0%, Cumulative CPU
> 421.51 sec
> 2017-08-11 16:31:50,461 Stage-1 map = 28%,  reduce = 0%, Cumulative CPU
> 371.18 sec
> 2017-08-11 16:31:55,761 Stage-1 map = 30%,  reduce = 0%, Cumulative CPU
> 420.58 sec
> 2017-08-11 16:31:56,814 Stage-1 map = 31%,  reduce = 0%, Cumulative CPU
> 426.93 sec
> 2017-08-11 16:31:57,870 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 82.33 sec
> MapReduce Total cumulative CPU time: 1 minutes 22 seconds 330 msec
> Ended Job = job_1501889990114_2877 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_1501889990114_2877_m_000007 (and more) from job
> job_1501889990114_2877
> Examining task ID: task_1501889990114_2877_m_000001 (and more) from job
> job_1501889990114_2877
> Task with the most failures(4):
> -----
> Task ID:
>   task_1501889990114_2877_m_000007
> -----
> Diagnostic Messages for this Task:
> Container [pid=4049,containerID=container_e66_1501889990114_2877_01_000031]
> is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB
> physical memory used; 2.9 GB of 2.1 GB virtual memory used. Killing
> container.
> Dump of the process-tree for container_e66_1501889990114_2877_01_000031 :
>                 |- 4054 4049 4049 4049 (java) 3031 114 3038318592 270714
> /opt/sunjdk/jdk1.8.0_92/bin/java -Djava.net.preferIPv4Stack=true
> -Dhadoop.metrics.log.level=WARN -Xmx900m -Djava.io.tmpdir=/local/0/opt/
> hadoop-mapr/usercache/monjuu-g/appcache/application_
> 1501889990114_2877/container_e66_1501889990114_2877_01_000031/tmp
> -Dlog4j.configuration=container-log4j.properties
> -Dyarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/
> logs/userlogs/application_1501889990114_2877/container_
> e66_1501889990114_2877_01_000031 -Dyarn.app.container.log.filesize=0
> -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog
> org.apache.hadoop.mapred.YarnChild 25857
> attempt_1501889990114_2877_m_000007_3 72567767433247
>                 |- 4049 4047 4049 4049 (bash) 0 0 108654592 306 /bin/bash
> -c /opt/sunjdk/jdk1.8.0_92/bin/java -Djava.net.preferIPv4Stack=true
> -Dhadoop.metrics.log.level=WARN  -Xmx900m -Djava.io.tmpdir=/local/0/opt/
> hadoop-mapr/usercache/monjuu-g/appcache/application_
> 1501889990114_2877/container_e66_1501889990114_2877_01_000031/tmp
> -Dlog4j.configuration=container-log4j.properties
> -Dyarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/
> logs/userlogs/application_1501889990114_2877/container_
> e66_1501889990114_2877_01_000031 -Dyarn.app.container.log.filesize=0
> -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog
> org.apache.hadoop.mapred.YarnChild 25857
> attempt_1501889990114_2877_m_000007_3 72567767433247
> 1>/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_
> 1501889990114_2877/container_e66_1501889990114_2877_01_000031/stdout
> 2>/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_
> 1501889990114_2877/container_e66_1501889990114_2877_01_000031/stderr
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.
> exec.mr.MapRedTask
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 11  Reduce: 9   Cumulative CPU: 82.33 sec   MAPRFS
> Read: 0 MAPRFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 1 minutes 22 seconds 330 msec

Best regards,

Shaofeng Shi 史少锋
