[jira] [Commented] (HADOOP-14176) distcp reports beyond physical memory limits on 2.X

Ravi Prakash (JIRA) Thu, 16 Mar 2017 13:42:02 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-14176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15928858#comment-15928858
 ]


Ravi Prakash commented on HADOOP-14176:
---------------------------------------

Just FYI: [~yzhangal] is working on HADOOP-11794. I think it's orthogonal, 
but it's always good to know (since that's a much bigger change). Maybe Yongjun 
can also chime in on this change.

From my reading of the patch:
1. {{mapred.job.map.memory.mb}} has been changed to 
{{mapreduce.map.memory.mb}}. This will likely remove a "deprecated 
configuration parameter" warning. Good!
2. Thanks to Joep for pointing out that there are 0 reducers, so removing 
{{mapred.job.reduce.memory.mb}} and {{mapreduce.reduce.class}} should have no 
effect. Great!
3. I'm having a tougher time tracing the effects of removing 
{{mapred.reducer.new-api}}. Is it being passed to {{JobImpl.transition}}?
4. Unfortunately I don't think setting {{mapreduce.map.java.opts}} to 
{{-Xmx768m}} is the right thing to do. It may cause many jobs that used to run 
(e.g. those using heaps of 769 MB or more) to fail with an OutOfMemoryError. In 
the past, when we have faced this choice, I have preferred to increase 
resource usage (leading to under-utilization of cluster resources) rather than 
risk failing jobs that currently work fine.
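
The concern in point 4 can be sketched as a small check. This is only an illustration, not code from the patch: the helper names {{parse_xmx_mb}} and {{heap_is_sufficient}} are hypothetical, and the parser is simplified to handle only {{-Xmx}} values with a k, m or g suffix.

```python
import re


def parse_xmx_mb(java_opts):
    """Return the -Xmx value in MB, or None if no suffixed -Xmx flag is found.

    Simplification: only values with a k/m/g suffix are recognised.
    """
    match = re.search(r"-Xmx(\d+)([kKmMgG])", java_opts)
    if match is None:
        return None
    value = int(match.group(1))
    unit = match.group(2).lower()
    factor = {"k": 1 / 1024, "m": 1, "g": 1024}[unit]
    return value * factor


def heap_is_sufficient(peak_heap_mb, java_opts):
    """True if a task needing peak_heap_mb of heap fits under the -Xmx cap."""
    xmx_mb = parse_xmx_mb(java_opts)
    # No -Xmx flag: the JVM picks its own default, so nothing can be concluded.
    if xmx_mb is None:
        return True
    return peak_heap_mb <= xmx_mb


if __name__ == "__main__":
    # A hypothetical task that needs 769 MB of heap.
    print(heap_is_sufficient(769, "-Xmx768m"))   # capped by the proposed default
    print(heap_is_sufficient(769, "-Xmx2120m"))  # the heap seen in the reported log
```

Under these assumptions, a task that ran fine with a larger site-wide heap would no longer fit once the 768 MB cap applies, which is the regression risk described in point 4.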

Opinions?

> distcp reports beyond physical memory limits on 2.X
> ---------------------------------------------------
>
>                 Key: HADOOP-14176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14176
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: tools/distcp
>    Affects Versions: 2.9.0
>            Reporter: Fei Hui
>            Assignee: Fei Hui
>         Attachments: HADOOP-14176-branch-2.001.patch, 
> HADOOP-14176-branch-2.002.patch, HADOOP-14176-branch-2.003.patch
>
>
> When I run distcp, I get errors like the following:
> {quote}
> 17/02/21 15:31:18 INFO mapreduce.Job: Task Id : 
> attempt_1487645941615_0037_m_000003_0, Status : FAILED
> Container [pid=24661,containerID=container_1487645941615_0037_01_000005] is 
> running beyond physical memory limits. Current usage: 1.1 GB of 1 GB physical 
> memory used; 4.0 GB of 5 GB virtual memory used. Killing container.
> Dump of the process-tree for container_1487645941615_0037_01_000005 :
>         |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>         |- 24661 24659 24661 24661 (bash) 0 0 108650496 301 /bin/bash -c 
> /usr/lib/jvm/java/bin/java -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN  -Xmx2120m 
> -Djava.io.tmpdir=/mnt/disk4/yarn/usercache/hadoop/appcache/application_1487645941615_0037/container_1487645941615_0037_01_000005/tmp
>  -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_000005
>  -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
> -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.1.208 
> 44048 attempt_1487645941615_0037_m_000003_0 5 
> 1>/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_000005/stdout
>  
> 2>/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_000005/stderr
>         |- 24665 24661 24661 24661 (java) 1766 336 4235558912 280699 
> /usr/lib/jvm/java/bin/java -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN -Xmx2120m 
> -Djava.io.tmpdir=/mnt/disk4/yarn/usercache/hadoop/appcache/application_1487645941615_0037/container_1487645941615_0037_01_000005/tmp
>  -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_000005
>  -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
> -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.1.208 
> 44048 attempt_1487645941615_0037_m_000003_0 5
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
> {quote}
> Digging into the code, I found that this is because the distcp configuration 
> overrides mapred-site.xml:
> {code}
>     <property>
>         <name>mapred.job.map.memory.mb</name>
>         <value>1024</value>
>     </property>
>     <property>
>         <name>mapred.job.reduce.memory.mb</name>
>         <value>1024</value>
>     </property>
> {code}
> When mapreduce.map.java.opts and mapreduce.map.memory.mb are set in 
> mapred-default.xml, and their values are larger than those set in 
> distcp-default.xml, the error may occur.
> We should remove those two configurations from distcp-default.xml.
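
The "1.1 GB of 1 GB" figure in the quoted log can be roughly reproduced from the RSSMEM_USAGE(PAGES) column of the process-tree dump. The sketch below assumes a 4096-byte page size, which the log does not state, and the helper names are hypothetical.

```python
PAGE_SIZE_BYTES = 4096  # assumption: page size is not given in the log
MB = 1024 * 1024


def rss_bytes(rss_pages, page_size=PAGE_SIZE_BYTES):
    """Convert a resident-set size in pages to bytes."""
    return rss_pages * page_size


def exceeds_limit(rss_pages, limit_mb, page_size=PAGE_SIZE_BYTES):
    """True if the resident set is larger than the container limit in MB."""
    return rss_bytes(rss_pages, page_size) > limit_mb * MB


# Page counts taken from the dump: 301 for the bash wrapper, 280699 for the JVM.
total_pages = 301 + 280699
total_gb = rss_bytes(total_pages) / (1024 * MB)

if __name__ == "__main__":
    print(f"resident set: {total_gb:.2f} GB")
    print("over 1024 MB limit:", exceeds_limit(total_pages, 1024))
```

With the 1024 MB container size from the distcp defaults and a JVM started with -Xmx2120m, the heap alone is allowed to grow past the container limit, which is consistent with the kill reported above.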



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
