[jira] [Comment Edited] (YARN-6470) Node manager offheap memory leak
[ https://issues.apache.org/jira/browse/YARN-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965697#comment-15965697 ]

vishal.rajan edited comment on YARN-6470 at 4/17/17 4:37 AM:
-------------------------------------------------------------
Would be great if someone could take a look at this issue.

was (Author: vishal.rajan):
yarn-site config

> Node manager offheap memory leak
> --------------------------------
>
>                 Key: YARN-6470
>                 URL: https://issues.apache.org/jira/browse/YARN-6470
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.7.1
>            Reporter: vishal.rajan
>         Attachments: heap, test, yarn-site.xml
>
> In our production environment we are seeing the NM process leaking memory; this
> is off-heap memory. The NM JVM itself appears to stay within its Xmx limit.
>
> output of top
> PID   USER  PR  NI  VIRT   RES  SHR   S  %CPU  %MEM  TIME+     COMMAND
> 32668 yarn  20  0   18.1g  16g  7244  S  19    17.0  55659:03  java
>
> output of ps
> yarn 32668 1 33 2016 ? 38-15:39:18
> /usr/lib/jvm/jdk-8-oracle-x64/bin/java -Dproc_nodemanager -Xmx4096m
> -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/grid/1/log/hadoop-yarn/yarn
>
> OS: debian7
> java version: oracle java8u25.
> YARN nodemanager recovery is enabled.
> The process needs to run for about a month for this leak to occur. Please let us
> know if more stats/info are required.
> Have attached smaps and heapdump files.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
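For a leak like the one reported here, the gap between -Xmx and resident set size usually lives in anonymous mappings. A minimal sketch (not from the ticket; `sample` is a made-up two-mapping excerpt in `/proc/<pid>/smaps` format) that sums the Rss of anonymous regions, which is one way to read the attached smaps dump:

```python
import re

def total_anon_rss_kb(smaps_text):
    """Sum the Rss (in kB) of anonymous mappings in smaps-format text.

    A mapping is treated as anonymous when its header line has no file path
    (empty trailing field) or a pseudo-path like [heap] / [anon].
    """
    total = 0
    current_is_anon = False
    for line in smaps_text.splitlines():
        # Mapping header: "<start>-<end> perms offset dev inode [path]"
        m = re.match(r'^[0-9a-f]+-[0-9a-f]+\s+\S+\s+\S+\s+\S+\s+\S+\s*(.*)$', line)
        if m:
            path = m.group(1).strip()
            current_is_anon = (path == '' or path.startswith('['))
            continue
        if line.startswith('Rss:') and current_is_anon:
            total += int(line.split()[1])  # second field is the kB value
    return total

sample = """00400000-00401000 r-xp 00000000 08:01 123 /usr/bin/java
Rss:                   4 kB
7f0000000000-7f0000400000 rw-p 00000000 00:00 0 
Rss:                4096 kB
"""
print(total_anon_rss_kb(sample))  # only the file-less mapping counts: 4096
```

Running this periodically against a live NM's /proc/<pid>/smaps and watching the anonymous total climb past the heap ceiling is a cheap way to confirm the leak is off-heap.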
[jira] [Issue Comment Deleted] (YARN-6470) Node manager offheap memory leak
[ https://issues.apache.org/jira/browse/YARN-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vishal.rajan updated YARN-6470:
-------------------------------
    Comment: was deleted

(was: jvm heap dump)
[jira] [Comment Edited] (YARN-6470) Node manager offheap memory leak
[ https://issues.apache.org/jira/browse/YARN-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965661#comment-15965661 ]

vishal.rajan edited comment on YARN-6470 at 4/17/17 4:36 AM:
-------------------------------------------------------------
smaps file, jvm heap dump and yarn-site attached.

was (Author: vishal.rajan):
smaps file attached.
[jira] [Updated] (YARN-6470) Node manager offheap memory leak
[ https://issues.apache.org/jira/browse/YARN-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vishal.rajan updated YARN-6470:
-------------------------------
    Attachment: yarn-site.xml

yarn-site config
[jira] [Updated] (YARN-6470) Node manager offheap memory leak
[ https://issues.apache.org/jira/browse/YARN-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vishal.rajan updated YARN-6470:
-------------------------------
    Summary: Node manager offheap memory leak  (was: node manager offheap memory leak)
[jira] [Updated] (YARN-6470) node manager offheap memory leak
[ https://issues.apache.org/jira/browse/YARN-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vishal.rajan updated YARN-6470:
-------------------------------
    Description:
In our production environment we are seeing the NM process leaking memory; this is off-heap memory. The NM JVM itself appears to stay within its Xmx limit.

output of top
PID   USER  PR  NI  VIRT   RES  SHR   S  %CPU  %MEM  TIME+     COMMAND
32668 yarn  20  0   18.1g  16g  7244  S  19    17.0  55659:03  java

output of ps
yarn 32668 1 33 2016 ? 38-15:39:18 /usr/lib/jvm/jdk-8-oracle-x64/bin/java -Dproc_nodemanager -Xmx4096m -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/grid/1/log/hadoop-yarn/yarn

OS: debian7
java version: oracle java8u25.
YARN nodemanager recovery is enabled.
The process needs to run for about a month for this leak to occur. Please let us know if more stats/info are required.
Have attached smaps and heapdump files.

  was:
In our production environment we are seeing the NM process leaking memory; this is off-heap memory. The NM JVM itself appears to stay within its Xmx limit.

output of top
PID   USER  PR  NI  VIRT   RES  SHR   S  %CPU  %MEM  TIME+     COMMAND
32668 yarn  20  0   18.1g  16g  7244  S  19    17.0  55659:03  java

output of ps
yarn 32668 1 33 2016 ? 38-15:39:18 /usr/lib/jvm/jdk-8-oracle-x64/bin/java -Dproc_nodemanager -Xmx4096m -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/grid/1/log/hadoop-yarn/yarn

we use debian 7 OS and our java version is java8u25.
YARN nodemanager recovery is enabled.
The process needs to run for about a month for this leak to occur. Please let us know if more stats/info are required.
Have attached smaps and heapdump files.
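The top and ps output in the description already quantify the leak: the process RES far exceeds the JVM heap ceiling. A back-of-the-envelope check of that gap, as a small sketch (the command line and the 16g RES figure are copied from the report; the parsing helper is illustrative, not a YARN utility):

```python
import re

def xmx_bytes(cmdline):
    """Extract the -Xmx limit, in bytes, from a JVM command line (None if absent)."""
    m = re.search(r'-Xmx(\d+)([kKmMgG]?)', cmdline)
    if not m:
        return None
    mult = {'': 1, 'k': 1024, 'm': 1024 ** 2, 'g': 1024 ** 3}[m.group(2).lower()]
    return int(m.group(1)) * mult

cmd = "/usr/lib/jvm/jdk-8-oracle-x64/bin/java -Dproc_nodemanager -Xmx4096m"
heap_cap = xmx_bytes(cmd)        # 4 GiB heap ceiling from the NM command line
rss = 16 * 1024 ** 3             # ~16g RES reported by top
off_heap_gib = (rss - heap_cap) / 1024 ** 3
print(off_heap_gib)              # roughly 12 GiB unaccounted for by the Java heap
```

Anything close to this large that is not heap must be native: direct buffers, mapped files, malloc arenas, or native-library allocations such as LevelDB's.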
[jira] [Updated] (YARN-6470) node manager offheap memory leak
[ https://issues.apache.org/jira/browse/YARN-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vishal.rajan updated YARN-6470:
-------------------------------
    Description:
In our production environment we are seeing the NM process leaking memory; this is off-heap memory. The NM JVM itself appears to stay within its Xmx limit.

output of top
PID   USER  PR  NI  VIRT   RES  SHR   S  %CPU  %MEM  TIME+     COMMAND
32668 yarn  20  0   18.1g  16g  7244  S  19    17.0  55659:03  java

output of ps
yarn 32668 1 33 2016 ? 38-15:39:18 /usr/lib/jvm/jdk-8-oracle-x64/bin/java -Dproc_nodemanager -Xmx4096m -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/grid/1/log/hadoop-yarn/yarn

we use debian 7 OS and our java version is java8u25.
YARN nodemanager recovery is enabled.
The process needs to run for about a month for this leak to occur. Please let us know if more stats/info are required.
Have attached smaps and heapdump files.

  was:
In our production environment we are seeing the NM process leaking memory; this is off-heap memory. The NM JVM itself appears to stay within its Xmx limit.

output of top
PID   USER  PR  NI  VIRT   RES  SHR   S  %CPU  %MEM  TIME+     COMMAND
32668 yarn  20  0   18.1g  16g  7244  S  19    17.0  55659:03  java

output of ps
yarn 32668 1 33 2016 ? 38-15:39:18 /usr/lib/jvm/jdk-8-oracle-x64/bin/java -Dproc_nodemanager -Xmx4096m -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/grid/1/log/hadoop-yarn/yarn

we use debian 7 OS and our java version is java8u25.
YARN nodemanager recovery is enabled.
Have attached smaps and heapdump file.
[jira] [Updated] (YARN-6470) node manager offheap memory leak
[ https://issues.apache.org/jira/browse/YARN-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vishal.rajan updated YARN-6470:
-------------------------------
    Attachment: heap

jvm heap dump
[jira] [Updated] (YARN-6470) node manager offheap memory leak
[ https://issues.apache.org/jira/browse/YARN-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vishal.rajan updated YARN-6470:
-------------------------------
    Attachment: test

smaps file attached.
[jira] [Created] (YARN-6470) node manager offheap memory leak
vishal.rajan created YARN-6470:
----------------------------------

             Summary: node manager offheap memory leak
                 Key: YARN-6470
                 URL: https://issues.apache.org/jira/browse/YARN-6470
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn
    Affects Versions: 2.7.1
            Reporter: vishal.rajan

In our production environment we are seeing the NM process leaking memory; this is off-heap memory. The NM JVM itself appears to stay within its Xmx limit.

output of top
PID   USER  PR  NI  VIRT   RES  SHR   S  %CPU  %MEM  TIME+     COMMAND
32668 yarn  20  0   18.1g  16g  7244  S  19    17.0  55659:03  java

output of ps
yarn 32668 1 33 2016 ? 38-15:39:18 /usr/lib/jvm/jdk-8-oracle-x64/bin/java -Dproc_nodemanager -Xmx4096m -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/grid/1/log/hadoop-yarn/yarn

we use debian 7 OS and our java version is java8u25.
YARN nodemanager recovery is enabled.
Have attached smaps and heapdump file.
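The report notes that NM recovery is enabled; per the reporter's later comment, the recovery state is stored in LevelDB, whose block cache and buffers are native (off-heap) allocations and so plausibly relevant to this leak. An illustrative yarn-site.xml fragment with the two standard recovery properties (the directory value is an example, not taken from the attached yarn-site.xml):

```xml
<!-- Illustrative fragment only: enables NM recovery; state is kept in a
     LevelDB store under the recovery directory. -->
<property>
  <name>yarn.nodemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.recovery.dir</name>
  <!-- example path; point this at a local directory on the NM host -->
  <value>/grid/1/yarn/nm-recovery</value>
</property>
```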
[jira] [Comment Edited] (YARN-6017) node manager physical memory leak
[ https://issues.apache.org/jira/browse/YARN-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965613#comment-15965613 ]

vishal.rajan edited comment on YARN-6017 at 4/12/17 9:39 AM:
-------------------------------------------------------------
This doesn't seem like a java issue, as we had updated java8u25 to java8u90. [~bibinchundatt] would be great if you could help.

was (Author: vishal.rajan):
This doesn't seem like a java issue, as we had updated java8u25 to java8u90. [~bibinchundatt] would be great if you could update your findings.

> node manager physical memory leak
> ---------------------------------
>
>                 Key: YARN-6017
>                 URL: https://issues.apache.org/jira/browse/YARN-6017
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.7.1
>         Environment: OS:
> Linux guomai124041 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
> jvm:
> java version "1.7.0_65"
> Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
> Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
>            Reporter: chenrongwei
>         Attachments: 31169_smaps.txt, 31169_smaps.txt
>
> In our production environment, the node manager's JVM heap has been set to
> '-Xmx2048m', but we notice that after running for a long time the process's
> actual physical memory had reached 12g (we got this value from the top
> command, as follows).
> PID   USER  PR  NI  VIRT   RES  SHR   S  %CPU  %MEM  TIME+     COMMAND
> 31169 data  20  0   13.2g  12g  6092  S  16.9  13.0  49183:13  java
>
> 31169: /usr/local/jdk/bin/java -Dproc_nodemanager -Xmx2048m
> -Dhadoop.log.dir=/home/data/programs/apache-hadoop-2.7.1/logs
> -Dyarn.log.dir=/home/data/programs/apache-hadoop-2.7.1/logs
> -Dhadoop.log.file=yarn-data-nodemanager.log
> -Dyarn.log.file=yarn-data-nodemanager.log -Dyarn.home.dir= -Dyarn.id.str=data
> -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA
> -Djava.library.path=/home/data/programs/apache-hadoop-2.7.1/lib/native
> -Dyarn.policy.file=hadoop-policy.xml -XX:PermSize=128M -XX:MaxPermSize=256M
> -XX:+UseC
>
> Address Kbytes Mode Offset Device Mapping
> 0040 4 r-x-- 008:1 java
> 0060 4 rw--- 008:1 java
> 00601000 10094936 rw--- 000:0 [ anon ]
> 00077000 2228224 rw--- 000:0 [ anon ]
> 0007f800 131072 rw--- 000:0 [ anon ]
> 00325ee0 128 r-x-- 008:1 ld-2.12.so
> 00325f01f000 4 r 0001f000 008:1 ld-2.12.so
> 00325f02 4 rw--- 0002 008:1 ld-2.12.so
> 00325f021000 4 rw--- 000:0 [ anon ]
> 00325f201576 r-x-- 008:1 libc-2.12.so
> 00325f38a0002048 - 0018a000 008:1 libc-2.12.so
> 00325f58a000 16 r 0018a000 008:1 libc-2.12.so
> 00325f58e000 4 rw--- 0018e000 008:1 libc-2.12.so
> 00325f58f000 20 rw--- 000:0 [ anon ]
> 00325f60 92 r-x-- 008:1 libpthread-2.12.so
> 00325f6170002048 - 00017000 008:1 libpthread-2.12.so
> 00325f817000 4 r 00017000 008:1 libpthread-2.12.so
> 00325f818000 4 rw--- 00018000 008:1 libpthread-2.12.so
> 00325f819000 16 rw--- 000:0 [ anon ]
> 00325fa0 8 r-x-- 008:1 libdl-2.12.so
> 00325fa020002048 - 2000 008:1 libdl-2.12.so
> 00325fc02000 4 r 2000 008:1 libdl-2.12.so
> 00325fc03000 4 rw--- 3000 008:1 libdl-2.12.so
> 00325fe0 28 r-x-- 008:1 librt-2.12.so
> 00325fe070002044 - 7000 008:1 librt-2.12.so
> 003260006000 4 r 6000 008:1 librt-2.12.so
> 003260007000 4 rw--- 7000 008:1 librt-2.12.so
> 00326020 524 r-x-- 008:1 libm-2.12.so
> 0032602830002044 - 00083000 008:1 libm-2.12.so
> 003260482000 4 r 00082000 008:1 libm-2.12.so
> 003260483000 4 rw--- 00083000 008:1 libm-2.12.so
> 00326120 88 r-x-- 008:1 libresolv-2.12.so
> 0032612160002048 - 00016000 008:1 libresolv-2.12.so
> 003261416000 4 r 00016000 008:1 libresolv-2.12.so
> 003261417000 4 rw--- 00017000 008:1 libresolv-2.12.so

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
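When reading a pmap listing like the one quoted above, aggregating sizes by mapping name shows at a glance whether the growth is in [ anon ] regions (malloc arenas, direct buffers, native-library allocations) or in file-backed mappings. A small sketch, not a YARN tool; the `sample` lines are simplified stand-ins for the dump above, since several addresses in the quoted listing are corrupted:

```python
from collections import defaultdict

def top_mappings(pmap_lines):
    """Aggregate 'address kbytes mode ... mapping' lines by mapping name, largest first."""
    totals = defaultdict(int)
    for line in pmap_lines:
        parts = line.split()
        # Skip headers and blank lines: the second field must be a size in kB.
        if len(parts) < 3 or not parts[1].isdigit():
            continue
        name = '[ anon ]' if '[ anon ]' in line else parts[-1]
        totals[name] += int(parts[1])
    return sorted(totals.items(), key=lambda kv: -kv[1])

sample = [
    "Address           Kbytes Mode  Mapping",
    "0000000000601000 10094936 rw---  [ anon ]",
    "0000000077000000  2228224 rw---  [ anon ]",
    "00000000325f2000     1576 r-x--  libc-2.12.so",
]
print(top_mappings(sample)[0])  # the dominant consumer is the anonymous total
```

In the original dump the same pattern holds: the ~10 GB and ~2 GB [ anon ] regions dwarf every shared library, which is consistent with a native (non-heap) leak rather than loaded-code growth.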
[jira] [Commented] (YARN-6017) node manager physical memory leak
[ https://issues.apache.org/jira/browse/YARN-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965613#comment-15965613 ]

vishal.rajan commented on YARN-6017:
------------------------------------
This doesn't seem like a java issue, as we had updated java8u25 to java8u90. [~bibinchundatt] would be great if you could update your findings.
[jira] [Commented] (YARN-6017) node manager physical memory leak
[ https://issues.apache.org/jira/browse/YARN-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965612#comment-15965612 ]

vishal.rajan commented on YARN-6017:
------------------------------------
We are seeing a similar issue with yarn node managers. We have enabled yarn.nodemanager.recovery.enabled, with the recovery state stored in leveldb.

java version: java8u25
hadoop version: 2.7.1
node manager xmx: 4096m

PID   USER  PR  NI  VIRT   RES  SHR   S  %CPU  %MEM  TIME+     COMMAND
15599 yarn  20  0   13.2g  10g  4176  S  3     20.3  31961:10  java
[jira] [Updated] (YARN-2624) Resource Localization fails on a cluster due to existing cache directories
[ https://issues.apache.org/jira/browse/YARN-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vishal.rajan updated YARN-2624:
-------------------------------
    Target Version/s:   (was: 2.6.0)
   Affects Version/s: 2.6.0

Resource Localization fails on a cluster due to existing cache directories
---------------------------------------------------------------------------

                 Key: YARN-2624
                 URL: https://issues.apache.org/jira/browse/YARN-2624
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
    Affects Versions: 2.6.0, 2.5.1
            Reporter: Anubhav Dhoot
            Assignee: Anubhav Dhoot
            Priority: Blocker
             Fix For: 2.6.0
         Attachments: YARN-2624.001.patch, YARN-2624.001.patch

We have found resource localization fails on a cluster with the following error in certain cases.
{noformat}
INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Failed to download rsrc { { hdfs://blahhostname:8020/tmp/hive-hive/hive_2014-09-29_14-55-45_184_6531377394813896912-12/-mr-10004/95a07b90-2448-48fc-bcda-cdb7400b4975/map.xml, 1412027745352, FILE, null },pending,[(container_1411670948067_0009_02_01)],443533288192637,DOWNLOADING}
java.io.IOException: Rename cannot overwrite non empty destination directory /data/yarn/nm/filecache/27
at org.apache.hadoop.fs.AbstractFileSystem.renameInternal(AbstractFileSystem.java:716)
at org.apache.hadoop.fs.FilterFs.renameInternal(FilterFs.java:228)
at org.apache.hadoop.fs.AbstractFileSystem.rename(AbstractFileSystem.java:659)
at org.apache.hadoop.fs.FileContext.rename(FileContext.java:906)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:366)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:59)
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-2624) Resource Localization fails on a cluster due to existing cache directories
[ https://issues.apache.org/jira/browse/YARN-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390319#comment-14390319 ]

vishal.rajan commented on YARN-2624:
------------------------------------
please verify and reopen the jira
[jira] [Commented] (YARN-2624) Resource Localization fails on a cluster due to existing cache directories
[ https://issues.apache.org/jira/browse/YARN-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390305#comment-14390305 ]

vishal.rajan commented on YARN-2624:
------------------------------------
seems like this issue still persists in yarn 2.6.0 under certain conditions. Dump of the log relating to this issue:

15/04/01 12:13:20 ERROR test.Job: Task error: Rename cannot overwrite non empty destination directory /grid/6/yarn/local/usercache/azkaban/filecache/344860
java.io.IOException: Rename cannot overwrite non empty destination directory /grid/6/yarn/local/usercache/azkaban/filecache/344860
at org.apache.hadoop.fs.AbstractFileSystem.renameInternal(AbstractFileSystem.java:716)
at org.apache.hadoop.fs.FilterFs.renameInternal(FilterFs.java:228)
at org.apache.hadoop.fs.AbstractFileSystem.rename(AbstractFileSystem.java:659)
at org.apache.hadoop.fs.FileContext.rename(FileContext.java:909)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
=====
yarn version: hadoop-2-2-0-0-2041-yarn 2.6.0.2.2.0.0-2041
=====
This node was taken OOR for maintenance, and when it was added back to the cluster, it seems this 344860 directory was not removed before being assigned to the new container.
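The IOException in these comments is the platform rename refusing to overwrite a non-empty directory, which is why a stale filecache entry left behind after maintenance blocks localization until it is cleaned up. A self-contained sketch of the same failure mode using POSIX rename semantics (temporary paths, not YARN's actual filecache layout):

```python
import os
import tempfile

# Recreate the failure: rename a freshly downloaded dir onto a destination
# that still contains a leftover file from a previous run.
base = tempfile.mkdtemp()
src = os.path.join(base, 'tmp_download')
dst = os.path.join(base, 'filecache_344860')  # stands in for the stale cache dir
os.mkdir(src)
os.mkdir(dst)
open(os.path.join(dst, 'leftover'), 'w').close()  # file surviving from before

rename_failed = False
try:
    os.rename(src, dst)  # same semantics FSDownload hits via FileContext.rename
except OSError as e:
    rename_failed = True
    print('rename failed:', e.strerror)  # e.g. "Directory not empty" on Linux

# Clearing the destination first lets the rename go through,
# which is effectively what removing the stale directory achieves.
os.remove(os.path.join(dst, 'leftover'))
os.rmdir(dst)
os.rename(src, dst)
```

This also explains why the problem only appears "under certain conditions": the rename succeeds whenever the destination number has never been used, and fails only when a prior incarnation of the directory survived.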