[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-25 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601073#comment-14601073 ]

Hudson commented on YARN-3832:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #969 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/969/])
YARN-3832. Resource Localization fails on a cluster due to existing cache directories. Contributed by Brahma Reddy Battula (jlowe: rev 8d58512d6e6d9fe93784a9de2af0056bcc316d96)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java


 Resource Localization fails on a cluster due to existing cache directories
 --

 Key: YARN-3832
 URL: https://issues.apache.org/jira/browse/YARN-3832
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.7.0
Reporter: Ranga Swamy
Assignee: Brahma Reddy Battula
Priority: Critical
 Fix For: 2.7.1

 Attachments: YARN-3832.patch


  *We have found that resource localization fails on a cluster with the following error.*
  
 We hit this error in the hadoop-2.7.0 release, even though it was fixed in 2.6.0 (YARN-2624).
 {noformat}
 Application application_1434703279149_0057 failed 2 times due to AM Container for appattempt_1434703279149_0057_02 exited with exitCode: -1000
 For more detailed output, check application tracking page:http://S0559LDPag68:45020/cluster/app/application_1434703279149_0057Then, click on links to logs of each attempt.
 Diagnostics: Rename cannot overwrite non empty destination directory /opt/hdfsdata/HA/nmlocal/usercache/root/filecache/39
 java.io.IOException: Rename cannot overwrite non empty destination directory /opt/hdfsdata/HA/nmlocal/usercache/root/filecache/39
 at org.apache.hadoop.fs.AbstractFileSystem.renameInternal(AbstractFileSystem.java:735)
 at org.apache.hadoop.fs.FilterFs.renameInternal(FilterFs.java:244)
 at org.apache.hadoop.fs.AbstractFileSystem.rename(AbstractFileSystem.java:678)
 at org.apache.hadoop.fs.FileContext.rename(FileContext.java:958)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:366)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
 Failing this attempt. Failing the application.
 {noformat}
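
 The failure mode in the trace above can be reproduced outside Hadoop. The following is only a minimal sketch in Python, not the FSDownload code: the directory names `39_tmp` and `39` and the helper names are hypothetical. It assumes the localizer stages a download in a temporary directory and then renames it into the cache slot, and it shows that such a rename is refused while the slot still holds leftover files.

```python
import os
import shutil
import tempfile


def rename_into_cache(src_dir, dest_dir):
    """Move a downloaded directory into its cache slot; report success."""
    try:
        os.rename(src_dir, dest_dir)
        return True
    except OSError:
        # Renaming a directory onto a non-empty directory is refused by the OS.
        return False


def demo_stale_rename():
    root = tempfile.mkdtemp()
    staged = os.path.join(root, "39_tmp")  # hypothetical staging directory
    slot = os.path.join(root, "39")        # cache slot left over from an earlier run
    for directory, name in ((staged, "job.jar"), (slot, "stale.jar")):
        os.makedirs(directory)
        with open(os.path.join(directory, name), "w") as handle:
            handle.write("x")

    blocked = rename_into_cache(staged, slot)    # stale contents block the rename
    shutil.rmtree(slot)                          # clear the leftover slot
    succeeded = rename_into_cache(staged, slot)  # the same rename now goes through
    shutil.rmtree(root)
    return blocked, succeeded
```

 Calling `demo_stale_rename()` returns one flag for the blocked attempt and one for the retry after the leftover slot is deleted.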



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-25 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601054#comment-14601054 ]

Hudson commented on YARN-3832:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #239 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/239/])
YARN-3832. Resource Localization fails on a cluster due to existing cache directories. Contributed by Brahma Reddy Battula (jlowe: rev 8d58512d6e6d9fe93784a9de2af0056bcc316d96)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* hadoop-yarn-project/CHANGES.txt




[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-25 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601412#comment-14601412 ]

Hudson commented on YARN-3832:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #228 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/228/])
YARN-3832. Resource Localization fails on a cluster due to existing cache directories. Contributed by Brahma Reddy Battula (jlowe: rev 8d58512d6e6d9fe93784a9de2af0056bcc316d96)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* hadoop-yarn-project/CHANGES.txt




[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-25 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601372#comment-14601372 ]

Hudson commented on YARN-3832:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2167 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2167/])
YARN-3832. Resource Localization fails on a cluster due to existing cache directories. Contributed by Brahma Reddy Battula (jlowe: rev 8d58512d6e6d9fe93784a9de2af0056bcc316d96)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java




[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-25 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601448#comment-14601448 ]

Hudson commented on YARN-3832:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #237 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/237/])
YARN-3832. Resource Localization fails on a cluster due to existing cache directories. Contributed by Brahma Reddy Battula (jlowe: rev 8d58512d6e6d9fe93784a9de2af0056bcc316d96)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java




[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-25 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601466#comment-14601466 ]

Hudson commented on YARN-3832:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2185 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2185/])
YARN-3832. Resource Localization fails on a cluster due to existing cache directories. Contributed by Brahma Reddy Battula (jlowe: rev 8d58512d6e6d9fe93784a9de2af0056bcc316d96)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* hadoop-yarn-project/CHANGES.txt




[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-24 Thread Brahma Reddy Battula (JIRA)

[ https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599025#comment-14599025 ]

Brahma Reddy Battula commented on YARN-3832:


[~jlowe] Attached a patch that also cleans up full directories. Thanks for 
your pointers. I checked manually, and deletion now happens for full directories as well.
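
The idea behind this change can be illustrated with a small sketch. This is not the patch itself (the real change is in ResourceLocalizationService.java); it is a simplified Python model under stated assumptions: the function names, the `good_dirs`/`full_dirs` split, and the list of subdirectories are hypothetical and chosen only for illustration. It shows a startup cleanup that walks the directories flagged as full in addition to the healthy ones, so that no stale cache tree survives to block a later rename.

```python
import os
import shutil
import tempfile

# Illustrative subdirectory names; only usercache and filecache appear in the report.
CACHE_SUBDIRS = ("usercache", "filecache")


def cleanup_local_dirs(good_dirs, full_dirs):
    """Delete leftover cache trees from every local dir, including full ones."""
    removed = []
    for local_dir in list(good_dirs) + list(full_dirs):
        for sub in CACHE_SUBDIRS:
            path = os.path.join(local_dir, sub)
            if os.path.isdir(path):
                shutil.rmtree(path)
                removed.append(path)
    return removed


def demo_cleanup():
    # Two hypothetical local dirs: one healthy, one flagged as full.
    good, full = tempfile.mkdtemp(), tempfile.mkdtemp()
    for local_dir in (good, full):
        os.makedirs(os.path.join(local_dir, "usercache", "root", "filecache", "39"))
    removed = cleanup_local_dirs([good], [full])
    leftovers = [d for d in (good, full) if os.listdir(d)]
    shutil.rmtree(good)
    shutil.rmtree(full)
    return len(removed), leftovers
```

In this model, passing only the healthy directories would leave the tree under the full directory in place, which is the situation the comment above describes.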



[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-24 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599103#comment-14599103 ]

Hadoop QA commented on YARN-3832:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 13s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear to include any new or modified tests.  Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac |   7m 47s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 42s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 37s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 13s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m  7s | Tests passed in hadoop-yarn-server-nodemanager. |
| | |  44m 16s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12741464/YARN-3832.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 2ba6465 |
| hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8331/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8331/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8331/console |


This message was automatically generated.



[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-24 Thread Jason Lowe (JIRA)

[ https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599694#comment-14599694 ]

Jason Lowe commented on YARN-3832:
--

+1 lgtm.  Committing this.



[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-24 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599723#comment-14599723 ]

Hudson commented on YARN-3832:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8058 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8058/])
YARN-3832. Resource Localization fails on a cluster due to existing cache directories. Contributed by Brahma Reddy Battula (jlowe: rev 8d58512d6e6d9fe93784a9de2af0056bcc316d96)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java




[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-24 Thread Brahma Reddy Battula (JIRA)

[ https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599713#comment-14599713 ]

Brahma Reddy Battula commented on YARN-3832:


[~jlowe] Thanks a lot for the review and commit.



[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-23 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597805#comment-14597805
 ] 

Brahma Reddy Battula commented on YARN-3832:


[~jlowe] Sorry for the late reply. After looking into the logs, I found that 
*the disk was declared bad (since it had reached 90% usage) and the node became 
unhealthy*:

{noformat}
2015-06-19 04:39:18,498 WARN 
org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection: Directory 
/opt/hdfsdata/HA/nmlocal error, used space above threshold of 90.0%, removing 
from list of valid directories
2015-06-19 04:39:18,498 WARN 
org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection: Directory 
/opt/hdfsdata/HA/nmlog error, used space above threshold of 90.0%, removing 
from list of valid directories
2015-06-19 04:39:18,498 INFO 
org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService: Disk(s) 
failed: 1/1 local-dirs are bad: /opt/hdfsdata/HA/nmlocal; 1/1 log-dirs are bad: 
/opt/hdfsdata/HA/nmlog
2015-06-19 04:39:18,499 ERROR 
org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService: Most of the 
disks failed. 1/1 local-dirs are bad: /opt/hdfsdata/HA/nmlocal; 1/1 log-dirs 
are bad: /opt/hdfsdata/HA/nmlog
{noformat}

After the NM was restarted, those disks turned good again:

2015-06-19 04:47:18,765 INFO 
org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService: Disk(s) 
turned good: 1/1 local-dirs are good: /opt/hdfsdata/HA/nmlocal; 1/1 log-dirs 
are good: /opt/hdfsdata/HA/nmlog..
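
For readers unfamiliar with the check behind the "used space above threshold of 90.0%" warnings, here is a minimal sketch of a utilization test of that kind. It is not the DirectoryCollection implementation: the class name, method name and the handling of an unreadable disk are all hypothetical, and only the 90.0 cutoff is taken from the log lines above.

```java
import java.io.File;

/** Toy disk-utilization check; not the NodeManager implementation. */
class DiskThresholdSketch {

  /** True when the used share of the disk, in percent, exceeds the cutoff. */
  static boolean isOverThreshold(long totalBytes, long usableBytes, double cutoffPercent) {
    if (totalBytes <= 0) {
      // Hypothetical choice: treat a disk whose size cannot be read as unusable.
      return true;
    }
    double usedPercent = 100.0 - (100.0 * usableBytes / totalBytes);
    return usedPercent > cutoffPercent;
  }

  public static void main(String[] args) {
    // Check the directory given on the command line, or the current directory.
    File dir = new File(args.length > 0 ? args[0] : ".");
    boolean bad = isOverThreshold(dir.getTotalSpace(), dir.getUsableSpace(), 90.0);
    System.out.println(dir + " over 90% threshold: " + bad);
  }
}
```

A disk that fails such a check drops out of the list of valid directories until usage falls again, which matches the bad-then-good sequence in the logs.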





[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-23 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597873#comment-14597873
 ] 

Jason Lowe commented on YARN-3832:
--

Ah, I think that might be the clue as to what went wrong.  If the NM recreated 
the state store on startup then ResourceLocalizationService will try to clean up 
the localized resources to prevent them from getting out of sync with the state 
store.  Unfortunately the code does this:
{code}
  private void cleanUpLocalDirs(FileContext lfs, DeletionService del) {
    for (String localDir : dirsHandler.getLocalDirs()) {
      cleanUpLocalDir(lfs, del, localDir);
    }
  }
{code}

It should be calling dirsHandler.getLocalDirsForCleanup, since getLocalDirs 
will not include any disks that are full.  Since the disk was too full, it 
probably wasn't in the list of local dirs and therefore we avoided cleaning up 
the localized resources on the disk.  Later when the disk became good it tried 
to use it, but at that point the state store and localized resources on that 
disk are out of sync and new localizations can collide with old ones.
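
The difference between the two accessors can be shown in isolation. The following is a minimal sketch, not the NodeManager code: `DirsHandler` is a hypothetical stand-in that only mimics the two method names mentioned above, under the assumption that getLocalDirsForCleanup also returns disks that are currently full.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the behaviour discussed above; not the real NodeManager classes. */
class LocalDirCleanupSketch {

  /** Hypothetical stand-in for the NM dirs handler: maps each local dir to "is it full?". */
  static class DirsHandler {
    private final Map<String, Boolean> fullByDir = new LinkedHashMap<>();

    void addDir(String dir, boolean full) {
      fullByDir.put(dir, full);
    }

    /** Dirs usable for new work: full disks are excluded. */
    List<String> getLocalDirs() {
      List<String> dirs = new ArrayList<>();
      for (Map.Entry<String, Boolean> e : fullByDir.entrySet()) {
        if (!e.getValue()) {
          dirs.add(e.getKey());
        }
      }
      return dirs;
    }

    /** Assumption: dirs eligible for cleanup include full disks as well. */
    List<String> getLocalDirsForCleanup() {
      return new ArrayList<>(fullByDir.keySet());
    }
  }

  public static void main(String[] args) {
    DirsHandler handler = new DirsHandler();
    // The only local dir is above the utilization threshold at startup.
    handler.addDir("/opt/hdfsdata/HA/nmlocal", true);

    // A cleanup loop over getLocalDirs() visits nothing, so stale caches survive.
    System.out.println("getLocalDirs:           " + handler.getLocalDirs());
    // A cleanup loop over getLocalDirsForCleanup() still visits the full disk.
    System.out.println("getLocalDirsForCleanup: " + handler.getLocalDirsForCleanup());
  }
}
```

With a single full disk, as on the affected node, a loop over the first list has nothing to iterate over, which is consistent with the leftover filecache directories.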



[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-19 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14593424#comment-14593424
 ] 

Jason Lowe commented on YARN-3832:
--

It looks like the state store became out-of-sync with the local filesystem 
state.  Can you look back in the NM logs to see when 
/opt/hdfsdata/HA/nmlocal/usercache/root/filecache/39 was originally created?  
Was the state store re-created or the disk declared bad/full in-between the 
creation of that directory and the error?  Seems like something would have had 
to go wrong with either storing the state or deleting the cache entry on the 
local disk for this to occur.
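
The "Rename cannot overwrite non empty destination directory" error is the visible symptom of such a leftover cache entry. The sketch below reproduces the same kind of collision with java.nio.file.Files.move rather than Hadoop's FileContext.rename, so it illustrates the failure mode only. The class name and the directory and file names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Illustrates a rename onto a non-empty directory using java.nio, not Hadoop APIs. */
class RenameCollisionSketch {

  /** Returns true if moving src onto dst fails because dst is a non-empty directory. */
  static boolean collides(Path src, Path dst) throws IOException {
    try {
      Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
      return false;
    } catch (DirectoryNotEmptyException e) {
      return true;
    }
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("nmlocal");

    // Stale cache entry left behind from before the restart.
    Path stale = Files.createDirectories(root.resolve("filecache").resolve("39"));
    Files.createFile(stale.resolve("old.jar"));

    // Fresh download staged in a sibling directory (hypothetical name).
    Path staged = Files.createDirectories(root.resolve("filecache").resolve("39_staged"));
    Files.createFile(staged.resolve("new.jar"));

    System.out.println("collision: " + collides(staged, stale));
  }
}
```

If the stale directory had been removed, or were empty, the same move would succeed, which is why the cleanup on startup matters.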



[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-19 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14593439#comment-14593439
 ] 

Varun Saxena commented on YARN-3832:


Sorry, that was meant for [~singar.ranga].



[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-19 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14593432#comment-14593432
 ] 

Varun Saxena commented on YARN-3832:


[~ranga swamy], can you attach the logs?



[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-19 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14593541#comment-14593541
 ] 

Jason Lowe commented on YARN-3832:
--

bq. AFAIK statestore recreated after starting

So if the state store was recreated on startup and the build has YARN-2624 
fixed, is there any evidence why the local directories were not removed?  Even 
if there was a deletion error on shutdown, the whole local dirs should have been 
cleaned up on startup if the state store had to be initialized from scratch.



[jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories

2015-06-19 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14593497#comment-14593497
 ] 

Brahma Reddy Battula commented on YARN-3832:


[~jlowe] Thanks for looking into this issue.
{quote}Can you look back in the NM logs to see when 
/opt/hdfsdata/HA/nmlocal/usercache/root/filecache/39 was originally 
created{quote}

I looked into the logs: the directory was created three days ago, and the NM 
was restarted today (the number of days doesn't matter, it is just for 
reference). The disk is neither bad nor full.

{noformat}
2015-06-16 07:12:05,886 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
 Resource 
hdfs://hacluster/tmp/hadoop-yarn/staging/root/.staging/job_1434452428753_0004/libjars/netty-all-4.0.23.Final.jar(->/opt/hdfsdata/HA/nmlocal/usercache/root/filecache/39/netty-all-4.0.23.Final.jar)
 transitioned from DOWNLOADING to LOCALIZED
{noformat}

While stopping, the NM threw the following error. HADOOP-11878 was raised for 
it.

{noformat}
2015-06-19 03:09:10,528 ERROR 
org.apache.hadoop.yarn.server.nodemanager.DeletionService: Exception during 
execution of task in DeletionService
java.lang.NullPointerException
at 
org.apache.hadoop.fs.FileContext.fixRelativePart(FileContext.java:274)
at org.apache.hadoop.fs.FileContext.delete(FileContext.java:761)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.deleteAsUser(DefaultContainerExecutor.java:458)
at 
org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:293)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

AFAIK the state store was recreated after startup. Hence either the old state 
store is out of sync, or there is a problem with deleting the cache entries.
