[jira] [Commented] (YARN-2931) PublicLocalizer may fail with FileNotFoundException until directory gets initialized by LocalizeRunner
[ https://issues.apache.org/jira/browse/YARN-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238480#comment-14238480 ]

Karthik Kambatla commented on YARN-2931:

Initially in the description, from Anubhav:

Instead we can have PublicLocalizer not depend on this and also call getInitializedLocalDirs so it can handle initialization on its own similar to non public localization

PublicLocalizer may fail with FileNotFoundException until directory gets initialized by LocalizeRunner
--

Key: YARN-2931
URL: https://issues.apache.org/jira/browse/YARN-2931
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
Attachments: YARN-2931.001.patch

When the data directory is cleaned up and NM is started with existing recovery state, because of YARN-90, it will not recreate the local dirs.
This causes a PublicLocalizer to fail until getInitializedLocalDirs is called due to some LocalizeRunner for private localization.

Example error
{noformat}
2014-12-02 22:57:32,629 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Failed to download rsrc { { hdfs:/blah machine:8020/tmp/hive-hive/hive_2014-12-02_22-56-58_741_2045919883676051996-3/-mr-10004/8060c9dd-54b6-42fc-9d77-34b655fa5e82/reduce.xml, 1417589819618, FILE, null },pending,[(container_1417589109512_0001_02_03)],119413444132127,DOWNLOADING}
java.io.FileNotFoundException: File /data/yarn/nm/filecache does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:524)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:737)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:514)
    at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:1051)
    at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:162)
    at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:197)
    at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:724)
    at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:720)
    at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
    at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:720)
    at org.apache.hadoop.yarn.util.FSDownload.createDir(FSDownload.java:104)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:351)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2014-12-02 22:57:32,629 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1417589109512_0001_02_03 transitioned from LOCALIZING to LOCALIZATION_FAILED
{noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
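
The failure mode in the trace above, and the remedy proposed in the description, can be illustrated with a small standalone sketch. It is not the YARN-2931 patch and does not use Hadoop's FileContext; it uses java.nio only, and the class LazyLocalDirInit and its methods are hypothetical names chosen for illustration. The assumption is that a non-recursive mkdir under a missing filecache directory fails, and that letting the public path run the same idempotent initialization as the private path avoids the failure.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration only; not the NodeManager code. */
public class LazyLocalDirInit {
    private final Path localDir;      // stands in for a local dir such as /data/yarn/nm
    private boolean initialized;      // guarded by "this"

    public LazyLocalDirInit(Path localDir) {
        this.localDir = localDir;
    }

    /** Idempotent initialization, loosely analogous to getInitializedLocalDirs. */
    public synchronized Path getInitializedLocalDir() throws IOException {
        if (!initialized) {
            // createDirectories creates missing parents and tolerates existing dirs.
            Files.createDirectories(localDir.resolve("filecache"));
            initialized = true;
        }
        return localDir;
    }

    /** Non-recursive mkdir that assumes someone else already created filecache. */
    public Path createDownloadDirUnsafe(String name) throws IOException {
        return Files.createDirectory(localDir.resolve("filecache").resolve(name));
    }

    /** Same mkdir, but the caller triggers initialization itself first. */
    public Path createDownloadDir(String name) throws IOException {
        Path base = getInitializedLocalDir();
        return Files.createDirectory(base.resolve("filecache").resolve(name));
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("nm-local");
        LazyLocalDirInit dirs = new LazyLocalDirInit(root.resolve("nm"));
        try {
            dirs.createDownloadDirUnsafe("10");
            System.out.println("unexpected success");
        } catch (IOException e) {
            // The parent directory was never created, so the mkdir fails.
            System.out.println("failed before initialization: " + e);
        }
        Path created = dirs.createDownloadDir("10");
        System.out.println("created after initialization: " + Files.isDirectory(created));
    }
}
```

In this sketch the unsafe variant fails for the same reason as FSDownload.createDir in the trace: it depends on another code path having created the parent directory first.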
[jira] [Commented] (YARN-2931) PublicLocalizer may fail with FileNotFoundException until directory gets initialized by LocalizeRunner
[ https://issues.apache.org/jira/browse/YARN-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238529#comment-14238529 ]

Hadoop QA commented on YARN-2931:
-

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12685852/YARN-2931.001.patch
  against trunk revision 6c5bbd7.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
    {color:red}-1 javac{color}. The applied patch generated 1219 javac compiler warnings (more than the trunk's current 1217 warnings).
    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.
    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.
    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6040//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6040//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6040//console

This message is automatically generated.
[jira] [Commented] (YARN-2931) PublicLocalizer may fail with FileNotFoundException until directory gets initialized by LocalizeRunner
[ https://issues.apache.org/jira/browse/YARN-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238683#comment-14238683 ]

Hadoop QA commented on YARN-2931:
-

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12685860/YARN-2931.002.patch
  against trunk revision ddffcd8.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.
    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
    {color:red}-1 findbugs{color}. The patch appears to introduce 7 new Findbugs (version 2.0.3) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.
    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6042//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6042//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6042//console

This message is automatically generated.
Attachments: YARN-2931.001.patch, YARN-2931.002.patch
[jira] [Commented] (YARN-2931) PublicLocalizer may fail with FileNotFoundException until directory gets initialized by LocalizeRunner
[ https://issues.apache.org/jira/browse/YARN-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238725#comment-14238725 ]

Anubhav Dhoot commented on YARN-2931:
-

The Findbugs warnings do not seem related to the patch. Uploading again to retrigger the build.

Attachments: YARN-2931.001.patch, YARN-2931.002.patch, YARN-2931.002.patch
[jira] [Commented] (YARN-2931) PublicLocalizer may fail with FileNotFoundException until directory gets initialized by LocalizeRunner
[ https://issues.apache.org/jira/browse/YARN-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238734#comment-14238734 ]

bc Wong commented on YARN-2931:
---

Thanks for the fix! Some nits.

ResourceLocalizationService.java
* Instead of commenting out code, I would just remove it.

TestResourceLocalizationService.java
* L950: Remove the code that is commented out.
[jira] [Commented] (YARN-2931) PublicLocalizer may fail with FileNotFoundException until directory gets initialized by LocalizeRunner
[ https://issues.apache.org/jira/browse/YARN-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238770#comment-14238770 ]

Hadoop QA commented on YARN-2931:
-

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12685898/YARN-2931.002.patch
  against trunk revision ddffcd8.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.
    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.
    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6046//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6046//console

This message is automatically generated.
[jira] [Commented] (YARN-2931) PublicLocalizer may fail with FileNotFoundException until directory gets initialized by LocalizeRunner
[ https://issues.apache.org/jira/browse/YARN-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238919#comment-14238919 ]

Karthik Kambatla commented on YARN-2931:

+1, pending Jenkins.

Priority: Critical
Attachments: YARN-2931.001.patch, YARN-2931.002.patch, YARN-2931.002.patch, YARN-2931.003.patch, YARN-2931.004.patch
[jira] [Commented] (YARN-2931) PublicLocalizer may fail with FileNotFoundException until directory gets initialized by LocalizeRunner
[ https://issues.apache.org/jira/browse/YARN-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238941#comment-14238941 ]

Hadoop QA commented on YARN-2931:
-

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12685928/YARN-2931.004.patch
  against trunk revision ddffcd8.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.
    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
    {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 2.0.3) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.
    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6048//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6048//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6048//console

This message is automatically generated.
[jira] [Commented] (YARN-2931) PublicLocalizer may fail with FileNotFoundException until directory gets initialized by LocalizeRunner
[ https://issues.apache.org/jira/browse/YARN-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238944#comment-14238944 ]

Hadoop QA commented on YARN-2931:
-

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12685928/YARN-2931.004.patch
  against trunk revision ddffcd8.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.
    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
    {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 2.0.3) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.
    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6049//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6049//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6049//console

This message is automatically generated.
Key: YARN-2931
URL: https://issues.apache.org/jira/browse/YARN-2931
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
Priority: Critical
Attachments: YARN-2931.001.patch, YARN-2931.002.patch, YARN-2931.002.patch, YARN-2931.003.patch, YARN-2931.004.patch
[jira] [Commented] (YARN-2931) PublicLocalizer may fail with FileNotFoundException until directory gets initialized by LocalizeRunner
[ https://issues.apache.org/jira/browse/YARN-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238976#comment-14238976 ]

Hadoop QA commented on YARN-2931:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12685935/YARN-2931.005.patch against trunk revision ddffcd8.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6051//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6051//console

This message is automatically generated.
Key: YARN-2931
URL: https://issues.apache.org/jira/browse/YARN-2931
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
Priority: Critical
Attachments: YARN-2931.001.patch, YARN-2931.002.patch, YARN-2931.002.patch, YARN-2931.003.patch, YARN-2931.004.patch, YARN-2931.005.patch

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2931) PublicLocalizer may fail with FileNotFoundException until directory gets initialized by LocalizeRunner
[ https://issues.apache.org/jira/browse/YARN-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239032#comment-14239032 ]

Karthik Kambatla commented on YARN-2931:

+1