[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-10-12 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: YARN-2566.008.patch

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Critical
 Attachments: YARN-2566.000.patch, YARN-2566.001.patch, 
 YARN-2566.002.patch, YARN-2566.003.patch, YARN-2566.004.patch, 
 YARN-2566.005.patch, YARN-2566.006.patch, YARN-2566.007.patch, 
 YARN-2566.008.patch


 startLocalizer in DefaultContainerExecutor uses only the first localDir to 
 copy the token file. If that copy fails because the first localDir does not 
 have enough disk space, localization fails even when there is plenty of disk 
 space in the other localDirs. We see the following error in this case:
 {code}
 2014-09-13 23:33:25,171 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Unable to create app directory /hadoop/d1/usercache/cloudera/appcache/application_1410663092546_0004
 java.io.IOException: mkdir of /hadoop/d1/usercache/cloudera/appcache/application_1410663092546_0004 failed
   at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:1062)
   at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:157)
   at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:721)
   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:717)
   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
   at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:717)
   at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.createDir(DefaultContainerExecutor.java:426)
   at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.createAppDirs(DefaultContainerExecutor.java:522)
   at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:94)
   at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:987)
 2014-09-13 23:33:25,185 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer failed
 java.io.FileNotFoundException: File file:/hadoop/d1/usercache/cloudera/appcache/application_1410663092546_0004 does not exist
   at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:511)
   at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
   at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
   at org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:111)
   at org.apache.hadoop.fs.DelegateToFileSystem.createInternal(DelegateToFileSystem.java:76)
   at org.apache.hadoop.fs.ChecksumFs$ChecksumFSOutputSummer.init(ChecksumFs.java:344)
   at org.apache.hadoop.fs.ChecksumFs.createInternal(ChecksumFs.java:390)
   at org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:577)
   at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:677)
   at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:673)
   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
   at org.apache.hadoop.fs.FileContext.create(FileContext.java:673)
   at org.apache.hadoop.fs.FileContext$Util.copy(FileContext.java:2021)
   at org.apache.hadoop.fs.FileContext$Util.copy(FileContext.java:1963)
   at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:102)
   at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:987)
 2014-09-13 23:33:25,186 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1410663092546_0004_01_01 transitioned from LOCALIZING to LOCALIZATION_FAILED
 2014-09-13 23:33:25,187 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=cloudera   OPERATION=Container Finished - Failed   TARGET=ContainerImpl RESULT=FAILURE  DESCRIPTION=Container failed with state: LOCALIZATION_FAILED    APPID=application_1410663092546_0004 CONTAINERID=container_1410663092546_0004_01_01
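
As a rough illustration of the fallback behaviour the description asks for, the sketch below tries each local directory in turn instead of giving up when the first one fails. It is not taken from any of the attached patches: it uses plain java.nio rather than Hadoop's FileContext, and the class name, method name, and simplified appcache layout are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

public class LocalDirFallback {

    /**
     * Copies tokenFile into the first directory of localDirs that accepts it.
     * Hypothetical helper for illustration only; the directory layout
     * (localDir/appcache/appId) is a simplification of the real one.
     */
    static Path copyTokenToFirstUsableDir(Path tokenFile, List<Path> localDirs, String appId)
            throws IOException {
        IOException lastFailure = null;
        for (Path localDir : localDirs) {
            try {
                Path appDir = localDir.resolve("appcache").resolve(appId);
                Files.createDirectories(appDir);
                Path target = appDir.resolve(tokenFile.getFileName());
                return Files.copy(tokenFile, target, StandardCopyOption.REPLACE_EXISTING);
            } catch (IOException e) {
                // This directory is unusable (full disk, bad permissions, ...);
                // remember the failure and move on to the next candidate.
                lastFailure = e;
            }
        }
        throw new IOException("No usable local directory among " + localDirs, lastFailure);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("localdirs");
        // Simulate a broken first localDir with a regular file where a directory is expected.
        Path bad = Files.createFile(base.resolve("d1"));
        Path good = Files.createDirectory(base.resolve("d2"));
        Path token = Files.write(base.resolve("container.tokens"), new byte[] {1, 2, 3});

        Path copied = copyTokenToFirstUsableDir(token, Arrays.asList(bad, good), "application_0001");
        System.out.println("Token copied to " + copied);
    }
}
```

The failure is only reported once every candidate has been tried, so a single full disk no longer fails the whole localization in this sketch.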
   

[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-10-11 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: YARN-2566.007.patch

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Critical
 Attachments: YARN-2566.000.patch, YARN-2566.001.patch, 
 YARN-2566.002.patch, YARN-2566.003.patch, YARN-2566.004.patch, 
 YARN-2566.005.patch, YARN-2566.006.patch, YARN-2566.007.patch



[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-10-10 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-2566:
---
Priority: Critical  (was: Major)
Target Version/s: 2.6.0

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Critical
 Attachments: YARN-2566.000.patch, YARN-2566.001.patch, 
 YARN-2566.002.patch, YARN-2566.003.patch, YARN-2566.004.patch



[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-10-10 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: YARN-2566.005.patch

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Critical
 Attachments: YARN-2566.000.patch, YARN-2566.001.patch, 
 YARN-2566.002.patch, YARN-2566.003.patch, YARN-2566.004.patch, 
 YARN-2566.005.patch



[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-10-10 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: YARN-2566.006.patch

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Critical
 Attachments: YARN-2566.000.patch, YARN-2566.001.patch, 
 YARN-2566.002.patch, YARN-2566.003.patch, YARN-2566.004.patch, 
 YARN-2566.005.patch, YARN-2566.006.patch



[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-10-08 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: YARN-2566.004.patch

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2566.000.patch, YARN-2566.001.patch, 
 YARN-2566.002.patch, YARN-2566.003.patch, YARN-2566.004.patch



[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-10-07 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-2566:
--
Issue Type: Sub-task  (was: Bug)
Parent: YARN-91

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2566.000.patch, YARN-2566.001.patch, 
 YARN-2566.002.patch, YARN-2566.003.patch


 startLocalizer in DefaultContainerExecutor will only use the first localDir 
 to copy the token file, if the copy is failed for first localDir due to not 
 enough disk space in the first localDir, the localization will be failed even 
 there are plenty of disk space in other localDirs. We see the following error 
 for this case:
 {code}
 2014-09-13 23:33:25,171 WARN 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Unable to 
 create app directory 
 /hadoop/d1/usercache/cloudera/appcache/application_1410663092546_0004
 java.io.IOException: mkdir of 
 /hadoop/d1/usercache/cloudera/appcache/application_1410663092546_0004 failed
   at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:1062)
   at 
 org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:157)
   at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:721)
   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:717)
   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
   at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:717)
   at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.createDir(DefaultContainerExecutor.java:426)
   at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.createAppDirs(DefaultContainerExecutor.java:522)
   at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:94)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:987)
 2014-09-13 23:33:25,185 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
  Localizer failed
 java.io.FileNotFoundException: File 
 file:/hadoop/d1/usercache/cloudera/appcache/application_1410663092546_0004 
 does not exist
   at 
 org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:511)
   at 
 org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
   at 
 org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
   at 
 org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:111)
   at 
 org.apache.hadoop.fs.DelegateToFileSystem.createInternal(DelegateToFileSystem.java:76)
   at 
 org.apache.hadoop.fs.ChecksumFs$ChecksumFSOutputSummer.init(ChecksumFs.java:344)
   at org.apache.hadoop.fs.ChecksumFs.createInternal(ChecksumFs.java:390)
   at 
 org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:577)
   at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:677)
   at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:673)
   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
   at org.apache.hadoop.fs.FileContext.create(FileContext.java:673)
   at org.apache.hadoop.fs.FileContext$Util.copy(FileContext.java:2021)
   at org.apache.hadoop.fs.FileContext$Util.copy(FileContext.java:1963)
   at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:102)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:987)
 2014-09-13 23:33:25,186 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
  Container container_1410663092546_0004_01_01 transitioned from 
 LOCALIZING to LOCALIZATION_FAILED
 2014-09-13 23:33:25,187 WARN 
 org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=cloudera   
 OPERATION=Container Finished - Failed   TARGET=ContainerImpl
 RESULT=FAILURE  DESCRIPTION=Container failed with state: LOCALIZATION_FAILED  
   APPID=application_1410663092546_0004
 CONTAINERID=container_1410663092546_0004_01_01
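
As an illustration of the behavior described in this issue, here is a minimal sketch of trying each local directory in turn instead of only the first one. It uses plain java.nio rather than the Hadoop FileContext API. The class name, the method name, and the simplified appcache layout are assumptions made for illustration; this is not the code from any of the attached patches.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

class LocalDirFallback {

    /**
     * Copies tokenFile into the first directory in localDirs that accepts it.
     * Hypothetical helper: it only illustrates the fallback idea.
     */
    public static Path copyTokenToFirstUsableDir(Path tokenFile,
                                                 List<Path> localDirs,
                                                 String appId) throws IOException {
        IOException lastFailure = null;
        for (Path localDir : localDirs) {
            // Simplified layout (assumption): <localDir>/appcache/<appId>
            Path appDir = localDir.resolve("appcache").resolve(appId);
            try {
                Files.createDirectories(appDir);
                Path target = appDir.resolve(tokenFile.getFileName());
                return Files.copy(tokenFile, target,
                        StandardCopyOption.REPLACE_EXISTING);
            } catch (IOException e) {
                // For example a full disk or an unwritable directory:
                // remember the failure and try the next local dir.
                lastFailure = e;
            }
        }
        throw new IOException("No usable local dir among " + localDirs,
                lastFailure);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("nm-local-dirs");
        Path token = Files.write(base.resolve("container_tokens"),
                new byte[] {1, 2, 3});
        // A regular file stands in for a local dir that cannot be used.
        Path unusable = Files.createFile(base.resolve("d1"));
        Path usable = Files.createDirectory(base.resolve("d2"));
        Path copied = copyTokenToFirstUsableDir(token,
                List.of(unusable, usable), "application_0001");
        System.out.println("Token copied to " + copied);
    }
}
```

In this sketch a failure on one directory is recorded and the next directory is tried. The call fails only when every candidate has been rejected, and the last underlying IOException is attached as the cause.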

[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-10-06 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: (was: YARN-2566.003.patch)

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2566.000.patch, YARN-2566.001.patch, 
 YARN-2566.002.patch, YARN-2566.003.patch


 startLocalizer in DefaultContainerExecutor uses only the first localDir to 
 copy the token file. If that copy fails because the first localDir does not 
 have enough disk space, localization fails even when there is plenty of disk 
 space in the other localDirs. We see the following error in this case:
 {code}
 2014-09-13 23:33:25,171 WARN 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Unable to 
 create app directory 
 /hadoop/d1/usercache/cloudera/appcache/application_1410663092546_0004
 java.io.IOException: mkdir of 
 /hadoop/d1/usercache/cloudera/appcache/application_1410663092546_0004 failed
   at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:1062)
   at 
 org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:157)
   at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:721)
   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:717)
   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
   at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:717)
   at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.createDir(DefaultContainerExecutor.java:426)
   at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.createAppDirs(DefaultContainerExecutor.java:522)
   at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:94)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:987)
 2014-09-13 23:33:25,185 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
  Localizer failed
 java.io.FileNotFoundException: File 
 file:/hadoop/d1/usercache/cloudera/appcache/application_1410663092546_0004 
 does not exist
   at 
 org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:511)
   at 
 org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
   at 
 org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
   at 
 org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:111)
   at 
 org.apache.hadoop.fs.DelegateToFileSystem.createInternal(DelegateToFileSystem.java:76)
   at 
 org.apache.hadoop.fs.ChecksumFs$ChecksumFSOutputSummer.init(ChecksumFs.java:344)
   at org.apache.hadoop.fs.ChecksumFs.createInternal(ChecksumFs.java:390)
   at 
 org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:577)
   at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:677)
   at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:673)
   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
   at org.apache.hadoop.fs.FileContext.create(FileContext.java:673)
   at org.apache.hadoop.fs.FileContext$Util.copy(FileContext.java:2021)
   at org.apache.hadoop.fs.FileContext$Util.copy(FileContext.java:1963)
   at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:102)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:987)
 2014-09-13 23:33:25,186 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
  Container container_1410663092546_0004_01_01 transitioned from 
 LOCALIZING to LOCALIZATION_FAILED
 2014-09-13 23:33:25,187 WARN 
 org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=cloudera   
 OPERATION=Container Finished - Failed   TARGET=ContainerImpl
 RESULT=FAILURE  DESCRIPTION=Container failed with state: LOCALIZATION_FAILED  
   APPID=application_1410663092546_0004
 CONTAINERID=container_1410663092546_0004_01_01
 2014-09-13 23:33:25,187 INFO 
 

[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-10-06 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: YARN-2566.003.patch

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2566.000.patch, YARN-2566.001.patch, 
 YARN-2566.002.patch, YARN-2566.003.patch



[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-10-04 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: (was: YARN-2566.002.patch)

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2566.000.patch, YARN-2566.001.patch



[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-10-04 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: YARN-2566.002.patch

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2566.000.patch, YARN-2566.001.patch, 
 YARN-2566.002.patch



[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-10-04 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: YARN-2566.003.patch

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2566.000.patch, YARN-2566.001.patch, 
 YARN-2566.002.patch, YARN-2566.003.patch



[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-10-03 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: YARN-2566.002.patch

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2566.000.patch, YARN-2566.001.patch, 
 YARN-2566.002.patch



[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-09-26 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: (was: YARN-2566.000.patch)

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2566.000.patch



[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-09-26 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: YARN-2566.000.patch

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2566.000.patch



[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-09-26 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: YARN-2566.001.patch

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2566.000.patch, YARN-2566.001.patch



[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-09-18 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Attachment: YARN-2566.000.patch

 IOException happen in startLocalizer of DefaultContainerExecutor due to not 
 enough disk space for the first localDir.
 -

 Key: YARN-2566
 URL: https://issues.apache.org/jira/browse/YARN-2566
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2566.000.patch



[jira] [Updated] (YARN-2566) IOException happen in startLocalizer of DefaultContainerExecutor due to not enough disk space for the first localDir.

2014-09-17 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2566:

Description: 
startLocalizer in DefaultContainerExecutor uses only the first localDir to 
copy the token file. If the copy fails because the first localDir does not have 
enough disk space, localization fails even when there is plenty of disk space 
in the other localDirs. We see the following error in this case:
{code}
2014-09-13 23:33:25,171 WARN 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Unable to 
create app directory 
/hadoop/d1/usercache/cloudera/appcache/application_1410663092546_0004
java.io.IOException: mkdir of 
/hadoop/d1/usercache/cloudera/appcache/application_1410663092546_0004 failed
at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:1062)
at 
org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:157)
at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:721)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:717)
at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:717)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.createDir(DefaultContainerExecutor.java:426)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.createAppDirs(DefaultContainerExecutor.java:522)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:94)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:987)
2014-09-13 23:33:25,185 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
 Localizer failed
java.io.FileNotFoundException: File 
file:/hadoop/d1/usercache/cloudera/appcache/application_1410663092546_0004 does 
not exist
at 
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:511)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
at 
org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:111)
at 
org.apache.hadoop.fs.DelegateToFileSystem.createInternal(DelegateToFileSystem.java:76)
at 
org.apache.hadoop.fs.ChecksumFs$ChecksumFSOutputSummer.init(ChecksumFs.java:344)
at org.apache.hadoop.fs.ChecksumFs.createInternal(ChecksumFs.java:390)
at 
org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:577)
at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:677)
at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:673)
at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
at org.apache.hadoop.fs.FileContext.create(FileContext.java:673)
at org.apache.hadoop.fs.FileContext$Util.copy(FileContext.java:2021)
at org.apache.hadoop.fs.FileContext$Util.copy(FileContext.java:1963)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:102)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:987)
2014-09-13 23:33:25,186 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: 
Container container_1410663092546_0004_01_01 transitioned from LOCALIZING 
to LOCALIZATION_FAILED
2014-09-13 23:33:25,187 WARN 
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=cloudera 
OPERATION=Container Finished - Failed   TARGET=ContainerImpl   RESULT=FAILURE  
DESCRIPTION=Container failed with state: LOCALIZATION_FAILED
APPID=application_1410663092546_0004
CONTAINERID=container_1410663092546_0004_01_01
2014-09-13 23:33:25,187 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: 
Container container_1410663092546_0004_01_01 transitioned from 
LOCALIZATION_FAILED to DONE
2014-09-13 23:33:25,187 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
 Removing container_1410663092546_0004_01_01 from application 
application_1410663092546_0004
2014-09-13 23:33:25,187 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl:
 Considering container container_1410663092546_0004_01_01 for 
log-aggregation
2014-09-13 23:33:25,187 INFO