[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-04-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622338#comment-13622338
 ] 

Hudson commented on YARN-467:
-

Integrated in Hadoop-Mapreduce-trunk #1390 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1390/])
YARN-467. Modify public distributed cache to localize files such that no 
local directory hits unix file count limits and thus prevent job failures. 
Contributed by Omkar Vinit Joshi. (Revision 1463823)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1463823
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalCacheDirectoryManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalResourcesTracker.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalResourcesTrackerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestLocalCacheDirectoryManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestLocalResourcesTrackerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceRetention.java


> Jobs fail during resource localization when public distributed-cache hits 
> unix directory limits
> ---
>
> Key: YARN-467
> URL: https://issues.apache.org/jira/browse/YARN-467
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0, 2.0.0-alpha
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Fix For: 2.0.5-beta
>
> Attachments: yarn-467-20130322.1.patch, yarn-467-20130322.2.patch, 
> yarn-467-20130322.3.patch, yarn-467-20130322.patch, 
> yarn-467-20130325.1.patch, yarn-467-20130325.path, yarn-467-20130328.patch, 
> yarn-467-20130401.patch, yarn-467-20130402.1.patch, 
> yarn-467-20130402.2.patch, yarn-467-20130402.patch, yarn-467-testCode.tar
>
>
> If multiple jobs use the distributed cache with many small files, the 
> per-directory entry limit is reached before the cache-size limit, and no new 
> directories can be created in the PUBLIC file cache. Jobs then start failing 
> with the exception below.
> java.io.IOException: mkdir of /tmp/nm-local-dir/filecache/3901886847734194975 
> failed
>   at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:909)
>   at 
> org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
>   at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
>   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
>   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
>   at 
> org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
>   at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Thread

[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-04-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622144#comment-13622144
 ] 

Hudson commented on YARN-467:
-

Integrated in Hadoop-Hdfs-trunk #1363 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1363/])
YARN-467. Modify public distributed cache to localize files such that no 
local directory hits unix file count limits and thus prevent job failures. 
Contributed by Omkar Vinit Joshi. (Revision 1463823)

 Result = FAILURE
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1463823
Files : same list as in the Hadoop-Mapreduce-trunk #1390 comment above.



[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-04-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622018#comment-13622018
 ] 

Hudson commented on YARN-467:
-

Integrated in Hadoop-Yarn-trunk #174 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/174/])
YARN-467. Modify public distributed cache to localize files such that no 
local directory hits unix file count limits and thus prevent job failures. 
Contributed by Omkar Vinit Joshi. (Revision 1463823)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1463823
Files : same list as in the Hadoop-Mapreduce-trunk #1390 comment above.



[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-04-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620617#comment-13620617
 ] 

Hudson commented on YARN-467:
-

Integrated in Hadoop-trunk-Commit #3552 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3552/])
YARN-467. Modify public distributed cache to localize files such that no 
local directory hits unix file count limits and thus prevent job failures. 
Contributed by Omkar Vinit Joshi. (Revision 1463823)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1463823
Files : same list as in the Hadoop-Mapreduce-trunk #1390 comment above.



[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-04-02 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620608#comment-13620608
 ] 

Vinod Kumar Vavilapalli commented on YARN-467:
--

Perfect, the latest patch looks good. Checking it in.

> Jobs fail during resource localization when public distributed-cache hits 
> unix directory limits
> ---
>
> Key: YARN-467
> URL: https://issues.apache.org/jira/browse/YARN-467
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0, 2.0.0-alpha
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: yarn-467-20130322.1.patch, yarn-467-20130322.2.patch, 
> yarn-467-20130322.3.patch, yarn-467-20130322.patch, 
> yarn-467-20130325.1.patch, yarn-467-20130325.path, yarn-467-20130328.patch, 
> yarn-467-20130401.patch, yarn-467-20130402.1.patch, 
> yarn-467-20130402.2.patch, yarn-467-20130402.patch
>
>
> If multiple jobs use the distributed cache with many small files, the 
> per-directory entry limit is reached before the cache-size limit, and no new 
> directories can be created in the PUBLIC file cache. Jobs then start failing 
> with the exception below.
> java.io.IOException: mkdir of /tmp/nm-local-dir/filecache/3901886847734194975 
> failed
>   at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:909)
>   at 
> org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
>   at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
>   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
>   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
>   at 
> org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
>   at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> We need a mechanism whereby we can create a directory hierarchy and limit 
> the number of files per directory.
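
The setting named elsewhere in this thread, 
yarn.nodemanager.local-cache.max-files-per-directory, is the per-directory 
cap this fix introduces. A hypothetical yarn-site.xml override might look 
like the following (the value shown is illustrative, not necessarily the 
shipped default):

```xml
<!-- Hypothetical yarn-site.xml override; 4096 is an illustrative value. -->
<property>
  <name>yarn.nodemanager.local-cache.max-files-per-directory</name>
  <value>4096</value>
</property>
```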

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-04-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620546#comment-13620546
 ] 

Hadoop QA commented on YARN-467:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12576705/yarn-467-20130402.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/657//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/657//console

This message is automatically generated.



[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-04-02 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620535#comment-13620535
 ] 

Omkar Vinit Joshi commented on YARN-467:


I have tested this code for the scenarios below:
* I used 4 local-dirs to verify that localization is distributed across them 
and that LocalCacheDirectoryManager manages each one separately.
* I tested various values of 
"yarn.nodemanager.local-cache.max-files-per-directory": <=36, 37, 40, and 
much larger.
* I modified the cache cleanup interval and the cache target size (in MB) to 
verify that older files are removed from the cache and that 
LocalCacheDirectoryManager's sub-directories are reused.
* I verified that no local-directory ever holds more files or sub-directories 
than the limit specified in the configuration.
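
The scheme being tested can be sketched roughly as follows. This is a 
hypothetical simplification, not the actual LocalCacheDirectoryManager code: 
it maps an increasing file index to a relative sub-directory path using one 
base-36 digit (0-9, a-z) per level, so no single directory receives more 
than a fixed number of files (the real implementation also counts 
sub-directories toward the cap, which this sketch ignores).

```java
// Hypothetical sketch of spreading cache files over a base-36 directory
// tree so no directory exceeds a per-directory entry limit (cf. YARN-467).
public class CacheDirSketch {

    /** Relative sub-directory for the n-th file (0-based), given a cap. */
    static String subDirFor(long fileIndex, int filesPerDir) {
        long dirIndex = fileIndex / filesPerDir;
        if (dirIndex == 0) {
            return "";  // the first filesPerDir files live in the cache root
        }
        // One base-36 digit per directory level, e.g. dirIndex 37 -> "1/1".
        StringBuilder path = new StringBuilder();
        for (char c : Long.toString(dirIndex, 36).toCharArray()) {
            if (path.length() > 0) {
                path.append('/');
            }
            path.append(c);
        }
        return path.toString();
    }

    public static void main(String[] args) {
        int cap = 2;  // tiny cap so the tree structure is visible
        for (long i = 0; i < 8; i++) {
            // With cap 2: files 0-1 -> "", 2-3 -> "1", 4-5 -> "2", 6-7 -> "3"
            System.out.println("file " + i + " -> \"" + subDirFor(i, cap) + "\"");
        }
    }
}
```

Because each level multiplies capacity by 36, the tree stays shallow even 
for very large caches, which matches the behaviour the tests above probe 
around values like 36 and 37.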



[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-04-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620412#comment-13620412
 ] 

Hadoop QA commented on YARN-467:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12576688/yarn-467-20130402.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/654//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/654//console

This message is automatically generated.

> Jobs fail during resource localization when public distributed-cache hits 
> unix directory limits
> ---
>
> Key: YARN-467
> URL: https://issues.apache.org/jira/browse/YARN-467
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0, 2.0.0-alpha
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: yarn-467-20130322.1.patch, yarn-467-20130322.2.patch, 
> yarn-467-20130322.3.patch, yarn-467-20130322.patch, 
> yarn-467-20130325.1.patch, yarn-467-20130325.path, yarn-467-20130328.patch, 
> yarn-467-20130401.patch, yarn-467-20130402.1.patch, yarn-467-20130402.patch
>
>
> If multiple jobs use the distributed cache with many small files, the 
> per-directory file limit is hit before the cache-size limit is reached, and 
> no further directories can be created in the file cache (PUBLIC). The jobs 
> start failing with the exception below.
> java.io.IOException: mkdir of /tmp/nm-local-dir/filecache/3901886847734194975 
> failed
>   at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:909)
>   at 
> org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
>   at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
>   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
>   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
>   at 
> org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
>   at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> we need to have a mechanism wherein we can create a directory hierarchy and 
> limit the number of files per directory.



[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-04-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620384#comment-13620384
 ] 

Hadoop QA commented on YARN-467:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12576681/yarn-467-20130402.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/652//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/652//console

This message is automatically generated.



[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-04-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13619376#comment-13619376
 ] 

Hadoop QA commented on YARN-467:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12576466/yarn-467-20130401.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 2 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/641//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/641//console

This message is automatically generated.



[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-04-01 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13619304#comment-13619304
 ] 

Omkar Vinit Joshi commented on YARN-467:


I ran the test on a Mac and got the results below. I think keeping a default 
of 8192 would be good.

||Total Number of files || Total time taken (in millis)||
||32||4||
||64||7||
||128||15||
||256||27||
||512||60||
||1024||120||
||2048||219||
||4096||524||
||8192||1845||
||16384||7332||

I have incorporated all the comments in the latest patch.
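For context, the kind of measurement behind the table above can be sketched as 
below. This is a rough, machine-dependent micro-benchmark using plain java.nio 
and an assumption about methodology, not the actual test code from the patch:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Rough sketch: time the creation of n empty files in a single directory,
// doubling n each round as in the table above. Absolute numbers depend
// heavily on the machine and the underlying file system.
public class DirFileCreationTiming {

  // Create n empty files named f0..f(n-1) under dir; return elapsed millis.
  static long timeCreate(Path dir, int n) throws IOException {
    long start = System.nanoTime();
    for (int i = 0; i < n; i++) {
      Files.createFile(dir.resolve("f" + i));
    }
    return (System.nanoTime() - start) / 1_000_000;
  }

  public static void main(String[] args) throws IOException {
    for (int n = 32; n <= 2048; n *= 2) {  // the thread measured up to 16384
      Path dir = Files.createTempDirectory("filecount-" + n + "-");
      System.out.println(n + " files: " + timeCreate(dir, n) + " ms");
    }
  }
}
```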



[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-04-01 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618997#comment-13618997
 ] 

Siddharth Seth commented on YARN-467:
-

bq. Another thing I've been looking at hard is whether 
LocalResourceTracker.localizationCompleted() can be done away with completely 
in favour of the handle() method. But to do that we need to handle both 
successful and failing localizations via handle(). I can already see a couple 
of bugs related to localization failures, so let's do this separately.

That could be the route to reach the LocalizedResources, instead of sending 
events to them directly. In any case, this can be figured out in the follow-up 
JIRAs.

I had looked at this patch earlier as well; it mostly looks good in terms of 
functionality. It was a little tough to read; hopefully some of the changes 
suggested by Vinod will make that easier. 



[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-03-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616995#comment-13616995
 ] 

Hadoop QA commented on YARN-467:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12576003/yarn-467-20130328.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 2 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/627//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/627//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/627//console

This message is automatically generated.



[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-03-28 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616558#comment-13616558
 ] 

Vinod Kumar Vavilapalli commented on YARN-467:
--

Another thing I've been looking at hard is whether 
LocalResourceTracker.localizationCompleted() can be done away with completely 
in favour of the handle() method. But to do that we need to handle both 
successful and failing localizations via handle(). I can already see a couple 
of bugs related to localization failures, so let's do this separately.



[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-03-27 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13615908#comment-13615908
 ] 

Omkar Vinit Joshi commented on YARN-467:


Adding tests to validate the expected behavior :-

* TestHierarchicalDirectory
** testHierarchicalSubDirectoryCreation :- tests the scenarios below
*** Limiting files per directory to 
YarnConfiguration.NM_LOCAL_CACHE_NUM_FILES_PER_DIRECTORY (a count which 
includes the 36 sub directories).
*** If a file is removed (via a decFileCountForPath call) from any 
subdirectory, those directories are reused in the order in which their state 
changes to DirectoryState.VACANT.
*** Checks path generation up to the 2nd level.
** testMinimumPerDirectoryFileLimit :- tests the behavior when the 
configuration parameter is set to a value <= 36.

* TestLocalResourcesTrackerImpl
** testMinimumPerDirectoryFileLimit :- tests public resources with the 
HierarchicalDirectory structure.



[jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits

2013-03-27 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13615894#comment-13615894
 ] 

Omkar Vinit Joshi commented on YARN-467:


The Underlying problem here is that ResourceLocalization is trying to localize 
files more than the allowed file limit per directory for the underlying local 
file system.

Proposed Solution :- ( For Public resources - localized under :- 
/filecache/ )

We are going to maintain hierarchical directory structure inside the local 
directories for filecache.
so the directory structure will look like this

.../filecache/
.../filecache/<36 directories (0-9 & a-z)>/
.../filecache/<36 directories (0-9 & a-z)>/<36 directories (0-9 & a-z)>
.

So in all every directory will have (8192-36) localized files and 36 sub 
directories named 0-9 and a-z. These sub directories are created only if they 
are required. They will not be created in advance. Likewise every sub directory 
will have similar structure.

Now to manage files and to limit the number of files per directory to 
HierarchicalDirectory#PER_DIR_FILE_LIMIT (in this case 8192) introducing below 
classes / implementation.

* LocalResourcesTrackerImpl :-
** maintainHierarchicalDir  :- a boolean flag. It should be set when you want 
to use this resource tracker to track resources with hierarchical directory 
structure.
** directoryMap :- Map of . It makes sure that we 
have one HierarchicalDirectory for every localPath. ( For example if we have 
two local-dirs configured then it will have 2 entries.)
** inProgressRsrcMap :- Map of . This is used while 
local resource is getting localized. This map helps in two ways
*** If the resource localization fails for that resource then we can retrieve 
the path and remove the file reservation (file count)
*** If the LocalResourceRequest comes again for the same resourcerequest ( 
which is highly unlikely for today's implementation) it can return the same 
path back.
** getPathForLocalResource :- This method should be called to retrieve the 
Hierarchical directory path for the local-dir identified by the localDirPath. 
Internally it adds this request and returned path to inProgressRsrcMap and 
makes a reservation into the HierarchicalDirectory tracking this local-dir-path.
** decFileCountForHierarchicalPath :- It retrieves the localizedPath from 
either inProgressRsrcMap or from LocalizedResource and then reduces file count 
for the HierarchicalDirectory tracking it.
** localizationCompleted :- (Parameter - success) If true then it only updates 
inProgressRsrcMap; otherwise it updates inProgressRsrcMap and also calls 
decFileCountForHierarchicalPath.
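The reservation flow described above could be sketched roughly as follows. This is a simplified illustration, not the actual patch code: resource keys and paths are plain strings, and HierarchicalDirectory is reduced to a reserve/release stub so that only the tracker-side bookkeeping is shown.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch of the LocalResourcesTrackerImpl bookkeeping described
// above; names mirror the design note, but all types here are hypothetical.
public class TrackerSketch {

    // Stand-in for HierarchicalDirectory: only counts outstanding reservations.
    static class Dir {
        int reserved;
        String reserve() { reserved++; return "0"; } // relative-path stub
        void release()   { reserved--; }
    }

    // One HierarchicalDirectory per configured local-dir.
    private final Map<String, Dir> directoryMap = new HashMap<>();
    // Resource key -> {localDir, relative path} while localization is running.
    private final Map<String, String[]> inProgressRsrcMap = new HashMap<>();

    public String getPathForLocalResource(String rsrcKey, String localDir) {
        // Asking again for an in-flight resource returns the same reserved path.
        String[] prev = inProgressRsrcMap.get(rsrcKey);
        if (prev != null) {
            return prev[0] + "/" + prev[1];
        }
        Dir dir = directoryMap.computeIfAbsent(localDir, k -> new Dir());
        String rel = dir.reserve();                  // file-count reservation
        inProgressRsrcMap.put(rsrcKey, new String[] {localDir, rel});
        return localDir + "/" + rel;
    }

    public void localizationCompleted(String rsrcKey, boolean success) {
        String[] entry = inProgressRsrcMap.remove(rsrcKey);
        if (!success && entry != null) {
            // Failed localization: give the reserved file slot back.
            directoryMap.get(entry[0]).release();
        }
    }
}
```

The key point the sketch shows is that a reservation is made eagerly when the path is handed out and is only rolled back on failure, so the file count can never undercount files that are mid-localization.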

* HierarchicalDirectory :- It just helps in managing hierarchical directories.
** PER_DIR_FILE_LIMIT :- It controls the number of files per directory and its 
sub-directories. It is configurable 
(YarnConfiguration.NM_LOCAL_CACHE_NUM_FILES_PER_DIRECTORY) but should not be 
set too low.
** DIRECTORIES_PER_LEVEL (constant 36) :- Every directory/sub-directory will 
have at most 36 sub-directories (0-9 and a-z), created only if they are 
required. The reason for using single-character names is the path-length limit 
on Windows.
** vacantSubDirectories :- Queue of sub-directories that still have room. At 
the beginning this will have the root of the HierarchicalDirectory as the only 
entry. If the queue becomes empty then a new sub-directory will be created, 
starting with 0. Note: this only creates internal tracking for the directory 
and doesn't create an actual directory on the file system.
** knownSubDirectories :- Map of <relative path, sub-directory>. The root 
directory is identified by the empty string "" and the other sub-directories 
by their relative paths, e.g. "0" for directory 0 and "0/a" for 0/a.
** getHierarchicalPath :- (synchronized) This method returns the relative path 
of a sub-directory that is not yet full (has not reached its per-directory 
file limit). If no such sub-directory is present then it creates one, using 
totalSubDirectories.
** decFileCountForPath :- (synchronized) This method reduces the count for the 
HierarchicalSubDirectory representing the passed in relative path.
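A minimal sketch of the vacantSubDirectories bookkeeping described above. Assumptions are labeled: the per-directory limit is taken as a constructor argument purely to keep the example testable, sub-directories are enumerated breadth-first in creation order, and none of this is the actual patch code.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of HierarchicalDirectory's vacancy tracking; names
// follow the design note above, details are illustrative assumptions.
public class HierDirSketch {
    static final int DIRECTORIES_PER_LEVEL = 36;
    private static final String CHARS = "0123456789abcdefghijklmnopqrstuvwxyz";

    private final int filesPerDir;  // PER_DIR_FILE_LIMIT minus 36 sub-dir slots
    private final Deque<String> vacantSubDirectories = new ArrayDeque<>();
    private final Map<String, Integer> knownSubDirectories = new HashMap<>();
    private int totalSubDirectories = 0;

    public HierDirSketch(int perDirFileLimit) {
        this.filesPerDir = perDirFileLimit - DIRECTORIES_PER_LEVEL;
        vacantSubDirectories.add("");       // root, identified by ""
        knownSubDirectories.put("", 0);
    }

    // Relative path of the n-th sub-directory in creation order:
    // "0".."z" on level one, then "0/0", "0/1", ... one level deeper.
    private static String name(int n) {
        if (n < DIRECTORIES_PER_LEVEL) {
            return String.valueOf(CHARS.charAt(n));
        }
        int rest = n - DIRECTORIES_PER_LEVEL;
        return name(rest / DIRECTORIES_PER_LEVEL) + "/"
            + CHARS.charAt(rest % DIRECTORIES_PER_LEVEL);
    }

    // Reserve one file slot; returns the relative directory to localize into.
    public synchronized String getHierarchicalPath() {
        if (vacantSubDirectories.isEmpty()) {
            // Internal tracking only: no directory is created on disk here.
            String created = name(totalSubDirectories++);
            vacantSubDirectories.add(created);
            knownSubDirectories.put(created, 0);
        }
        String dir = vacantSubDirectories.peek();
        int count = knownSubDirectories.get(dir) + 1;
        knownSubDirectories.put(dir, count);
        if (count == filesPerDir) {
            vacantSubDirectories.remove();  // full: stop handing it out
        }
        return dir;
    }

    // Give one slot back, e.g. after a failed or deleted localization.
    public synchronized void decFileCountForPath(String dir) {
        int count = knownSubDirectories.get(dir) - 1;
        knownSubDirectories.put(dir, count);
        if (count == filesPerDir - 1) {
            vacantSubDirectories.add(dir);  // just dropped below the limit
        }
    }
}
```

With a toy limit of 38 (so 2 files per directory after the 36 sub-directory slots), the root "" is handed out twice, then "0", and releasing a slot in "0" makes it vacant again.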


> Jobs fail during resource localization when public distributed-cache hits 
> unix directory limits
> ---
>
> Key: YARN-467
> URL: https://issues.apache.org/jira/browse/YARN-467
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0, 2.0.0-alpha
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: yarn-467-20130322.1.patch, yarn-467-20130322.2.patch, 
> yarn-467-20130322.3.patch, yarn-467-20130322.patch, 
> yarn-467-20130325.1.patch, yarn-467-20130325.path
>
>
> If we have multiple jobs which uses distributed cache with small size of 
> files, the directo