Thanks Harsh. My issue was not related to the number of files/folders but to the total size of the DistributedCache. The directory where it's stored only has 7 GB available... So I will set the limit to 5 GB with local.cache.size, or move it to the drives where I have the dfs files stored.
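
Something like this in each TaskTracker's mapred-site.xml is what I have in mind (just a sketch; local.cache.size is measured in bytes, so 5 GB is 5368709120, and I assume the TaskTrackers need a restart to pick the change up):

  <property>
    <name>local.cache.size</name>
    <!-- 5 GB in bytes; the default is 10737418240 (10 GB) -->
    <value>5368709120</value>
  </property>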
Thanks,

JM

2013/3/28 Harsh J <[email protected]>:
> The DistributedCache is cleaned automatically and no user intervention
> (aside from size-limit changes, which may be an administrative
> requirement) is generally required to delete the older distributed
> cache files.
>
> This is observable in the code and is also noted in TDG, 2nd ed.:
>
> Tom White:
> """
> The tasktracker also maintains a reference count for the number of
> tasks using each file in the cache. Before the task has run, the
> file's reference count is incremented by one; then after the task has
> run, the count is decreased by one. Only when the count reaches zero
> is it eligible for deletion, since no tasks are using it. Files are
> deleted to make room for a new file when the cache exceeds a certain
> size—10 GB by default. The cache size may be changed by setting the
> configuration property local.cache.size, which is measured in bytes.
> """
>
> Also, the maximum number of allowed dirs is checked automatically
> today, so as not to violate the OS's limits.
>
> On Wed, Mar 27, 2013 at 7:07 PM, Jean-Marc Spaggiari
> <[email protected]> wrote:
>> Oh! Good to know! It keeps track even of month-old entries??? Is there no
>> TTL?
>>
>> I was not able to find the documentation for local.cache.size or
>> mapreduce.tasktracker.cache.local.size in the 1.0.x branch. Do you know
>> where I can find it?
>>
>> Thanks,
>>
>> JM
>>
>> 2013/3/27 Koji Noguchi <[email protected]>:
>>>> Else, I will go for a custom script to delete all directories (and
>>>> content) older than 2 or 3 days…
>>>>
>>> The TaskTracker (or NodeManager in 2.*) keeps the list of dist cache entries in
>>> memory.
>>> So if an external process (like your script) starts deleting dist cache files,
>>> there will be an inconsistency and you'll start seeing task initialization
>>> failures due to file-not-found errors.
>>>
>>> Koji
>>>
>>>
>>> On Mar 26, 2013, at 9:00 PM, Jean-Marc Spaggiari wrote:
>>>
>>>> For the situation I faced, it was really a disk-space issue, not related
>>>> to the number of files. It was writing on a small partition.
>>>>
>>>> I will try local.cache.size or
>>>> mapreduce.tasktracker.cache.local.size to see if I can keep the final
>>>> total size under 5 GB... Else, I will go for a custom script to
>>>> delete all directories (and content) older than 2 or 3 days...
>>>>
>>>> Thanks,
>>>>
>>>> JM
>>>>
>>>> 2013/3/26 Abdelrahman Shettia <[email protected]>:
>>>>> Let me clarify: if there are lots of files or directories, up to 32K
>>>>> (depending on the OS's per-user file limits), in those distributed cache
>>>>> dirs, the OS will not be able to create any more files/dirs, so M-R jobs
>>>>> won't get initiated on those tasktracker machines. Hope this helps.
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Tue, Mar 26, 2013 at 1:44 PM, Vinod Kumar Vavilapalli
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> All the files are never opened at the same time, so you shouldn't see
>>>>>> any "# of open files exceeded" error.
>>>>>>
>>>>>> Thanks,
>>>>>> +Vinod Kumar Vavilapalli
>>>>>> Hortonworks Inc.
>>>>>> http://hortonworks.com/
>>>>>>
>>>>>> On Mar 26, 2013, at 12:53 PM, Abdelrhman Shettia wrote:
>>>>>>
>>>>>> Hi JM,
>>>>>>
>>>>>> Actually these dirs need to be purged by a script that keeps the last 2
>>>>>> days' worth of files; otherwise you may run into a "# of open files
>>>>>> exceeded" error.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Mar 25, 2013, at 5:16 PM, Jean-Marc Spaggiari
>>>>>> <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Each time my MR job is run, a directory is created on the TaskTracker
>>>>>> under mapred/local/taskTracker/hadoop/distcache (based on my
>>>>>> configuration).
>>>>>>
>>>>>> I looked at the directory today, and it's hosting thousands of
>>>>>> directories and more than 8GB of data there.
>>>>>>
>>>>>> Is there a way to automatically delete this directory when the job is
>>>>>> done?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> JM
>>>>>
>>>
>
>
>
> --
> Harsh J
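
The eviction mechanism described in the TDG excerpt above can be sketched as a small, self-contained toy model (this is NOT the actual TaskTracker code; the class name, paths, and sizes below are invented for illustration):

import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the behavior in the TDG excerpt: each cached file carries
// a reference count of running tasks, and a file is deleted only when its
// count is zero and the cache exceeds its size limit (local.cache.size,
// 10 GB by default).
public class DistCacheModel {

    private static class Entry {
        final long size;
        int refCount;
        Entry(long size) { this.size = size; }
    }

    private final long maxSize;   // analogous to local.cache.size (bytes)
    private long totalSize;
    // Insertion order approximates oldest-first eviction.
    private final Map<String, Entry> cache = new LinkedHashMap<String, Entry>();

    DistCacheModel(long maxSize) { this.maxSize = maxSize; }

    // Before a task runs: localize the file (if new) and bump its ref count.
    synchronized void acquire(String path, long size) {
        Entry e = cache.get(path);
        if (e == null) {
            e = new Entry(size);
            cache.put(path, e);
            totalSize += size;
        }
        e.refCount++;             // a nonzero count protects it from eviction
        evictIfNeeded();
    }

    // After a task finishes: drop the ref count; the file stays cached.
    synchronized void release(String path) {
        Entry e = cache.get(path);
        if (e != null && e.refCount > 0) e.refCount--;
    }

    // Delete zero-reference entries until the cache fits under maxSize.
    // This bookkeeping is why an external cleanup script causes trouble:
    // the in-memory state would no longer match what is on disk.
    private void evictIfNeeded() {
        Iterator<Map.Entry<String, Entry>> it = cache.entrySet().iterator();
        while (totalSize > maxSize && it.hasNext()) {
            Map.Entry<String, Entry> me = it.next();
            if (me.getValue().refCount == 0) {
                totalSize -= me.getValue().size;
                it.remove();      // the real code also deletes the local dir
            }
        }
    }

    public static void main(String[] args) {
        DistCacheModel c = new DistCacheModel(5L * 1024 * 1024 * 1024); // 5 GB
        c.acquire("/distcache/job1/lookup.dat", 3L << 30); // 3 GB, in use
        c.release("/distcache/job1/lookup.dat");           // job 1 done
        c.acquire("/distcache/job2/model.bin", 4L << 30);  // 7 GB total now,
                                                           // evicts job 1's file
    }
}

Note how release() deletes nothing: a file whose count has dropped to zero merely becomes eligible for eviction, which is why month-old entries can legitimately stick around until the size limit forces them out.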
