[ https://issues.apache.org/jira/browse/MAPREDUCE-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860350#action_12860350 ]
Scott Chen commented on MAPREDUCE-1568:
---------------------------------------

Hey Amareshwari,

deleteCache will first take the global lock over all caches and put the entries with a zero reference count into toBeDeleted (this was done by you guys in MAPREDUCE-1098). The asynchronous deletion starts from there.

When the deletion condition is valid, only one task will get the global lock, and after it comes out of the global lock the deletion condition will no longer be valid. So there cannot be two threads deleting the same set of caches at the same moment.

{code}
private void deleteCache(Configuration conf) throws IOException {
  Collection<CacheStatus> toBeDeleted = new LinkedList<CacheStatus>();
  synchronized (cachedArchives) { // Global lock of all caches
    // Find CacheStatus entries with a refcount of zero and put them into toBeDeleted
  }
  // Do the deletion asynchronously, after releasing the global lock
  ...
  cacheFileCleaner.start();
}
{code}

A separate cleanup thread is another option. I think that would work fine as well, but it would require more changes. I think the good thing about the current patch is that it is simple and safe.

> TrackerDistributedCacheManager should do deleteLocalPath asynchronously
> -----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1568
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1568
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1568.txt
>
>
> TrackerDistributedCacheManager.deleteCache() has been improved:
> MAPREDUCE-1302 makes TrackerDistributedCacheManager rename the caches in the
> main thread and then delete them in the background.
> MAPREDUCE-1098 avoids global locking while doing the renaming (renaming lots of
> directories can also take a long time).
> But the deleteLocalCache is still in the main thread of TaskRunner.run(), so
> it will still slow down the task which triggers the deletion (originally this
> blocked all tasks, but that was fixed by MAPREDUCE-1098). Other tasks do not
> wait for the deletion. The task which triggers the deletion should not wait
> for it either. TrackerDistributedCacheManager should do deleteLocalPath()
> asynchronously.
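
For readers following the locking argument in the comment, here is a minimal, self-contained sketch of the same pattern: collect the zero-refcount entries while holding the global lock, then delete them on a background thread after the lock is released. This is not the Hadoop code. The class AsyncCacheCleanupSketch, the Entry type, and the helper methods are hypothetical stand-ins. Removing the selected entries from the map while still holding the lock is an assumption about how a second caller is prevented from selecting the same entries, and the "deletion" only records the path instead of touching the file system.

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for TrackerDistributedCacheManager; illustration only. */
class AsyncCacheCleanupSketch {

  /** Hypothetical stand-in for CacheStatus. */
  static final class Entry {
    final String path;
    final int refCount;
    Entry(String path, int refCount) {
      this.path = path;
      this.refCount = refCount;
    }
  }

  // The map doubles as the global lock, as cachedArchives does in the excerpt.
  private final Map<String, Entry> cachedArchives = new HashMap<>();
  // Records "deleted" paths so the behaviour can be observed without real files.
  private final List<String> deletedPaths = new ArrayList<>();

  void addCache(String path, int refCount) {
    synchronized (cachedArchives) {
      cachedArchives.put(path, new Entry(path, refCount));
    }
  }

  /** Selects entries under the lock, deletes them off the caller's thread. */
  Thread deleteCache() {
    final List<Entry> toBeDeleted = new ArrayList<>();
    synchronized (cachedArchives) { // global lock of all caches
      Iterator<Entry> it = cachedArchives.values().iterator();
      while (it.hasNext()) {
        Entry e = it.next();
        if (e.refCount == 0) {
          toBeDeleted.add(e);
          it.remove(); // assumption: a later caller can no longer see this entry
        }
      }
    }
    // The slow part runs after the global lock has been released.
    Thread cacheFileCleaner = new Thread(() -> {
      for (Entry e : toBeDeleted) {
        deleteLocalPath(e.path);
      }
    }, "cacheFileCleaner");
    cacheFileCleaner.start();
    return cacheFileCleaner; // returned only so a demo can wait for it
  }

  // Stand-in for the real file deletion: just remembers the path.
  private void deleteLocalPath(String path) {
    synchronized (deletedPaths) {
      deletedPaths.add(path);
    }
  }

  List<String> deletedPaths() {
    synchronized (deletedPaths) {
      return new ArrayList<>(deletedPaths);
    }
  }

  /** Waits for a cleaner thread without exposing the checked exception. */
  static void awaitCleaner(Thread t) {
    try {
      t.join();
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
  }
}
{code}

In this sketch, two back-to-back calls to deleteCache() cannot hand the same entry to two cleaner threads, because the first call removes the entry from the map before it releases the lock. The calling thread only pays for the scan under the lock, not for the deletion itself.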