[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1568:
----------------------------------

        Summary: TrackerDistributedCacheManager should clean up cache in a 
background thread  (was: TrackerDistributedCacheManager should do 
deleteLocalPath asynchronously)
    Description: 
Right now the TrackerDistributedCacheManager does the cleanup via the following 
code path:
{code}
TaskRunner.run() -> TrackerDistributedCacheManager.setup() -> 
TrackerDistributedCacheManager.getLocalCache() -> 
TrackerDistributedCacheManager.deleteCache()
{/code}
Deleting the cache files can take a long time, so it should not be done by a 
task. We suggest a separate thread that checks and cleans up the cache 
files.

  was:
TrackerDistributedCacheManager.deleteCache() has been improved:
MAPREDUCE-1302 makes TrackerDistributedCacheManager rename the caches in the 
main thread and then delete them in the background.
MAPREDUCE-1098 avoids global locking while doing the renaming (renaming lots of 
directories can also take a long time).

But the deleteLocalCache is still in the main thread of TaskRunner.run(), so it 
still slows down the task that triggers the deletion (originally this blocked 
all tasks, but that was fixed by MAPREDUCE-1098). Other tasks do not wait 
for the deletion. The task that triggers the deletion should not wait for it 
either. TrackerDistributedCacheManager should do deleteLocalPath() 
asynchronously.



I have changed the title and description of this JIRA to fit our current idea.
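
To make the background-thread idea concrete, here is a minimal sketch. It is
not the attached patch and not Hadoop's actual API: the class
CacheCleanupSketch and its methods acquire(), release(), cleanOnce() and
start() are hypothetical names chosen for illustration, and the reference
counting is an assumption about how "in use" might be tracked.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.stream.Stream;

/** Hypothetical sketch: cache cleanup runs on its own thread, never on a task. */
class CacheCleanupSketch {
    // Local cache directory -> number of tasks currently using it (assumed model).
    private final Map<Path, Integer> refCounts = new HashMap<>();
    private final ScheduledExecutorService cleaner =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "cache-cleanup");
            t.setDaemon(true); // do not keep the JVM alive just for cleanup
            return t;
        });

    /** Starts the periodic check; tasks never call cleanOnce() themselves. */
    void start(long periodMillis) {
        cleaner.scheduleWithFixedDelay(this::cleanOnce,
            periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Task setup path: bookkeeping only, no disk deletion. */
    synchronized void acquire(Path dir) {
        refCounts.merge(dir, 1, Integer::sum);
    }

    /** Task teardown path: bookkeeping only, no disk deletion. */
    synchronized void release(Path dir) {
        refCounts.computeIfPresent(dir, (d, n) -> Math.max(0, n - 1));
    }

    /** One cleanup pass; returns how many unreferenced directories it removed. */
    int cleanOnce() {
        List<Path> victims = new ArrayList<>();
        synchronized (this) { // hold the lock only while choosing victims
            Iterator<Map.Entry<Path, Integer>> it = refCounts.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<Path, Integer> e = it.next();
                if (e.getValue() == 0) {
                    victims.add(e.getKey());
                    it.remove();
                }
            }
        }
        // Slow disk work happens outside the lock. A real implementation would
        // also need to stop a task from re-acquiring a directory mid-deletion,
        // for example by renaming it first; this sketch omits that.
        for (Path dir : victims) {
            deleteRecursively(dir);
        }
        return victims.size();
    }

    private static void deleteRecursively(Path root) {
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order visits children before their parent directory.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.deleteIfExists(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException | UncheckedIOException e) {
            System.err.println("cleanup failed for " + root + ": " + e);
        }
    }
}
```

With this shape, the task path only updates counters, and the time spent
deleting files is paid by the cleanup thread. How often the check runs and
what makes a cache entry eligible for deletion are left open here.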

> TrackerDistributedCacheManager should clean up cache in a background thread
> ---------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1568
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1568
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1568-v2.txt, MAPREDUCE-1568.txt
>
>
> Right now the TrackerDistributedCacheManager does the cleanup via the 
> following code path:
> {code}
> TaskRunner.run() -> TrackerDistributedCacheManager.setup() -> 
> TrackerDistributedCacheManager.getLocalCache() -> 
> TrackerDistributedCacheManager.deleteCache()
> {/code}
> Deleting the cache files can take a long time, so it should not be done 
> by a task. We suggest a separate thread that checks and cleans up the 
> cache files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
