[
https://issues.apache.org/jira/browse/MAPREDUCE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033083#comment-13033083
]
Robert Joseph Evans commented on MAPREDUCE-2479:
------------------------------------------------
I backported the patch as is, but it has a few issues that I think should be
addressed, most likely in a separate JIRA so that trunk and security stay in
sync.
The background thread cleans a directory once it goes over the size limit, and
it then removes all entries in the directory that are not currently in use. It
would seem more logical to have it clean only a subset of the data in an LRU
manner, with the goal of leaving the cache only X% full, where X is
configurable. Also, the background thread is not monitored in any way. On top
of this, individual jobs no longer validate that the cache has not grown too
large, so if the thread dies for any reason we may fill up all disks on the
node with distributed cache data. Not that this is likely to happen, but good
defensive programming would dictate that we at least monitor the thread and
restart it if it has failed.
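
As a rough illustration of the two suggestions above, here is a minimal Java
sketch. It is not taken from the patch: the class CacheCleanupSketch, its Entry
type, and the methods selectForEviction and ensureRunning are all hypothetical
names, and the real tasktracker code tracks cache entries differently. The
first method picks least-recently-used entries that are not in use until the
cache would drop to a configurable fraction of its limit; the second restarts
the cleanup thread if it is found dead.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names exist in Hadoop.
public class CacheCleanupSketch {

    /** Simplified stand-in for a localized distributed cache entry. */
    public static final class Entry {
        final String path;
        final long sizeBytes;
        final long lastUsedMillis;
        final boolean inUse;

        public Entry(String path, long sizeBytes, long lastUsedMillis, boolean inUse) {
            this.path = path;
            this.sizeBytes = sizeBytes;
            this.lastUsedMillis = lastUsedMillis;
            this.inUse = inUse;
        }
    }

    /**
     * Returns the paths to delete, oldest first. Nothing is selected while the
     * cache is at or under limitBytes. Once over the limit, unused entries are
     * selected in LRU order until the total is at or below
     * limitBytes * targetFraction (the "X% full" goal), or candidates run out.
     */
    public static List<String> selectForEviction(
            List<Entry> entries, long limitBytes, double targetFraction) {
        long total = 0;
        for (Entry e : entries) {
            total += e.sizeBytes;
        }
        List<String> victims = new ArrayList<>();
        if (total <= limitBytes) {
            return victims;
        }
        long target = (long) (limitBytes * targetFraction);

        // Only entries that no running task references may be removed.
        List<Entry> candidates = new ArrayList<>();
        for (Entry e : entries) {
            if (!e.inUse) {
                candidates.add(e);
            }
        }
        candidates.sort(Comparator.comparingLong((Entry e) -> e.lastUsedMillis));

        for (Entry e : candidates) {
            if (total <= target) {
                break;
            }
            victims.add(e.path);
            total -= e.sizeBytes;
        }
        return victims;
    }

    /**
     * Watchdog step: returns the current thread if it is still alive,
     * otherwise starts and returns a replacement built by the factory.
     */
    public static Thread ensureRunning(Thread current, Supplier<Thread> factory) {
        if (current != null && current.isAlive()) {
            return current;
        }
        Thread replacement = factory.get();
        replacement.setDaemon(true);
        replacement.start();
        return replacement;
    }
}
```

A caller would invoke ensureRunning periodically, for example from an existing
heartbeat loop, and keep the returned thread as the new reference. Whether the
per-job size check should also be restored as a second line of defense is a
separate question that this sketch does not address.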
> Backport MAPREDUCE-1568 to hadoop security branch
> -------------------------------------------------
>
> Key: MAPREDUCE-2479
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2479
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tasktracker
> Affects Versions: 0.20.204.0
> Reporter: Robert Joseph Evans
> Assignee: Robert Joseph Evans
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2479-v1.patch
>
>
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira