[
https://issues.apache.org/jira/browse/MAPREDUCE-2572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070529#comment-13070529
]
Robert Joseph Evans commented on MAPREDUCE-2572:
------------------------------------------------
I have updated the patch in MAPREDUCE-2494 for 0.20.205 to have 0.95 as the
default, so I will only submit a patch for trunk and 0.22.
> Throttle the deletion of data from the distributed cache
> --------------------------------------------------------
>
> Key: MAPREDUCE-2572
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2572
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: distributed-cache
> Affects Versions: 0.20.205.0
> Reporter: Robert Joseph Evans
> Assignee: Robert Joseph Evans
> Attachments: THROTTLING-security-v1.patch
>
>
> Entries are deleted from the distributed cache in a background thread. Once
> the size limit of the distributed cache is reached, all unused entries are
> deleted. MAPREDUCE-2494 changes this so that entries are deleted in LRU
> order until the usage falls below a given threshold. In either case, we
> periodically flood a disk with delete requests, which can slow down all I/O
> operations on that drive. It would be better to throttle this deletion so
> that it is spread out over a longer period of time. This JIRA is to add
> that throttling.
> On investigation, it seems much simpler to backport MAPREDUCE-2494 to 20S
> before implementing this change rather than trying to implement it without
> LRU deletion, because LRU goes a long way towards reducing the load on the
> disk anyway.
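
The behaviour described above (delete in LRU order until usage falls below a threshold, but spread the deletes out over time) might be sketched roughly as follows. This is only an illustration, not the code from either patch: the class ThrottledLruCleaner, its fields, and the "at most N deletes per pass" throttle are all hypothetical, and treating 0.95 as a fraction of the cache size limit is an assumption about what the default means. Reference counting of in-use entries is left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; not the Hadoop distributed cache implementation. */
final class ThrottledLruCleaner {
    // Access-ordered map: iteration runs from least to most recently used.
    private final LinkedHashMap<String, Long> entries =
            new LinkedHashMap<>(16, 0.75f, true);
    private final long capacityBytes;
    private final double targetFraction;  // assumed meaning: fraction of capacity to keep
    private final int maxDeletesPerPass;  // assumed throttle: cap on deletes per pass
    private long usedBytes;

    ThrottledLruCleaner(long capacityBytes, double targetFraction, int maxDeletesPerPass) {
        this.capacityBytes = capacityBytes;
        this.targetFraction = targetFraction;
        this.maxDeletesPerPass = maxDeletesPerPass;
    }

    /** Records a cache entry and its size; re-adding a path replaces its size. */
    void put(String path, long sizeBytes) {
        Long old = entries.put(path, sizeBytes);
        usedBytes += sizeBytes - (old == null ? 0L : old);
    }

    /** Marks an entry as recently used by moving it to the end of the LRU order. */
    void touch(String path) {
        entries.get(path);
    }

    long usedBytes() {
        return usedBytes;
    }

    /**
     * Runs one throttled pass: removes least recently used entries until usage
     * is at or below the target or the per-pass cap is hit, and returns the
     * paths that a caller would then delete from disk.
     */
    List<String> cleanOnce() {
        List<String> toDelete = new ArrayList<>();
        long targetBytes = (long) (capacityBytes * targetFraction);
        Iterator<Map.Entry<String, Long>> it = entries.entrySet().iterator();
        while (usedBytes > targetBytes
                && toDelete.size() < maxDeletesPerPass
                && it.hasNext()) {
            Map.Entry<String, Long> oldest = it.next();
            usedBytes -= oldest.getValue();
            toDelete.add(oldest.getKey());
            it.remove();
        }
        return toDelete;
    }
}
```

A background thread could call cleanOnce() on a fixed interval, so that a large backlog is worked off a few entries at a time instead of in one burst. A rate limit in bytes per second would be another reasonable way to throttle.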