Throttle the deletion of data from the distributed cache
--------------------------------------------------------
Key: MAPREDUCE-2572
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2572
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: distributed-cache
Affects Versions: 0.20.205.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
When deleting entries from the distributed cache we do so in a background
thread. Once the size limit of the distributed cache is reached, all unused
entries are deleted. MAPREDUCE-2494 changes this so that entries are deleted
in LRU order until the usage falls below a given threshold. In either of these
cases we periodically flood a disk with delete requests, which can slow
down all I/O operations on that drive. It would be better to throttle
this deletion so that it is spread out over a longer period of time. This JIRA
is to add that throttling.
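As a rough illustration of the idea only (this is not the proposed patch, and
it does not use Hadoop's own classes), throttled deletion could look something
like the sketch below. The class name ThrottledDeleter, the method
deleteThrottled, and the batchSize and pauseMillis parameters are all
hypothetical; the sketch also assumes plain local files and java.nio.file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of Hadoop: spreads file deletions over time
// instead of issuing them all at once.
public class ThrottledDeleter {

    /**
     * Deletes the given files in order, pausing after every batchSize
     * deletions so the disk is not flooded with delete requests.
     *
     * @param paths       files to delete (assumed to be regular files)
     * @param batchSize   number of delete attempts between pauses (assumed knob)
     * @param pauseMillis how long to sleep between batches (assumed knob)
     * @return the number of files that were actually removed
     */
    public static int deleteThrottled(List<Path> paths, int batchSize, long pauseMillis)
            throws IOException, InterruptedException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int deleted = 0;
        for (int i = 0; i < paths.size(); i++) {
            // Pause before starting each batch after the first one.
            if (i > 0 && i % batchSize == 0) {
                Thread.sleep(pauseMillis);
            }
            // deleteIfExists returns false when the file was already gone.
            if (Files.deleteIfExists(paths.get(i))) {
                deleted++;
            }
        }
        return deleted;
    }
}
```

With LRU ordering in place, the list passed in would simply be the least
recently used entries, and the two parameters would control how aggressively
the background thread reclaims space.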
On investigation, it seems much simpler to backport MAPREDUCE-2494 to 20S before
implementing this change rather than try to implement it without LRU deletion,
because LRU goes a long way towards reducing the load on the disk anyway.