[
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15123892#comment-15123892
]
Ray Chiang commented on MAPREDUCE-6622:
---------------------------------------
bq. The documentation for the existing property for limiting based on job count
needs to be updated to mention it is ignored if the new property is set.
Will do.
bq. Seems like it would be straightforward to pull out old jobs until we've
freed up enough tasks to pay for the job trying to be added to the cache. Now
we have a background thread with a new property to configure how often it
cleans – does the cache blow way out of proportion if we don't clean up fast
enough (e.g.: history server gets programmatically hammered for many jobs)?
Adding yet another guava dependency isn't appealing unless really necessary. If
we are sticking with the guava cache, is it essential to call cleanUp in a
background thread or won't this cleanup automatically happen as new jobs are
loaded into the cache?
I ended up having to call cleanUp() in order to get the unit tests to pass, but
those admittedly run in a very short amount of time. There's definitely some
lack of determinism in that size() returns an approximate size (according to
the documentation). I'd say my biggest concern is that GC alone, without any
explicit cache churn (i.e. users clicking on job links), won't force the cache
to clean up when you have several really large jobs, but the background thread
would.
I could change the cache setting to mean the following (a rough sketch follows
the list):
- -1: never call cleanUp() explicitly
- 0: always call cleanUp() explicitly after each write
- >0: run cleanUp() from a periodic background thread
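To make that concrete, here is a minimal sketch of a task-weighted Guava cache supporting those three modes. This is illustrative only, not the actual patch: the class name, constructor parameters, and the LoadedJob interface are hypothetical stand-ins.
{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.Weigher;

// Sketch only: a loaded-job cache capped by total task count rather than
// job count.  LoadedJob and all other names here are hypothetical stand-ins.
public class TaskWeightedJobCache {
  private final Cache<String, LoadedJob> cache;
  private final long cleanUpIntervalMs;
  private ScheduledExecutorService cleaner;   // shutdown omitted for brevity

  public TaskWeightedJobCache(long maxTotalTasks, long cleanUpIntervalMs) {
    this.cleanUpIntervalMs = cleanUpIntervalMs;
    this.cache = CacheBuilder.newBuilder()
        .maximumWeight(maxTotalTasks)
        .weigher(new Weigher<String, LoadedJob>() {
          @Override
          public int weigh(String jobId, LoadedJob job) {
            return job.getTotalTasks();   // weight = number of tasks in the job
          }
        })
        .build();

    if (cleanUpIntervalMs > 0) {
      // >0: run cleanUp() from a periodic background thread
      cleaner = Executors.newSingleThreadScheduledExecutor();
      cleaner.scheduleWithFixedDelay(new Runnable() {
        @Override
        public void run() {
          cache.cleanUp();
        }
      }, cleanUpIntervalMs, cleanUpIntervalMs, TimeUnit.MILLISECONDS);
    }
  }

  public void put(String jobId, LoadedJob job) {
    cache.put(jobId, job);
    if (cleanUpIntervalMs == 0) {
      // 0: always call cleanUp() explicitly after each write
      cache.cleanUp();
    }
    // -1: never call cleanUp() explicitly; rely on normal cache activity
  }

  /** Hypothetical placeholder for the history server's loaded-job type. */
  public interface LoadedJob {
    int getTotalTasks();
  }
}
{code}
The point of the weigher is that maximumWeight() caps the sum of the weights, so weighing each job by its task count turns the limit into a total-task cap instead of a job-count cap.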
And I didn't like the move to Guava either, but on the bright side, it looks
like Java 8 ConcurrentHashMap+lambdas can mimic a basic cache. Some of the
less sophisticated usages can move to that when JDK8 becomes the baseline.
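For example, something like this could cover the simpler usages once JDK8 is the baseline (again just a sketch; the class and loader are made up for illustration):
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal "cache" built on Java 8 ConcurrentHashMap + lambdas.  It only
// handles compute-if-absent and explicit invalidation, with no size or
// weight limits, so it only fits the less sophisticated usages.
public class SimpleComputingCache<K, V> {
  private final Map<K, V> map = new ConcurrentHashMap<>();
  private final Function<K, V> loader;

  public SimpleComputingCache(Function<K, V> loader) {
    this.loader = loader;
  }

  public V get(K key) {
    // Atomically loads and stores the value on first access.
    return map.computeIfAbsent(key, loader);
  }

  public void invalidate(K key) {
    map.remove(key);
  }
}
{code}
Usage would look like new SimpleComputingCache<>(jobId -> loadJob(jobId)), where loadJob is a hypothetical loader function.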
> Add capability to set JHS job cache to a task-based limit
> ---------------------------------------------------------
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: jobhistoryserver
> Affects Versions: 2.7.2
> Reporter: Ray Chiang
> Assignee: Ray Chiang
> Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size, the jobs
> in the cache can be of varying size. This is generally not a problem when the
> job sizes are uniform or small, but when job sizes can be very large (say,
> greater than 250k tasks), the JHS heap usage can grow tremendously.
> In cases where multiple jobs are very large, the JHS can lock up and
> spend all its time in GC. However, since the cache is holding on to all the
> jobs, not much heap space can be freed up.
> Since the total number of tasks loaded is directly proportional to the amount
> of heap used, adding a property that caps the number of tasks allowed in the
> cache should help prevent the JHS from locking up.