[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151528#comment-15151528
 ] 

Karthik Kambatla commented on MAPREDUCE-6622:
---------------------------------------------

Sorry for the delay in getting to this. Took a quick look at the patch; some 
high-level comments:
# Since this is the JobHistoryServer and no client code runs in it, I am 
comfortable with us using a Guava cache. I am aware of Ray's attempts at doing 
the same with the previous LinkedHashMap, and it was fairly involved. I'll let 
him comment on the details here. 
# Since the cache automatically evicts jobs when it loads a new job, I don't 
see the need for the cleanup thread. Having an additional cleanup thread kind 
of goes against the reason we are using the Guava cache in the first place: 
simplicity. We could choose time-based eviction if we want; there is also a 
"ticker" we could use to test it (see the sketch after this list). 
# Is it okay not to honor {{loadedjobs.cache.size}} when 
{{loadedtasks.cache.size}} is specified? [~jlowe] - do you remember if the 
number of jobs was being used as a proxy for memory usage? 
# What happens when a job with more tasks than the allowed value needs to be 
loaded? Can we add a test to verify that case? 

> Add capability to set JHS job cache to a task-based limit
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-6622
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobhistoryserver
>    Affects Versions: 2.7.2
>            Reporter: Ray Chiang
>            Assignee: Ray Chiang
>              Labels: supportability
>         Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size, the jobs 
> can be of varying size.  This is generally not a problem when the job sizes 
> are uniform or small, but when the jobs can be very large (say, greater than 
> 250k tasks), the JHS heap size can grow tremendously.
> In cases where multiple jobs are very large, the JHS can lock up and spend 
> all its time in GC.  However, since the cache is holding on to all the jobs, 
> not much heap space can be freed up.
> Setting a property that caps the number of tasks allowed in the cache should 
> help prevent the JHS from locking up, since the total number of tasks loaded 
> is directly proportional to the amount of heap used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)