[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15123973#comment-15123973
 ] 

Jason Lowe commented on MAPREDUCE-6622:
---------------------------------------

My main point was that I don't think it would be hard to avoid both the 
undesirable extra Guava dependency and not-quite-deterministic behavior of the 
cache.  Seems like it would be pretty straightforward to maintain the cache 
ourselves with the LinkedHashMap to help track what hasn't been used recently.  
Did you look into that approach and abandon it?

> Add capability to set JHS job cache to a task-based limit
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-6622
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobhistoryserver
>    Affects Versions: 2.7.2
>            Reporter: Ray Chiang
>            Assignee: Ray Chiang
>              Labels: supportability
>         Attachments: MAPREDUCE-6622.001.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to