[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-03-06 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182650#comment-15182650
 ] 

Ray Chiang commented on MAPREDUCE-6622:
---

Thanks Zhihai!

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Critical
>  Labels: supportability
> Fix For: 2.8.0, 2.7.3, 2.9.0, 2.6.5
>
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, 
> MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch, MAPREDUCE-6622.011.patch, 
> MAPREDUCE-6622.012.patch, MAPREDUCE-6622.013.patch, MAPREDUCE-6622.014.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-03-06 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182528#comment-15182528
 ] 

zhihai xu commented on MAPREDUCE-6622:
--

Committed it to 2.8 also.

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Critical
>  Labels: supportability
> Fix For: 2.8.0, 2.7.3, 2.9.0, 2.6.5
>
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, 
> MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch, MAPREDUCE-6622.011.patch, 
> MAPREDUCE-6622.012.patch, MAPREDUCE-6622.013.patch, MAPREDUCE-6622.014.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-03-06 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182518#comment-15182518
 ] 

zhihai xu commented on MAPREDUCE-6622:
--

Committed it to both branch 2.6 and 2.7.

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Critical
>  Labels: supportability
> Fix For: 2.9.0
>
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, 
> MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch, MAPREDUCE-6622.011.patch, 
> MAPREDUCE-6622.012.patch, MAPREDUCE-6622.013.patch, MAPREDUCE-6622.014.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-03-03 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15179080#comment-15179080
 ] 

zhihai xu commented on MAPREDUCE-6622:
--

Thanks for the confirmation [~rchiang]! I will back port the patch to 2.6.5 and 
2.7.3 branch after waiting for several days if no one objects.

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Critical
>  Labels: supportability
> Fix For: 2.9.0
>
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, 
> MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch, MAPREDUCE-6622.011.patch, 
> MAPREDUCE-6622.012.patch, MAPREDUCE-6622.013.patch, MAPREDUCE-6622.014.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-29 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172155#comment-15172155
 ] 

Ray Chiang commented on MAPREDUCE-6622:
---

Good observation, Zhihai.

Since it's an optional setting that's off by default, I'd be fine with adding 
it to the 2.6/2.7 line.


> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Critical
>  Labels: supportability
> Fix For: 2.9.0
>
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, 
> MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch, MAPREDUCE-6622.011.patch, 
> MAPREDUCE-6622.012.patch, MAPREDUCE-6622.013.patch, MAPREDUCE-6622.014.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-27 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170516#comment-15170516
 ] 

zhihai xu commented on MAPREDUCE-6622:
--

This patch also fixed a memory leak issue due to a race condition at 
{{CachedHistoryStorage.getFullJob}}, We can reproduce this memory leak issue by 
keeping refreshing the JHS web page for a job with more than 40,000 mappers 
quickly. The race condition is {{fileInfo.loadJob()}} takes long time to load 
the job with more than 4 mappers, during that time, {{fileInfo.loadJob()}} 
is called multiple times for the same job because no synchronization between 
{{loadedJobCache.get(jobId)}} and {{loadJob(fileInfo)}}. You will see the used 
heap memory quickly go up. Looked at the heap dump, we find 56 {{CompletedJob}} 
instances for the same job ID, which have total more 2 million 
mappers(56*4). Based on the link from 
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/cache/CacheBuilder.html#build(com.google.common.cache.CacheLoader)
This won't be an issue for com.google.common.cache.LoadingCache:
{code}
If another thread is currently loading the value for this key, simply waits for 
that thread to finish and returns its loaded value
{code}
This looks like a critical issue for me. Should we backport this patch to 2.7.3 
and 2.6.5 branch?


> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Fix For: 2.9.0
>
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, 
> MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch, MAPREDUCE-6622.011.patch, 
> MAPREDUCE-6622.012.patch, MAPREDUCE-6622.013.patch, MAPREDUCE-6622.014.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170289#comment-15170289
 ] 

Hudson commented on MAPREDUCE-6622:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9380 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9380/])
MAPREDUCE-6622. Add capability to set JHS job cache to a task-based (rkanter: 
rev 0f72da7e281376f4fcbfbf3fb33f5d7fedcdb1aa)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/jobhistory/JHAdminConfig.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestJobHistory.java
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/CachedHistoryStorage.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml


> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Fix For: 2.9.0
>
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, 
> MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch, MAPREDUCE-6622.011.patch, 
> MAPREDUCE-6622.012.patch, MAPREDUCE-6622.013.patch, MAPREDUCE-6622.014.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-26 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170271#comment-15170271
 ] 

Robert Kanter commented on MAPREDUCE-6622:
--

+1

Will commit this shortly.  The findbugs and checkstyle are very trivial.  I'll 
just fix those during commit so we don't have to go through another iteration 
for just them.

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, 
> MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch, MAPREDUCE-6622.011.patch, 
> MAPREDUCE-6622.012.patch, MAPREDUCE-6622.013.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170250#comment-15170250
 ] 

Hadoop QA commented on MAPREDUCE-6622:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 12m 19s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 26s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 40s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
4s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 25s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 38s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 21s 
{color} | {color:red} hadoop-mapreduce-project/hadoop-mapreduce-client: patch 
generated 1 new + 47 unchanged - 0 fixed = 48 total (was 47) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 44s 
{color} | {color:red} 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 52s 
{color} | {color:green} hadoop-mapreduce-client-core in the patch passed with 
JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 38s 
{color} | {color:green} hadoop-mapreduce-client-common in the patch passed with 
JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 33s 
{color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK 
v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 14s 
{color} | 

[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-26 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170129#comment-15170129
 ] 

Karthik Kambatla commented on MAPREDUCE-6622:
-

Latest patch looks good to me. +1. Thanks for your patience through the 
reviews, [~rchiang]. 

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, 
> MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch, MAPREDUCE-6622.011.patch, 
> MAPREDUCE-6622.012.patch, MAPREDUCE-6622.013.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170105#comment-15170105
 ] 

Hadoop QA commented on MAPREDUCE-6622:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 2m 19s 
{color} | {color:red} Docker failed to build yetus/hadoop:0ca8df7. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12790250/MAPREDUCE-6622.012.patch
 |
| JIRA Issue | MAPREDUCE-6622 |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6352/console |
| Powered by | Apache Yetus 0.2.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, 
> MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch, MAPREDUCE-6622.011.patch, 
> MAPREDUCE-6622.012.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-26 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170093#comment-15170093
 ] 

Karthik Kambatla commented on MAPREDUCE-6622:
-

The latest patch does fix the behavior, nice. 

Minor nits:
# I suppose the reason we are setting the message for HSFileException is to log 
it. However, the code doesn't log anything. We should do one of the two: 
## Log before throwing HSFileException and don't set the message.
## Log when we catch HSFileException.
# The test uses wildcard imports - let us avoid that.


> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, 
> MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch, MAPREDUCE-6622.011.patch, 
> MAPREDUCE-6622.012.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-25 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168563#comment-15168563
 ] 

Ray Chiang commented on MAPREDUCE-6622:
---

Ugh, never mind.  Had two windows opened and one not refreshed.

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, 
> MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch, MAPREDUCE-6622.011.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168408#comment-15168408
 ] 

Hadoop QA commented on MAPREDUCE-6622:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 39s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 40s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 53s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
8s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 41s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 48s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 7s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 6s {color} | 
{color:red} hadoop-mapreduce-client-core in the patch failed with JDK 
v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 43s 
{color} | {color:green} hadoop-mapreduce-client-common in the patch passed with 
JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 55s 
{color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK 
v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 25s 
{color} | {color:green} hadoop-mapreduce-client-core in the patch passed with 
JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 50s 
{color} | {color:green} 

[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-25 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168216#comment-15168216
 ] 

Karthik Kambatla commented on MAPREDUCE-6622:
-

Almost there. One comment:
# Should we use a more specific Exception in loadJob(). We don't distinguish 
between IOException due to failure to read (fileInfo.loadJob) and the 
null/deleted file. We should distinguish those so we can throw 
YanrRuntimeException if IOException.

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, 
> MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-25 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168112#comment-15168112
 ] 

Ray Chiang commented on MAPREDUCE-6622:
---

RE: Failing unit test

MAPREDUCE-6625 has been filed for testCLI#testGetJob()

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, 
> MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168062#comment-15168062
 ] 

Hadoop QA commented on MAPREDUCE-6622:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 42s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
4s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 28s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 28s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 41s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 55s {color} 
| {color:red} hadoop-mapreduce-client-core in the patch failed with JDK 
v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s 
{color} | {color:green} hadoop-mapreduce-client-common in the patch passed with 
JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 44s 
{color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK 
v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 14s {color} 
| {color:red} hadoop-mapreduce-client-core in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 47s 
{color} | {color:green} 

[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15163838#comment-15163838
 ] 

Hadoop QA commented on MAPREDUCE-6622:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
37s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 41s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 48s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 26s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
36s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
8s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 35s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 35s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 48s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 3s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 0s 
{color} | {color:green} hadoop-mapreduce-client-core in the patch passed with 
JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 40s 
{color} | {color:green} hadoop-mapreduce-client-common in the patch passed with 
JDK v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 5m 50s {color} 
| {color:red} hadoop-mapreduce-client-hs in the patch failed with JDK 
v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 22s {color} 
| {color:red} hadoop-mapreduce-client-core in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 50s 
{color} | {color:green} hadoop-mapreduce-client-common 

[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-23 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160149#comment-15160149
 ] 

Karthik Kambatla commented on MAPREDUCE-6622:
-

Wouldn't the current patch log an error even if a value for the new config is 
not specified. We might want to do a conf.get() followed by a 
Interger.parseInt, and a Math.min check? 

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, 
> MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160100#comment-15160100
 ] 

Hadoop QA commented on MAPREDUCE-6622:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 40s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
46s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 29s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 42s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
3s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 26s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 39s {color} 
| {color:red} hadoop-mapreduce-project_hadoop-mapreduce-client-jdk1.8.0_72 with 
JDK v1.8.0_72 generated 1 new + 360 unchanged - 0 fixed = 361 total (was 360) 
{color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 41s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 4m 20s {color} 
| {color:red} hadoop-mapreduce-project_hadoop-mapreduce-client-jdk1.7.0_95 with 
JDK v1.7.0_95 generated 1 new + 365 unchanged - 0 fixed = 366 total (was 365) 
{color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
32s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 53s 
{color} | {color:green} hadoop-mapreduce-client-core in the patch passed with 
JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 39s 
{color} | {color:green} hadoop-mapreduce-client-common in the patch passed 

[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-23 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159483#comment-15159483
 ] 

Ray Chiang commented on MAPREDUCE-6622:
---

In order to minimize impact, I didn't really do any real rewrite of the 
internals of CachedHistoryStorage.  I just swapped in the cache access.

Here's my understanding of getFullJob():
- It's using the HistoryFileManager "fileInfo" as a proxy to determine whether 
the job exists.
- If the job exists, it then checks the cache via loadedJobCache#getIfPresent()
- If the job isn't in the cache, it loads the file into memory
- If job history file has been deleted, getFullJob returns null


> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-23 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158975#comment-15158975
 ] 

Jason Lowe commented on MAPREDUCE-6622:
---

Sorry for the delay in responding, as I was out on vacation.

bq.  Jason Lowe, Ray Chiang - did you guys have any other specific concerns 
with using guava that I am discounting?

My main concern was that we were adding a separate thread and extra overhead 
for something that didn't seem worth the trouble.  Now that the cache code has 
been significantly cleaned up, I'm more comfortable with going the guava route.

bq. Jason Lowe - do you remember if the number of jobs was being used as a 
proxy for the memory usage?

Yes, it was essentially a proxy for capping memory usage.  It's also a 
performance tunable, since caching more jobs can lead to improved performance 
depending upon client access patterns.  However I think it is OK to override 
the old value with the new value as the behavior is clearly documented and 
we're not providing any default value that would override the user's old 
settings implicitly when they upgrade.  Users would have to go out of their way 
to set this value and in doing so should encounter the documentation explaining 
the semantics when they set it.


> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-22 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158438#comment-15158438
 ] 

Karthik Kambatla commented on MAPREDUCE-6622:
-

Comments on the latest patch:
# Unused variable {{cleanupThread}} left over from previous iterations.
# The description for loadedjobs.cache.size config says it is ignored if 
loadedtasks.cache.size is set. A default value of -1 actually means it is set. 
I like the previous approach of not setting it at all. 
# Also, when reading the value for loadedtasks.cache.size, don't we want to 
enforce a minimum value for it? May be, 1? 
# I am not sure I understand how entries are loaded into the cache. The 
CacheBuilder calls getFullJob, but the latter checks if the entry is present in 
the cache? If I understand it right, the public getFullJob() should just call 
{{loadedJobCache.get(jobID)}}. The CacheLoader should call a private method 
{{loadJob(jobID)}}. The current contents of {{getFullJob}} - get fileInfo etc. 
- should move to loadJob. 

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-22 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158029#comment-15158029
 ] 

Robert Kanter commented on MAPREDUCE-6622:
--

LGTM +1

[~kasha], [~jlowe], any other comments?

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158010#comment-15158010
 ] 

Hadoop QA commented on MAPREDUCE-6622:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
47s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 16s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 1s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 13s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 0s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
34s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 26s 
{color} | {color:green} hadoop-mapreduce-client-core in the patch passed with 
JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s 
{color} | {color:green} hadoop-mapreduce-client-common in the patch passed with 
JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 29s 
{color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK 
v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 24s 
{color} | {color:green} hadoop-mapreduce-client-core in the patch passed with 
JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 

[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-22 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157804#comment-15157804
 ] 

Robert Kanter commented on MAPREDUCE-6622:
--

Looks good overall, some minor things:
- The release note recommendations for 
{{mapreduce.jobhistory.loadedtasks.cache.size}} are good, but I think we should 
also put them into mapred-default.xml.  
- Undo the {{blah.blah.*}} imports
- Add a comment explaining why you set the concurrency levels to 1.  I can 
easily see someone coming along later and saying, "hey, more concurrency sounds 
good" without realizing the funny side effects that you showed me offline
- Add a comment explaining why you do the -1 in 
{{Math.min(loadedTasksCacheSize-1, value.getTotalMaps() + 
value.getTotalReduces());}} for the weights
- Use {{conf.setInt(..., #)}} instead of {{conf.set(..., "#")}} in the tests.

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-22 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157602#comment-15157602
 ] 

Ray Chiang commented on MAPREDUCE-6622:
---

RE: ASF license warnings

ASF license issues not related to patch.


> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, 
> MAPREDUCE-6622.003.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157570#comment-15157570
 ] 

Hadoop QA commented on MAPREDUCE-6622:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
5s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 33s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 44s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
5s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 36s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 50s 
{color} | {color:green} hadoop-mapreduce-client-core in the patch passed with 
JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 37s 
{color} | {color:green} hadoop-mapreduce-client-common in the patch passed with 
JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 28s 
{color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK 
v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 10s 
{color} | {color:green} hadoop-mapreduce-client-core in the patch passed with 
JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 45s 
{color} | {color:green} 

[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-19 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154995#comment-15154995
 ] 

Ray Chiang commented on MAPREDUCE-6622:
---

For LinkedHashMap implementations, I have tried the following:

A. I first did an attempt at updating LinkedHashMap by just modifying 
removeEldestEntry().  That didn't work.

B. I just did a second attempt this morning at extending LinkedHashMap and 
adding the ability to track the total task count.  It tracks the task count 
correctly, but removeEldestEntry() only gets triggered once per put.  I don't 
see a good way to improve on that.


> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-17 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151528#comment-15151528
 ] 

Karthik Kambatla commented on MAPREDUCE-6622:
-

Sorry for the delay in getting to this. Took a quick look at the patch, some 
high-level comments:
# Since this is the JobHistoryServer and no client code runs in it, I am 
comfortable with us using guava cache. I am aware of Ray's attempts at doing 
the same using the previous LinkedHashMap and it was fairly involved. I ll let 
him comment on the details here. 
# Since the cache automatically evicts jobs when it loads a new job, I don't 
see the need for the cleanup thread. Having an additional cleanup thread kind 
of goes against the reason we are using guava cache in the first place, 
simplicity. We could choose to use time-based eviction if we want, there is 
also a "ticker" we could use to test it. 
# Is okay to not honor {{loadedjobs.cache.size}} when 
{{loadedtasks.cache.size}} is specified? [~jlowe] - do you remember if the 
number of jobs was being used as a proxy for the memory usage? 
# What happens when a job with tasks more than the allowed value needs to be 
loaded? Can we add a test to verify that case? 

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-02-01 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126860#comment-15126860
 ] 

Ray Chiang commented on MAPREDUCE-6622:
---

While I do have some doubts about general Guava library compatibility, I don't 
have a lot with their Cache library.  While I was trying to figure out how to 
implement this, I wrote up some separate code using the latest (v19) and the 
same API calls worked fine in the upstream tree (v11).

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-01-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15123276#comment-15123276
 ] 

Hadoop QA commented on MAPREDUCE-6622:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 45s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
50s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 27s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 44s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
36s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
3s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 35s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 35s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 43s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 43s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 54s 
{color} | {color:green} hadoop-mapreduce-client-core in the patch passed with 
JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 38s 
{color} | {color:green} hadoop-mapreduce-client-common in the patch passed with 
JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 49m 6s {color} 
| {color:red} hadoop-mapreduce-client-hs in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 30s 
{color} | {color:green} hadoop-mapreduce-client-core in the patch passed with 
JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 54s 
{color} | {color:green} 

[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-01-29 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15123592#comment-15123592
 ] 

Jason Lowe commented on MAPREDUCE-6622:
---

Thanks for the patch, Ray!

The documentation for the existing property for limiting based on job count 
needs to be updated to mention it is ignored if the new property is set.

Seems like it would be straightforward to pull out old jobs until we've freed 
up enough tasks to pay for the job trying to be added to the cache.  Now we 
have a background thread with a new property to configure how often it cleans 
-- does the cache blow way out of proportion if we don't clean up fast enough 
(e.g.: history server gets programmatically hammered for many jobs)? Adding yet 
another guava dependency isn't appealing unless really necessary.  If we are 
sticking with the guava cache, is it essential to call cleanUp in a background 
thread or won't this cleanup automatically happen as new jobs are loaded into 
the cache?



> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-01-29 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15123892#comment-15123892
 ] 

Ray Chiang commented on MAPREDUCE-6622:
---

bq. The documentation for the existing property for limiting based on job count 
needs to be updated to mention it is ignored if the new property is set.

Will do.

bq. Seems like it would be straightforward to pull out old jobs until we've 
freed up enough tasks to pay for the job trying to be added to the cache. Now 
we have a background thread with a new property to configure how often it 
cleans – does the cache blow way out of proportion if we don't clean up fast 
enough (e.g.: history server gets programmatically hammered for many jobs)? 
Adding yet another guava dependency isn't appealing unless really necessary. If 
we are sticking with the guava cache, is it essential to call cleanUp in a 
background thread or won't this cleanup automatically happen as new jobs are 
loaded into the cache?

I ended up having to call cleanUp() in order to get the unit tests to pass, but 
those admittedly run in a very short amount of time.  There's definitely some 
lack of determinism in that size() returns an approximate size (according to 
the documentation).  I'd say my biggest concern is that GC without any explicit 
cache churn (i.e. users clicking on job links) won't force the cache to 
explicitly cleanup for when you have several really large jobs, but the 
background thread would.

I could change the cache setting to mean:
- -1 Never call cleanUp() explicitly
- 0 Always call cleanUp() explicitly after each write
- >0 Run in cleanUp() in a periodic thread

And I didn't like the move to Guava either, but on the bright side, it looks 
like Java 8 ConcurrentHashMap+lambdas can mimic a basic cache.  Some of the 
less sophisticated usages can move to that when JDK8 becomes the baseline.


> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-01-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124781#comment-15124781
 ] 

Hadoop QA commented on MAPREDUCE-6622:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 42s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 21s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
2s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 23s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 48s 
{color} | {color:green} hadoop-mapreduce-client-core in the patch passed with 
JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 36s 
{color} | {color:green} hadoop-mapreduce-client-common in the patch passed with 
JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 30s 
{color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 7s 
{color} | {color:green} hadoop-mapreduce-client-core in the patch passed with 
JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 45s 
{color} | {color:green} 

[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-01-29 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124791#comment-15124791
 ] 

Ray Chiang commented on MAPREDUCE-6622:
---

RE: ASF license warning

Same set of files as earlier.  None of the files come from the patch.

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-01-29 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15123973#comment-15123973
 ] 

Jason Lowe commented on MAPREDUCE-6622:
---

My main point was that I don't think it would be hard to avoid both the 
undesirable extra Guava dependency and not-quite-deterministic behavior of the 
cache.  Seems like it would be pretty straightforward to maintain the cache 
ourselves with the LinkedHashMap to help track what hasn't been used recently.  
Did you look into that approach and abandon it?

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-01-29 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124003#comment-15124003
 ] 

Ray Chiang commented on MAPREDUCE-6622:
---

bq. We should change if (loadedTasksCacheSize==-1) { to if 
(loadedTasksCacheSize<=-1) {. Otherwise the user will get an 
IllegalArgumentException when it tries to do the CacheBuilder stuff. This way, 
it will revert to the old behavior, which is nicer.

Will do.

bq. Can we call lruJobTracker something else? JobTracker is used enough in 
Hadoop 1 

Good point.  I used a clearly different variable name while trying out 
different implementations.  I could switch back to the old name of 
loadedJobCache.

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-01-29 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15123999#comment-15123999
 ] 

Ray Chiang commented on MAPREDUCE-6622:
---

Oh, I see.

I did try the simplest approach of overriding 
LinkedHashMap#removeEldestEntry(), but an alternate measure like Cache weight 
(or in this specific case Task Count) didn't update.  The method seems to allow 
the CachedHistoryStorage member variables to be visible within the method, but 
not modifiable.

I admittedly did not try something more sophisticated with LinkedHashMap before 
jumping to the Guava cache implementation.

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-01-29 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15123989#comment-15123989
 ] 

Robert Kanter commented on MAPREDUCE-6622:
--

If we add the -1, 0, and >0 for the cleanUp(), make sure to explain the 
implications of that in the description.

A few other small things:
# We should change {{if (loadedTasksCacheSize==-1) \{}} to {{if 
(loadedTasksCacheSize<=-1) \{}}.  Otherwise the user will get an 
{{IllegalArgumentException}} when it tries to do the {{CacheBuilder}} stuff.  
This way, it will revert to the old behavior, which is nicer.
# Can we call {{lruJobTracker}} something else?  {{JobTracker}} is used enough 
in Hadoop 1 :)


> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-01-29 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124151#comment-15124151
 ] 

Karthik Kambatla commented on MAPREDUCE-6622:
-

If we do decide to continue with using the guava cache, I would like to review 
the patch before it gets committed. 

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-01-29 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124148#comment-15124148
 ] 

Karthik Kambatla commented on MAPREDUCE-6622:
-

I have used the guava cache before (can't remember where), and have found it to 
be very useful. Except for the compatibility concerns in guava, I don't mind us 
using it on the server side at all. The client is a different story, but I 
don't think we need to worry about that here. [~jlowe], [~rchiang] - did you 
guys have any other specific concerns with using guava that I am discounting? 

bq. I ended up having to call cleanUp() in order to get the unit tests to pass, 
but those admittedly run in a very short amount of time.
I don't see the need to do cleanups to ensure the unit tests pass. 

On how often to clean up, the cache considers eviction on every load. If this 
is going to be a cache with frequent loads, we don't need to have another 
thread doing the cleanup. [~rchiang] - in your test, did you consider 
continuously loading new jobs? If no new jobs are loaded for a while, the 
likelihood of the cache cleaning up is low. Even if it does, it will be on 
read. But, do we need to evict jobs if we are not loading any new ones? 

> Add capability to set JHS job cache to a task-based limit
> -
>
> Key: MAPREDUCE-6622
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.7.2
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: MAPREDUCE-6622.001.patch
>
>
> When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs 
> can be of varying size.  This is generally not a problem when the jobs sizes 
> are uniform or small, but when the job sizes can be very large (say greater 
> than 250k tasks), then the JHS heap size can grow tremendously.
> In cases, where multiple jobs are very large, then the JHS can lock up and 
> spend all its time in GC.  However, since the cache is holding on to all the 
> jobs, not much heap space can be freed up.
> By setting a property that sets a cap on the number of tasks allowed in the 
> cache and since the total number of tasks loaded is directly proportional to 
> the amount of heap used, this should help prevent the JHS from locking up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)