[ https://issues.apache.org/jira/browse/MAPREDUCE-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221752#comment-13221752 ]
Vinod Kumar Vavilapalli commented on MAPREDUCE-3944:
----------------------------------------------------
One question: Do you have MAPREDUCE-3901 as part of your install? IIUC, that
patch will alleviate the problem A LOT.
Course of action, IMO:
(1) If the issue still exists after MAPREDUCE-3901, I agree that the next
thing to do is to use PartialJobs instead of completedJobs.
(2) MAPREDUCE-3966 also should alleviate the issue, once we increase the
cache-size for jobs without tasks loaded.
(3) To avoid the too-many-jobs-to-return case, the correct solution is a
simple cursor with hard limits: the user can, for example, ask for jobs from
startTimeA to startTimeB and pass a running cursor position. We can do this in
a separate ticket.
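A minimal sketch of the cursor idea in (3), not the History Server's actual
API. It assumes jobs are indexed by start time in a sorted map and that start
times are unique; JobCursorSketch, listJobs, Page, and MAX_PAGE_SIZE are all
hypothetical names. The caller passes back the returned cursor to get the next
page, and the server never returns more than the hard limit per request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class JobCursorSketch {
    /** Hypothetical hard cap on jobs returned by a single request. */
    static final int MAX_PAGE_SIZE = 100;

    /** One page of results plus the cursor for the next page (-1 when done). */
    static final class Page {
        final List<String> jobIds;
        final long nextCursor;

        Page(List<String> jobIds, long nextCursor) {
            this.jobIds = jobIds;
            this.nextCursor = nextCursor;
        }
    }

    /**
     * Returns jobs with start time in [startTimeA, startTimeB), beginning at
     * the cursor position. Assumes unique start times (a simplification).
     */
    static Page listJobs(NavigableMap<Long, String> jobsByStart,
                         long startTimeA, long startTimeB,
                         long cursor, int limit) {
        // Clamp the requested page size to the hard limit.
        int cap = Math.max(1, Math.min(limit, MAX_PAGE_SIZE));
        // Resume from the cursor, but never before the requested window.
        long from = Math.max(startTimeA, cursor);
        List<String> ids = new ArrayList<>();
        long next = -1L;
        if (from < startTimeB) {
            for (Map.Entry<Long, String> e
                    : jobsByStart.subMap(from, true, startTimeB, false).entrySet()) {
                if (ids.size() == cap) {
                    // First job that did not fit becomes the next cursor.
                    next = e.getKey();
                    break;
                }
                ids.add(e.getValue());
            }
        }
        return new Page(ids, next);
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> jobs = new TreeMap<>();
        for (int i = 1; i <= 5; i++) {
            jobs.put(1000L * i, "job_" + i);
        }
        long cursor = 0L;
        while (cursor != -1L) {
            Page p = listJobs(jobs, 0L, 10_000L, cursor, 2);
            System.out.println(p.jobIds + " next=" + p.nextCursor);
            cursor = p.nextCursor;
        }
    }
}
```

Because each request touches at most MAX_PAGE_SIZE entries, a single call cannot
force the server to walk the full job list, whatever the requested window.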
@Robert
bq. If we are using LRU to remove entries the current code will loop through
all entries.
Where is this?
> JobHistory web services are slower than the UI and can easily overload the JH
> -----------------------------------------------------------------------------
>
> Key: MAPREDUCE-3944
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3944
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.1, 0.23.2
> Reporter: Robert Joseph Evans
> Assignee: Robert Joseph Evans
> Priority: Blocker
>
> When our first customer started using the Job History web services today the
> History Server ground to a halt. We found 250 Jetty threads stuck on the
> following stack trace.
> {noformat}
> java.lang.Thread.State: BLOCKED (on object monitor)
>     at org.apache.hadoop.mapreduce.v2.hs.JobHistory.getJob(JobHistory.java:898)
>     - waiting to lock <0x00002aaab364ba60> (a org.apache.hadoop.mapreduce.v2.hs.JobHistory)
>     at org.apache.hadoop.mapreduce.v2.hs.webapp.HsWebServices.getJobs(HsWebServices.java:188)
> {noformat}
> HsWebServices.java:188 corresponds to the /mapreduce/jobs service.
> Looking at the code, there are a number of optimizations that need to be made
> to improve its performance.