[
https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Siddharth Seth updated MAPREDUCE-3901:
--------------------------------------
Attachment: MR3901.txt
Straightforward patch. Adds a couple of unit tests for
Completed{Job/Task/TaskAttempt}.
Also fixes the completedJobCache in jobHistory to be an LRU cache.
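For readers unfamiliar with the idea, an LRU cache of this kind can be sketched on top of java.util.LinkedHashMap in access order. This is only an illustration and not the code in MR3901.txt: the class name LruCache and the maxEntries parameter are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache sketch; not the class used in the attached patch. */
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries; // assumed capacity limit, chosen by the caller

    LruCache(int maxEntries) {
        // accessOrder = true: get() and put() move an entry to the "most recent" end
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true evicts
        // the least recently accessed entry.
        return size() > maxEntries;
    }
}
```

With a capacity of 2, putting job_1 and job_2, reading job_1, and then putting job_3 evicts job_2 rather than job_1. LinkedHashMap is not thread-safe, so a shared cache would need external synchronization, for example Collections.synchronizedMap.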
Numbers without the patch, loading a 70 MB, 11,700-task history file (10-node cluster):
ParseTime: ~4.5 seconds
Creating all Task objects: ~11.3 seconds (This comes down to ~4 seconds with a
patch for MAPREDUCE-2855)
Loading the full job: ~15.8 seconds.
The patch defers task and task attempt creation until they're required. Numbers with the patch:
ParseTime: Remains the same, 4.5 seconds.
Creating all Task objects: < 200 ms (Loaded in the UI execution path)
Loading the full job: < 5 seconds (for the UI and getJobReport)
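The deferral described above might look roughly like the following. This is a minimal sketch and not the patch itself: LazyCompletedJob, Task, getTasks, getTotalTasks and the buildCount counter are hypothetical names, and the parsed history file is represented by a plain map from task id to raw record.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a per-task object that is expensive to build. */
class Task {
    final String id;
    final String record;

    Task(String id, String record) {
        this.id = id;
        this.record = record;
    }
}

/** Hypothetical completed job that builds its Task objects only on first use. */
class LazyCompletedJob {
    private final Map<String, String> parsedTaskInfo; // task id -> raw parsed record
    private Map<String, Task> tasks;                  // stays null until first requested
    int buildCount = 0;                               // counts builds, for illustration only

    LazyCompletedJob(Map<String, String> parsedTaskInfo) {
        // Parsing has already happened; no Task objects are created here.
        this.parsedTaskInfo = parsedTaskInfo;
    }

    /** Summary-style calls can be answered from the parsed data alone. */
    int getTotalTasks() {
        return parsedTaskInfo.size();
    }

    /** Task objects are created on the first call and reused afterwards. */
    synchronized Map<String, Task> getTasks() {
        if (tasks == null) {
            Map<String, Task> built = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : parsedTaskInfo.entrySet()) {
                built.put(e.getKey(), new Task(e.getKey(), e.getValue()));
            }
            tasks = built;
            buildCount++;
        }
        return tasks;
    }
}
```

In this sketch a call such as getTotalTasks() pays only the parse cost, while the cost of creating Task objects is paid once, by the first caller that actually needs them.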
> lazy load JobHistory Task and TaskAttempt details
> -------------------------------------------------
>
> Key: MAPREDUCE-3901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3901
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: jobhistoryserver, mrv2
> Affects Versions: 0.23.0
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Attachments: MR3901.txt
>
>
> The job history UI and MRClientProtocol calls routed via JobHistory are very
> slow for large jobs. Some of this time is spent parsing the history file. A
> good chunk is spent pre-creating lots of objects which may never be used.
> Those can be created when required, bringing down the load times of job
> history pages and getJobReport etc. calls to approximately the history file
> parse time.