[
https://issues.apache.org/jira/browse/MAPREDUCE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617678#comment-14617678
]
Ray Chiang commented on MAPREDUCE-6394:
---------------------------------------
From everything I can tell, generating the HTML/JavaScript table in the for
loop of HsTasksBlock#render() is slow for very large numbers of tasks (e.g.
it takes 102 of 109 seconds for my .jhist file with 751k tasks). The bulk of
that time is spent in the method with the signature:
TypeConverter#toYarn(org.apache.hadoop.mapreduce.Counters counters)
The CounterGroup/Counters code is pretty simple, but from what I can figure
out through crude profiling, the outer loop gets called some 4.5 million
times and the inner loop nets out at around 37 million iterations. However
fast each iteration may be, the sheer number of counters adds up over so
many tasks.
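As a rough sanity check on those figures, the sketch below divides the loop
counts by the task count. It assumes the 4.5 million and 37 million totals
come from the same 751k-task run and that the conversion runs about once per
task (it may actually run per task attempt); CounterLoopEstimate is a
hypothetical helper, not Hadoop code. Under those assumptions it works out to
roughly 6 counter groups and 49 counters per task.

```java
/** Hypothetical back-of-envelope helper; not part of Hadoop. */
final class CounterLoopEstimate {
    /** Average number of loop iterations per task. */
    static double perTask(long totalIterations, long tasks) {
        return (double) totalIterations / tasks;
    }

    public static void main(String[] args) {
        long tasks = 751_000L;      // tasks in the .jhist file
        long outer = 4_500_000L;    // approximate outer-loop (group) iterations
        long inner = 37_000_000L;   // approximate inner-loop (counter) iterations

        System.out.printf("groups per task:   %.1f%n", perTask(outer, tasks));
        System.out.printf("counters per task: %.1f%n", perTask(inner, tasks));
    }
}
```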
The only solution I can see offhand is to defer this conversion until the
user clicks on the Counters link in the Tasks page. This would mean
replacing calls like:
report.setCounters(TypeConverter.toYarn(getCounters()));
in CompletedTaskAttempt (and any similar places) and adjusting the JHS
method that accesses the counters so that it does the parsing when the link
is clicked.
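The deferral idea could look something like the sketch below. This is only
an illustration with stand-in types: LazyCountersReport, LazyCountersDemo and
expensiveConvert are hypothetical names, a plain Map stands in for the
counters, and none of it is the actual Hadoop report API. The point is that
building the report stores a supplier instead of the converted counters, so
the costly conversion runs only when something actually asks for them.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for a task attempt report; not the Hadoop class. */
final class LazyCountersReport {
    private final Supplier<Map<String, Long>> converter;
    private Map<String, Long> converted; // stays null until first requested

    LazyCountersReport(Supplier<Map<String, Long>> converter) {
        this.converter = converter;
    }

    /** Runs the conversion on the first call only, then reuses the result. */
    synchronized Map<String, Long> getCounters() {
        if (converted == null) {
            converted = converter.get();
        }
        return converted;
    }
}

final class LazyCountersDemo {
    static int conversions = 0; // counts how often the costly step runs

    /** Stand-in for the expensive TypeConverter.toYarn(...) call. */
    static Map<String, Long> expensiveConvert(Map<String, Long> raw) {
        conversions++;
        return new HashMap<>(raw);
    }

    public static void main(String[] args) {
        Map<String, Long> raw = new HashMap<>();
        raw.put("MAP_INPUT_RECORDS", 42L);

        // Building the report (as the render loop would) converts nothing.
        LazyCountersReport report =
            new LazyCountersReport(() -> expensiveConvert(raw));
        System.out.println("conversions after build: " + conversions);

        // The "Counters link click": conversion happens here, once.
        report.getCounters();
        report.getCounters();
        System.out.println("conversions after reads: " + conversions);
    }
}
```

With this shape, rendering the Tasks table for many tasks would pay nothing
for counters, and only the tasks whose Counters page is opened pay the
conversion cost.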
> Speed up Task processing loop in HsTasksBlock#render()
> ------------------------------------------------------
>
> Key: MAPREDUCE-6394
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6394
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: jobhistoryserver
> Affects Versions: 2.7.0
> Reporter: Ray Chiang
> Assignee: Ray Chiang
> Labels: supportability
> Attachments: MAPREDUCE-6394.001.patch, MAPREDUCE-6394.002.patch,
> MAPREDUCE-6394.003.patch
>
>
> In HsTasksBlock#render(), there is a loop to create a Javascript table which
> slows down immensely on jobs with a large number of tasks (200k or more).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)