[ https://issues.apache.org/jira/browse/MAPREDUCE-4229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478314#comment-13478314 ]
Robert Joseph Evans commented on MAPREDUCE-4229:
------------------------------------------------
Sorry it has taken me so long to respond. I have been a bit swamped lately.
The new patch looks really good. It is simple and looks like it could help a
lot with the memory usage. Do you have any actual heap comparisons that you
can show us? I just have a difficult time checking in a "performance" fix
without some test, manual or otherwise, to show the impact it is having and
whether there is still more that could be done in a follow-up JIRA. I know
that the YourKit profiler has some nice heap dump analysis to look for
duplicate strings. If you have some numbers ready, that would be great;
otherwise I will try to find some time this week to see if I can come up
with anything.
> Intern counter names in the JT
> ------------------------------
>
> Key: MAPREDUCE-4229
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4229
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: jobtracker
> Affects Versions: 1.0.2, 3.0.0, 2.0.2-alpha
> Reporter: Todd Lipcon
> Attachments: MAPREDUCE-4229-branch-0.23.patch, MAPREDUCE-4229.patch
>
>
> In our experience, most of the memory in production JTs goes to storing
> counter names (String objects and character arrays). Since most counter names
> are reused again and again, it would be a big memory savings to keep a hash
> set of already-used counter names within a job, and refer to the same object
> from all tasks.
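
For illustration only, the idea in the description above can be sketched as a
small per-job pool that hands back one canonical String per distinct counter
name. This is not the code from the attached patches: the class name
CounterNamePool and its methods are hypothetical, and the sketch assumes
Java 8 or later (for Map.putIfAbsent), which is newer than what the affected
branches targeted. A map is used rather than a plain hash set because a
HashSet cannot return the instance it already holds.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical per-job pool of counter names (illustrative sketch only).
 * Each distinct name is stored once; callers keep the returned reference
 * instead of their own copy, so duplicate copies can be garbage collected.
 */
final class CounterNamePool {
    // Maps each distinct name to its single canonical instance.
    private final Map<String, String> pool = new HashMap<>();

    /**
     * Returns the canonical instance equal to the given name,
     * registering the given instance on first use.
     */
    synchronized String intern(String name) {
        if (name == null) {
            return null;
        }
        // putIfAbsent returns the previously stored value, or null if
        // this name had not been seen before.
        String canonical = pool.putIfAbsent(name, name);
        return canonical != null ? canonical : name;
    }

    /** Number of distinct names currently held by the pool. */
    synchronized int size() {
        return pool.size();
    }
}
```

With such a pool, every task of a job that reports a counter under the same
name would end up referring to the same String object, so the cost of the name
is paid once per job rather than once per task. Scoping the pool to a job,
as the description suggests, also means the names can be released when the
job is retired.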