[ https://issues.apache.org/jira/browse/MAPREDUCE-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Joseph Evans resolved MAPREDUCE-4303.
--------------------------------------------
Resolution: Duplicate
> Look at using String.intern to dedupe some Strings
> --------------------------------------------------
>
> Key: MAPREDUCE-4303
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4303
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: applicationmaster
> Affects Versions: 0.23.3, 2.0.0-alpha
> Reporter: Robert Joseph Evans
>
> MAPREDUCE-4301 fixes one source of duplicate strings, but there are
> other places where the duplicates are not as simple to remove. In these
> cases the strings come from an incoming RPC call or from parsing a file
> as it is read in. The only real ways to dedupe them are either to use
> String.intern(), which could fill up the permgen space if it is not used
> carefully, or to maintain our own cache that does the same sort of thing
> as String.intern() but in the heap.
> The following are some of the places where I saw lots of duplicate
> strings and that we should look at doing something about:
> TaskAttemptStatusUpdateEvent$TaskAttemptState.stateString
> MapTaskAttemptImpl.diagnostics
> The keys to Counters.groups
> GenericGroup.displayName
> The keys to GenericGroup.counters
> and GenericCounter.displayName
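
The second option in the description, a cache that does the same sort of
thing as String.intern() but in the heap, could be sketched roughly as
follows. The class name HeapStringInterner and its methods are hypothetical
and are not taken from the Hadoop code base. This version holds strong
references, so it is only an illustration for a bounded set of values such
as state strings or counter names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper: a heap-based alternative to String.intern().
// Assumption: the set of distinct strings is small and bounded, because
// entries are strongly referenced and never evicted.
final class HeapStringInterner {
    private final ConcurrentMap<String, String> cache =
        new ConcurrentHashMap<String, String>();

    // Returns the canonical instance for s, registering s if it is new.
    String intern(String s) {
        if (s == null) {
            return null;
        }
        String existing = cache.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    // Number of distinct strings currently held.
    int size() {
        return cache.size();
    }
}
```

A caller would pass each string that arrives over RPC or from a parsed file
through intern() before storing it, so equal values share one instance. For
an unbounded set of values, a variant built on weak references would be
needed to let unused entries be garbage collected.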