[
https://issues.apache.org/jira/browse/HIVE-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16400630#comment-16400630
]
Misha Dmitriev commented on HIVE-16879:
---------------------------------------
I agree about the negligible CPU performance impact of String.intern(),
especially when compared with reduced heap size and GC time. Again, I think
this is a good change, assuming that it's applied in the right place.
However, my experience is that guessing doesn't always work when you try to
determine where _exactly_ memory is wasted. Do you have access to some running
Hive instances where you would expect this to be a problem? Then, at a minimum,
you can run 'jmap -histo:live' to get the number of Key instances and roughly
estimate memory used by the strings that Keys reference. And the best thing
would be to take a heap dump (jmap -dump:live,format=b,...) and analyze it with
a tool, e.g. [www.jxray.com|http://www.jxray.com/], that reports the memory
overhead of duplicate strings. You will immediately see whether Keys cause
noticeable overhead, and/or what other classes cause it.
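For context on why duplicate strings matter here, the following is a minimal sketch of what String.intern() does to two equal strings held in separate heap objects. The class name InternDemo and the value "default_db" are made up for illustration and do not come from Hive.

```java
class InternDemo {
    // True when both arguments resolve to the same canonical object
    // once they have been interned.
    static boolean sameAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct heap objects with equal contents,
        // which is the kind of duplication a heap dump would report.
        String first = new String("default_db");
        String second = new String("default_db");

        System.out.println(first == second);                // false: two copies
        System.out.println(sameAfterIntern(first, second)); // true: one shared copy
    }
}
```

Whether this saves a meaningful amount of heap depends on how many duplicate strings the Keys actually hold, which is what the histogram or heap dump is meant to show.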
> Improve Cache Key
> -----------------
>
> Key: HIVE-16879
> URL: https://issues.apache.org/jira/browse/HIVE-16879
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Affects Versions: 3.0.0
> Reporter: BELUGA BEHR
> Assignee: BELUGA BEHR
> Priority: Trivial
> Attachments: HIVE-16879.1.patch, HIVE-16879.2.patch
>
>
> Improve cache key for cache implemented in
> {{org.apache.hadoop.hive.metastore.AggregateStatsCache}}.
> # Cache some of the key components themselves (db name, table name) using
> {{String}} intern method to conserve memory for repeated keys, to improve
> {{equals}} method as now references can be used for equality, and hashcodes
> will be cached as well, as per the {{String}} class {{hashCode}} method.
> # Upgrade _debug_ logging to not generate text unless required
> # Changed _equals_ method to check first for the item most likely to be
> different, column name
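
The interning and equals-ordering points in the description above could look roughly like the sketch below. It is an illustration only, not the code in the attached patches: the class name CacheKey, its field names and its accessor are hypothetical, and the logging change is left out.

```java
final class CacheKey {
    private final String dbName;
    private final String tblName;
    private final String colName;

    CacheKey(String dbName, String tblName, String colName) {
        // Database and table names repeat across many keys, so keep one
        // canonical copy of each. Column names are left as passed in.
        this.dbName = dbName.intern();
        this.tblName = tblName.intern();
        this.colName = colName;
    }

    // Accessor used here only to show that interned components are shared.
    String dbName() {
        return dbName;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof CacheKey)) {
            return false;
        }
        CacheKey that = (CacheKey) other;
        // Compare the component most likely to differ first. For the interned
        // components, equal values are the same object, so the comparison
        // usually ends at the reference check.
        return colName.equals(that.colName)
            && tblName.equals(that.tblName)
            && dbName.equals(that.dbName);
    }

    @Override
    public int hashCode() {
        // String caches its own hash code after the first computation.
        int result = dbName.hashCode();
        result = 31 * result + tblName.hashCode();
        result = 31 * result + colName.hashCode();
        return result;
    }
}
```

Two keys built from separately allocated but equal database names end up sharing one String instance, which is where the memory saving would come from.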
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)