[ 
https://issues.apache.org/jira/browse/HADOOP-14688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139371#comment-16139371
 ] 

Misha Dmitriev commented on HADOOP-14688:
-----------------------------------------

[~daryn]: when a live heap dump is captured, as done here, a full GC is 
performed before the heap snapshot is taken. So if the application produces 
objects that are very short-lived, i.e. quickly become garbage, we will see 
only those that happen to be live at that moment, which is typically not 
many. Conversely, most objects in a live heap dump tend to be relatively 
long-lived.
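
For illustration, one way to request a live-objects-only dump from inside a 
HotSpot JVM is sketched below. This is an assumption-laden example, not 
necessarily how the dump attached to this issue was taken; the class name 
LiveHeapDump, the helper dumpLive, and the file name live.hprof are 
hypothetical. Passing live=true asks the JVM to dump only reachable objects.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

public class LiveHeapDump {
    // Hypothetical helper: writes a heap dump of reachable objects into dir.
    public static Path dumpLive(Path dir) throws IOException {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // The target file must not already exist; recent JDKs expect ".hprof".
        Path out = dir.resolve("live.hprof");
        // live = true: dump only objects that are still reachable.
        bean.dumpHeap(out.toString(), true);
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heapdump");
        System.out.println("Wrote " + dumpLive(dir));
    }
}
```

The same kind of dump can usually be taken from outside the process with 
jmap -dump:live.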

Furthermore, experience has shown that for reasonably long-lived strings, the 
CPU overhead of interning is small compared to the benefits: lower memory 
pressure, shorter GC pauses, etc. That is, the cost of a fast internal 
String.intern() call is comparable to the cost of the GC scanning and moving 
around all the extra copies of a string that stay in memory without interning.
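
A minimal sketch of the effect, assuming a workload that repeatedly creates 
equal strings (e.g. the same key name parsed many times). The class 
InternDemo, its helpers, and the sample value "key@0" are hypothetical and 
are not taken from the patch; the example only shows that String.intern() 
collapses equal strings into one shared instance.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class InternDemo {
    // Builds n equal strings, each initially a separate object,
    // optionally replacing each with its interned (canonical) instance.
    public static List<String> copies(String value, int n, boolean intern) {
        List<String> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            String s = new String(value.toCharArray()); // fresh object each time
            out.add(intern ? s.intern() : s);
        }
        return out;
    }

    // Counts distinct objects by identity (==), not by equals().
    public static int distinctInstances(List<String> strings) {
        Set<String> ids =
            Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        ids.addAll(strings);
        return ids.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstances(copies("key@0", 1000, false))); // 1000
        System.out.println(distinctInstances(copies("key@0", 1000, true)));  // 1
    }
}
```

Without interning, all 1000 copies stay on the heap for as long as they are 
referenced; with interning, the extra copies become garbage immediately and 
only one instance is retained.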

> Intern strings in KeyVersion and EncryptedKeyVersion
> ----------------------------------------------------
>
>                 Key: HADOOP-14688
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14688
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: kms
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>         Attachments: GC root of the String.png, HADOOP-14688.01.patch, 
> heapdump analysis.png, jxray.report
>
>
> This is inspired by [[email protected]]'s work on HDFS-11383.
> The key names and key version names are usually the same for a bunch of 
> {{KeyVersion}} and {{EncryptedKeyVersion}}. We should not create duplicate 
> objects for them.
> This is more important to HDFS-10899, where we try to re-encrypt all files' 
> EDEKs in a given EZ. Those EDEKs all have the same key name and mostly use 
> no more than a couple of key version names.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
