[
https://issues.apache.org/jira/browse/HADOOP-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiao Chen reassigned HADOOP-14523:
----------------------------------
Assignee: Misha Dmitriev
> OpensslAesCtrCryptoCodec.finalize() holds excessive amounts of memory
> ---------------------------------------------------------------------
>
> Key: HADOOP-14523
> URL: https://issues.apache.org/jira/browse/HADOOP-14523
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Misha Dmitriev
> Assignee: Misha Dmitriev
>
> I recently analyzed JVM heap dumps from Hive running a big workload. Two
> excerpts from the analysis done with jxray (www.jxray.com) are given below.
> It turns out that nearly half of live memory is taken by objects awaiting
> finalization, and the biggest offender among them is class
> OpensslAesCtrCryptoCodec:
> {code}
> 401,189K (39.7%) (1 of sun.misc.Cleaner)
> <-- Java Static: sun.misc.Cleaner.first
> 400,572K (39.6%) (14001 of
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec,
> org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager, java.util.jar.JarFile etc.)
> <-- j.l.r.Finalizer.referent <-- j.l.r.Finalizer.{next} <--
> sun.misc.Cleaner.next <-- sun.misc.Cleaner.{next} <-- Java Static:
> sun.misc.Cleaner.first
> 270,673K (26.8%) (2138 of org.apache.hadoop.mapred.JobConf)
> <-- org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec.conf <--
> j.l.r.Finalizer.referent <-- j.l.r.Finalizer.{next} <-- sun.misc.Cleaner.next
> <-- sun.misc.Cleaner.{next} <-- Java Static: sun.misc.Cleaner.first
> ---------------------
> 102,232K (10.1%) (1 of j.l.r.Finalizer)
> <-- Java Static: java.lang.ref.Finalizer.unfinalized
> 101,676K (10.1%) (8613 of
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec,
> java.util.zip.ZipFile$ZipFileInflaterInputStream,
> org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager etc.)
> <-- j.l.r.Finalizer.referent <-- j.l.r.Finalizer.{next} <-- Java Static:
> java.lang.ref.Finalizer.unfinalized
> {code}
> This heap dump was taken using 'jmap -dump:live', which forces the JVM to run
> a full GC before dumping the heap. So we are already looking at the heap right
> after GC, and yet all these unfinalized objects are there. I think this
> happens because the JVM always runs only one finalization thread, and thus
> the queue of objects that need finalization may get processed too slowly. My
> understanding is that finalization works as follows:
> 1. When GC runs, it discovers that object x, whose class overrides
> finalize(), is unreachable.
> 2. x is added to the finalization queue. So technically x is still reachable,
> it occupies memory, and _all the objects that it references stay in memory as
> well_.
> 3. The finalization thread processes objects from the finalization queue
> serially, thus x may stay in memory for a long time.
> 4. x.finalize() is invoked, then x is made unreachable. If x stayed in memory
> for a long time, it's now in the Old Gen of the heap, so only a full GC can
> clean it up.
> 5. When a full GC finally occurs, x gets cleaned up.
> So finalization is formally reliable, but in practice it's quite possible
> that a lot of unreachable, but unfinalized objects flood the memory. I guess
> we are seeing all these OpensslAesCtrCryptoCodec objects when they are in
> phase 3 above. And the really bad thing is that these objects in turn keep in
> memory a whole lot of other stuff, in particular JobConf objects. Such a
> JobConf has nothing to do with finalization, yet the GC cannot release it
> until the corresponding OpensslAesCtrCryptoCodec is gone.
> Here is the OpensslAesCtrCryptoCodec.finalize() method with my comments:
> {code}
> protected void finalize() throws Throwable {
>   try {
>     Closeable r = (Closeable) this.random;
>     r.close(); // Relevant only when (random instanceof OsSecureRandom == true)
>   } catch (ClassCastException e) {
>   }
>   super.finalize(); // Not needed, no finalize() in superclasses
> }
> {code}
> So finalize() in this class, which may keep a whole tree of objects in
> memory, is relevant only when this codec is configured to use OsSecureRandom
> class. The latter reads random bytes from the configured file, and needs
> finalization to close the input stream associated with that file.
> The suggested fix is to remove finalize() from OpensslAesCtrCryptoCodec and
> add it to the only class from this "family" that really needs it,
> OsSecureRandom. That will ensure that only OsSecureRandom objects (if/when
> they are used) stay in memory awaiting finalization, and no other, irrelevant
> objects.
> Note that this solution means that streams are still closed lazily. This, in
> principle, may cause its own problems. So the most reliable fix would be to
> call OsSecureRandom.close() explicitly when it's not needed anymore. But the
> above fix is a necessary first step anyway: it will remove the most acute
> problem with memory and will not make anything else worse than it
> currently is.
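
The two steps described above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the actual Hadoop patch:
StreamBackedRandom and Codec are hypothetical stand-ins for OsSecureRandom
and OpensslAesCtrCryptoCodec, and randomBytes() and isClosed() are made-up
helper names. The finalizer lives only on the class that owns the stream, so
a codec (and whatever configuration it references) never waits in the
finalization queue, and callers can still close the stream explicitly.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class CodecCloseSketch {

  /** Hypothetical stand-in for OsSecureRandom: the only class that owns a stream. */
  static final class StreamBackedRandom implements Closeable {
    private final InputStream in;
    private boolean closed;

    StreamBackedRandom(InputStream in) {
      this.in = in;
    }

    /** Fills the array with bytes read from the underlying stream. */
    void nextBytes(byte[] out) {
      try {
        int off = 0;
        while (off < out.length) {
          int n = in.read(out, off, out.length - off);
          if (n < 0) {
            throw new EOFException("random source exhausted");
          }
          off += n;
        }
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }

    boolean isClosed() {
      return closed;
    }

    /** Idempotent explicit close; this is the preferred cleanup path. */
    @Override
    public void close() {
      if (closed) {
        return;
      }
      closed = true;
      try {
        in.close();
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }

    /** Lazy fallback: only this small object waits for finalization. */
    @SuppressWarnings({"deprecation", "removal"})
    @Override
    protected void finalize() {
      close();
    }
  }

  /** Hypothetical stand-in for the codec: references a large conf, has no finalize(). */
  static final class Codec implements AutoCloseable {
    private final Object conf;
    private final StreamBackedRandom random;

    Codec(Object conf, StreamBackedRandom random) {
      this.conf = conf;
      this.random = random;
    }

    byte[] randomBytes(int len) {
      byte[] bytes = new byte[len];
      random.nextBytes(bytes);
      return bytes;
    }

    /** Explicit close delegates to the stream owner. */
    @Override
    public void close() {
      random.close();
    }
  }

  public static void main(String[] args) {
    StreamBackedRandom random =
        new StreamBackedRandom(new ByteArrayInputStream(new byte[64]));
    // try-with-resources closes the stream eagerly instead of relying on GC.
    try (Codec codec = new Codec("stand-in for a large JobConf", random)) {
      byte[] iv = codec.randomBytes(16);
      System.out.println("read " + iv.length + " bytes");
    }
    System.out.println("closed = " + random.isClosed());
  }
}
```

With this arrangement an unreachable Codec can be collected right away, since
nothing in it overrides finalize(); only the stream owner may linger, and only
when nobody called close() on it first.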