Memory leak in TSaslServerTransport
-----------------------------------
Key: THRIFT-1468
URL: https://issues.apache.org/jira/browse/THRIFT-1468
Project: Thrift
Issue Type: Bug
Components: Java - Library
Affects Versions: 0.5, 0.9
Reporter: Mithun Radhakrishnan
I'm working on the HCatalog project. HCatalog uses a (slightly dated) version
of Hive that in turn depends on libthrift-0.5.0. The HCatalog-server is a
continuously running process that serves (meta)data over thrift. (The bug I
describe is related to HCATALOG-183.)
We observed that on running the HCatalog-server with continuous
client-requests, the memory footprint of the server grows steadily, until we
see an OutOfMemoryError exception. I took a memory snapshot of the running
process, to check for leaks. I noticed that the majority of the memory (over
1.3GB) was being consumed by the
org.apache.thrift.transport.TSaslServerTransport$Factory::transportMap. There
were over 52000 instances of WeakHashMap$Entry, consuming 3MB of shallow-heap,
and 1.3GB of retained heap.
I suspect that entries in the WeakHashMap (transportMap) are not being
collected during GC, as the code expects them to be. That would only be so if
there are outstanding hard-references to the key in the map (TTransport).
From the code in TSaslTransport and TSaslServerTransport, it appears that
there is an inadvertent cyclic reference that keeps the map entries alive:
1. TSaslTransport has a (hard) back-reference to the "underlyingTransport",
i.e. TTransport.
2. TSaslServerTransport::Factory::transportMap is a WeakHashMap< TTransport,
TSaslServerTransport >. Here, the "underlyingTransport" is mapped back to the
decorating TSaslServerTransport.
From #2, an entry can only be GCed if there's no outstanding hard-reference to
the TTransport. But from #1, the hard-reference comes from the value-part of
the hashmap entry. A WeakHashMap holds its values through ordinary
hard-references, so the key stays reachable through its own value and the
entry is never cleared.
(I'll be attaching a sample program to better illustrate the WeakHashMap
behaviour, in case I've botched the explanation above.)
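For readers without the attachment, the following is a minimal sketch of the
behaviour described above. It is not the attached program and it does not use
the Thrift classes: Underlying and Decorator are hypothetical stand-ins for
TTransport and TSaslServerTransport.

```java
import java.util.Map;
import java.util.WeakHashMap;

public class WeakHashMapCycleDemo {
    // Hypothetical stand-in for the underlying TTransport (the map key).
    static final class Underlying {}

    // Hypothetical stand-in for TSaslServerTransport: the map value keeps a
    // hard-reference back to its own key.
    static final class Decorator {
        final Underlying underlying;

        Decorator(Underlying underlying) {
            this.underlying = underlying;
        }
    }

    /** Adds n entries whose values refer back to their keys, then requests a GC. */
    static int entriesLeftAfterGc(int n) {
        Map<Underlying, Decorator> map = new WeakHashMap<>();
        for (int i = 0; i < n; i++) {
            Underlying key = new Underlying();
            map.put(key, new Decorator(key));
        }
        // No caller holds a key any more, but each key is still reachable via
        // map -> entry -> value -> key, so the GC may not clear any entry.
        System.gc(); // only a hint to the JVM
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("entries left: " + entriesLeftAfterGc(1000));
    }
}
```

Every entry survives here regardless of how often GC runs, because the values
keep their keys strongly reachable for as long as the map itself is alive.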
The simple solution would be to change the back-reference in #1 into a
WeakReference. I'll attach a patch here that might be suitable.
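A rough sketch of that idea follows. It is not the patch itself and does not
touch the Thrift sources: Underlying and Decorator are hypothetical stand-ins
for TTransport and TSaslServerTransport, and the retry loop around System.gc()
is only an assumption about how to give the collector a chance to run.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

public class WeakBackReferenceSketch {
    // Hypothetical stand-in for the underlying TTransport (the map key).
    static final class Underlying {}

    // Hypothetical stand-in for the decorating transport: the back-reference
    // to the key is now weak, so the value no longer pins its own key.
    static final class Decorator {
        private final WeakReference<Underlying> underlying;

        Decorator(Underlying underlying) {
            this.underlying = new WeakReference<>(underlying);
        }

        /** Returns the underlying object, or null once it has been collected. */
        Underlying underlying() {
            return underlying.get();
        }
    }

    // Filled in a separate method so no key is left on the caller's stack.
    static void fill(Map<Underlying, Decorator> map, int n) {
        for (int i = 0; i < n; i++) {
            Underlying key = new Underlying();
            map.put(key, new Decorator(key));
        }
    }

    /** Adds n entries, then requests GC a few times and reports what is left. */
    static int entriesLeftAfterGc(int n) throws InterruptedException {
        Map<Underlying, Decorator> map = new WeakHashMap<>();
        fill(map, n);
        for (int attempt = 0; attempt < 10 && !map.isEmpty(); attempt++) {
            System.gc(); // only a hint; the JVM is free to ignore it
            Thread.sleep(50);
        }
        return map.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("entries left: " + entriesLeftAfterGc(1000));
    }
}
```

With the weak back-reference the entries become eligible for collection once
nothing else holds the key, though when they are actually cleared is up to the
JVM. The trade-off is that something other than the decorator must keep the
underlying object strongly reachable while it is in use, and callers of
underlying() have to cope with a null result.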