[
https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296454#comment-15296454
]
Naveen Gangam commented on HIVE-13749:
--------------------------------------
[~thejas] I have a better understanding of what is causing this issue. It
appears that FileSystem.Cache (Hadoop API) is retaining Configuration
instances in its cache.
Any time we call FileSystem.get(conf), as here
https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L1685
the conf object becomes part of the key for the map entry. The cache is meant
to improve performance so we don't have to re-create these FileSystem objects,
but Hive's use of these APIs doesn't appear to use the cache efficiently.
Other areas of the code contribute as well; for example, Path.getFileSystem()
can add to this cache under the covers.
https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java#L104
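To make the growth pattern concrete, here is a minimal toy model in plain Java. It is not Hadoop's FileSystem.Cache; the class FsCacheGrowthSketch, its Key, and the callerContext field are hypothetical names that stand in for whatever per-call object ends up in the cache key. It only shows that a fresh key component on every call yields one retained entry per call, while a reused one yields a single entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Toy model of a cache whose key includes a per-call object; NOT Hadoop's FileSystem.Cache. */
final class FsCacheGrowthSketch {

    /** Hypothetical key: a URI scheme plus a per-call context compared by identity. */
    static final class Key {
        final String scheme;
        final Object callerContext;

        Key(String scheme, Object callerContext) {
            this.scheme = scheme;
            this.callerContext = callerContext;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return scheme.equals(other.scheme) && callerContext == other.callerContext;
        }

        @Override
        public int hashCode() {
            return Objects.hash(scheme, System.identityHashCode(callerContext));
        }
    }

    private final Map<Key, Object> entries = new HashMap<>();

    /** Returns the cached handle for the key, creating one on a miss. */
    Object get(String scheme, Object callerContext) {
        return entries.computeIfAbsent(new Key(scheme, callerContext), k -> new Object());
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        FsCacheGrowthSketch perCall = new FsCacheGrowthSketch();
        for (int i = 0; i < 1000; i++) {
            perCall.get("hdfs", new Object()); // fresh context on every call: never a hit
        }

        FsCacheGrowthSketch shared = new FsCacheGrowthSketch();
        Object sharedContext = new Object();
        for (int i = 0; i < 1000; i++) {
            shared.get("hdfs", sharedContext); // same context reused: one entry
        }

        System.out.println(perCall.size() + " vs " + shared.size()); // prints "1000 vs 1"
    }
}
```

In this model nothing ever evicts the per-call entries, so everything reachable from each key stays on the heap.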
Caching can be turned off entirely by setting
fs.%s.impl.disable.cache=true, where %s is the URI scheme of the file system
(e.g. hdfs or s3). That might make this problem go away, but it has a
performance overhead (which I haven't measured).
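As a small illustration of how the per-scheme property name expands, here is a sketch that uses java.util.Properties as a stand-in for a Hadoop Configuration. The class DisableCacheProperty and its methods are hypothetical helpers, not Hive or Hadoop code; only the property pattern comes from the text above.

```java
import java.util.Properties;

/** Hypothetical helper; Properties stands in for a Hadoop Configuration. */
final class DisableCacheProperty {

    /** Builds the per-scheme property name from the pattern quoted above. */
    static String nameFor(String scheme) {
        return String.format("fs.%s.impl.disable.cache", scheme);
    }

    /** Marks caching as disabled for each given scheme. */
    static Properties disableFor(Properties props, String... schemes) {
        for (String scheme : schemes) {
            props.setProperty(nameFor(scheme), "true");
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = disableFor(new Properties(), "hdfs", "s3");
        props.forEach((name, value) -> System.out.println(name + "=" + value));
    }
}
```

In a real deployment the same name/value pairs would go into the Configuration passed to FileSystem.get(), or into the site configuration.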
Unfortunately, there is no way to selectively turn off caching on a per-call
basis, so we have to fix this in the Hive code. fs.close() would remove the
entry from the cache, but we cannot call it every time we use this API, as
that would be the same as disabling the cache entirely. It is an easy choice,
though, to add fs.close() here
https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L1685
But for the other code in Warehouse, we need more data around the cache hits
and misses. I am working on instrumenting the FileSystem code to provide this
info.
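A rough sketch of the two ideas in this paragraph, again as a toy model rather than the real FileSystem code: a cache whose handles evict themselves on close(), plus hit/miss counters of the kind the instrumentation would need. InstrumentedCacheSketch, Handle, and the counter names are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy cache with hit/miss counters and close-time eviction; NOT Hadoop's FileSystem.Cache. */
final class InstrumentedCacheSketch {

    /** Hypothetical handle that removes itself from the cache when closed. */
    final class Handle implements AutoCloseable {
        private final Object key;

        Handle(Object key) {
            this.key = key;
        }

        @Override
        public void close() {
            // Evict only if this handle is still the one cached under the key.
            entries.remove(key, this);
        }
    }

    private final Map<Object, Handle> entries = new HashMap<>();
    private long hits;
    private long misses;

    /** Returns the cached handle for the key, counting the lookup as a hit or a miss. */
    Handle get(Object key) {
        Handle cached = entries.get(key);
        if (cached != null) {
            hits++;
            return cached;
        }
        misses++;
        Handle created = new Handle(key);
        entries.put(key, created);
        return created;
    }

    long hits() { return hits; }

    long misses() { return misses; }

    int size() { return entries.size(); }

    public static void main(String[] args) {
        InstrumentedCacheSketch cache = new InstrumentedCacheSketch();

        // One-off key: close right after use so the entry does not linger.
        try (Handle oneOff = cache.get(new Object())) {
            // ... use the handle ...
        }

        // Long-lived key: left open so later calls can hit the cache.
        Object sharedKey = new Object();
        cache.get(sharedKey); // miss
        cache.get(sharedKey); // hit

        System.out.println("hits=" + cache.hits()
                + " misses=" + cache.misses()
                + " size=" + cache.size()); // prints "hits=1 misses=2 size=1"
    }
}
```

A low hit count for a given call site would suggest that closing there is cheap, while a high one would argue for leaving the entry cached.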
An alternative thought (though I am not sure how feasible it is): since the
FileSystem code does not appear to use the properties within this
Configuration object itself, it may be safe to use a static instance of
HiveConf on most calls to FileSystem, like mkdirs(), get(), etc. This way we
use the cache efficiently too. However, I am not sure whether there are
session-specific properties that get used across calls to the FileSystem APIs.
Thoughts? Thanks in advance.
> Memory leak in Hive Metastore
> -----------------------------
>
> Key: HIVE-13749
> URL: https://issues.apache.org/jira/browse/HIVE-13749
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 1.1.0
> Reporter: Naveen Gangam
> Assignee: Naveen Gangam
> Attachments: Top_Consumers7.html
>
>
> Looking at a 10GB heap dump, a large number of Configuration objects (> 66k
> instances) are being retained. These objects, along with their retained
> sets, occupy about 95% of the heap space. This leads to HMS crashes every
> few days.
> I will attach an exported snapshot from Eclipse MAT.