[ https://issues.apache.org/jira/browse/HIVE-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880136#comment-13880136 ]
Sushanth Sowmyan commented on HIVE-6268:
----------------------------------------

To fix this, we can do two things:

a) Provide a jobconf parameter that allows users to disable use of the cache altogether. This is useful for massively multithreaded cases.
b) In the cases that do use a cache, spawn a separate maintenance thread that periodically prunes and expires entries.

Attaching a patch which does both of the above.

> Network resource leak with HiveClientCache when using HCatInputFormat
> ---------------------------------------------------------------------
>
>                 Key: HIVE-6268
>                 URL: https://issues.apache.org/jira/browse/HIVE-6268
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 0.12.0
>            Reporter: Sushanth Sowmyan
>            Assignee: Sushanth Sowmyan
>         Attachments: HIVE-6268.patch
>
>
> HCatInputFormat has a cache feature that allows HCat to cache hive client
> connections to the metastore, so as to not keep reinstantiating a new hive
> server every single time. This uses a guava cache of hive clients, which only
> evicts entries from cache on the next write, or by manually managing the
> cache.
> So, in a single threaded case, where we reuse the hive client, the cache
> works well, but in a massively multithreaded case, where each thread might
> perform one action, and then is never used again, there are no more writes to
> the cache, and all the clients stay alive, thus keeping ports open.
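
For illustration only, option (b) above could be sketched roughly as follows. This is not the attached patch and not the HCatalog code: the class name ExpiringClientCache, its methods, and the TTL and sweep-interval parameters are all hypothetical, and the sketch uses only the JDK rather than Guava. The point it shows is that a background thread closes idle clients even when no thread ever writes to the cache again.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a cache of metastore clients (not the HCatalog class). */
class ExpiringClientCache<K, V extends AutoCloseable> implements AutoCloseable {

    /** A cached client together with the time at which it expires. */
    private static final class Entry<V> {
        final V client;
        final long expiresAtNanos;

        Entry(V client, long expiresAtNanos) {
            this.client = client;
            this.expiresAtNanos = expiresAtNanos;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlNanos;
    private final ScheduledExecutorService janitor;

    ExpiringClientCache(long ttlMillis, long sweepMillis) {
        this.ttlNanos = TimeUnit.MILLISECONDS.toNanos(ttlMillis);
        // Daemon thread, so the maintenance task never keeps the JVM alive.
        this.janitor = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "client-cache-janitor");
            t.setDaemon(true);
            return t;
        });
        // Sweep on a timer instead of waiting for the next cache write.
        janitor.scheduleWithFixedDelay(this::prune, sweepMillis, sweepMillis,
                TimeUnit.MILLISECONDS);
    }

    /** Caches a client; a client previously stored under the same key is closed. */
    void put(K key, V client) {
        Entry<V> old = entries.put(key, new Entry<>(client, System.nanoTime() + ttlNanos));
        if (old != null) {
            closeQuietly(old.client);
        }
    }

    /** Returns the cached client, or null if there is none. */
    V get(K key) {
        Entry<V> e = entries.get(key);
        return e == null ? null : e.client;
    }

    int size() {
        return entries.size();
    }

    /** Removes and closes every expired entry; also run by the janitor thread. */
    void prune() {
        long now = System.nanoTime();
        for (Map.Entry<K, Entry<V>> me : entries.entrySet()) {
            Entry<V> e = me.getValue();
            // remove(key, value) succeeds only if the entry was not replaced meanwhile.
            if (now - e.expiresAtNanos >= 0 && entries.remove(me.getKey(), e)) {
                closeQuietly(e.client);
            }
        }
    }

    /** Stops the janitor and closes all remaining clients. */
    @Override
    public void close() {
        janitor.shutdownNow();
        for (K key : entries.keySet()) {
            Entry<V> e = entries.remove(key);
            if (e != null) {
                closeQuietly(e.client);
            }
        }
    }

    private static void closeQuietly(AutoCloseable c) {
        try {
            c.close();
        } catch (Exception ignored) {
            // Best effort: one failed close must not stop the sweep.
        }
    }
}
```

With a Guava cache, the periodic task would presumably call Cache.cleanUp() rather than a hand-written prune(), with a removal listener closing each evicted client. Note that this sketch does not guard against a caller still using a client at the moment it expires; a real implementation would need reference counting or a similar safeguard.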