Birger Brunswiek created HIVE-16949:
---------------------------------------
Summary: Leak of threads from Get-Input-Paths thread pool when
more than 1 used in query
Key: HIVE-16949
URL: https://issues.apache.org/jira/browse/HIVE-16949
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Birger Brunswiek
Commit 7f1c29ebe, which was part of HIVE-15881, introduced a thread pool that
is not shut down once its tasks have completed. This leads to a thread leak:
the idle threads are not removed by the GC. When queries spanning multiple
partitions are run, the number of threads increases and is never reduced. On my
machine hiveserver2 gets slower and slower once 10k threads are
reached.
Thread pools should be [shut down
automatically|https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html].
I am not sure why this is not the case. I would add a _pool.shutdown()_ just
[after the pool has completed its
work|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3137]
to make sure the threads are really shut down.
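A minimal sketch of the proposed fix, not the actual Hive code. The class {{PoolShutdownSketch}}, the method {{listPaths}} and the string it returns per input are hypothetical stand-ins for the input-path listing. The sketch assumes a fixed-size pool created per call. If that assumption holds, it may also explain the missing automatic shutdown: the linked Javadoc only promises it for a pool that is no longer referenced and has no remaining threads, and the core threads of a fixed-size pool do not time out by default.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolShutdownSketch {

    // Hypothetical stand-in for the per-query input-path listing.
    static List<String> listPaths(List<String> inputs, int numThreads) throws Exception {
        // Assumption: a new fixed-size pool is created for every call.
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String input : inputs) {
                // Placeholder work; the real code would resolve a path here.
                futures.add(pool.submit(() -> "resolved:" + input));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until the task is done
            }
            return results;
        } finally {
            // Without this call the idle worker threads stay alive
            // after the method returns, one set per invocation.
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(listPaths(Arrays.asList("p=1", "p=2", "p=3"), 2));
    }
}
```

Putting {{shutdown()}} in a {{finally}} block means the threads are also released when one of the tasks throws.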
My current workaround is to set {{hive.exec.input.listing.max.threads = 1}}.
This prevents the thread pool from being spawned
[\[1\]|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L2118]
[\[2\]|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3107].
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)