Birger Brunswiek created HIVE-16949:
---------------------------------------

             Summary: Leak of threads from Get-Input-Paths thread pool when 
more than 1 used in query
                 Key: HIVE-16949
                 URL: https://issues.apache.org/jira/browse/HIVE-16949
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
            Reporter: Birger Brunswiek


The commit 7f1c29ebe, which was part of HIVE-15881, introduced a thread pool 
which is not shut down upon completion of its threads. This leads to a leak of 
threads. They are not removed by the GC. When queries spanning multiple 
partitions are made the number of threads increases and is never reduced. On my 
machine hiveserver2 starts to get slower and slower once 10k threads are 
reached.

Thread pools should be [shut down 
automatically|https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html].
 I am not sure why this is not the case. I would add a _pool.shutdown()_ just 
[after the pool has completed its 
work|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3137]
 to make sure the threads are really shut down.
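
A minimal sketch of the proposed fix follows. It is not the actual code in Utilities.java: the class {{InputPathPoolSketch}}, the method {{listPaths}} and the {{/warehouse/}} prefix are hypothetical names used only for illustration. One possible explanation for the missing automatic shutdown is that the linked Javadoc only promises it for a pool that is no longer referenced and has no remaining threads, while a fixed-size pool keeps its core threads alive by default.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the input path listing; not Hive code.
final class InputPathPoolSketch {

    // Resolves one path per partition on a short-lived pool and always
    // shuts the pool down afterwards so its worker threads can exit.
    static List<String> listPaths(List<String> partitions, int numThreads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String partition : partitions) {
                // Placeholder work; the real task would inspect the partition.
                Callable<String> task = () -> "/warehouse/" + partition;
                futures.add(pool.submit(task));
            }
            List<String> paths = new ArrayList<>();
            for (Future<String> future : futures) {
                paths.add(future.get()); // blocks until the task has finished
            }
            return paths;
        } finally {
            // Without this call the idle core threads stay alive indefinitely.
            pool.shutdown();
        }
    }
}
```

Putting {{pool.shutdown()}} in a {{finally}} block means the threads are also released when one of the tasks throws.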

My current workaround is to set {{hive.exec.input.listing.max.threads = 1}}. 
This prevents the thread pool from being spawned 
[\[1\]|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L2118]
 
[\[2\]|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3107].





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)