Is Drill terminating threads correctly?

Running jstack on the JVM of a dev test run that hung after about
three test timeout errors, I see that there are 409 threads.

138 of those are ShutdownHook threads, which is not unexpected (since
many tests are run in one VM), but there are also:
- 138 "WorkManager.StatusThread" threads (hmm.... 138 again)
-   7 "Client-1" threads
-   4 "UserServer-1" threads
-  21 "BitClient-1" threads
-   4 "BitClient-2" threads
-   3 "BitClient-3" threads
-   8 "BitServer-1" threads
-   8 "BitServer-2" threads
-   7 "BitServer-3" threads
-   7 "BitServer-4" threads
-   7 "BitServer-5" threads
-   6 "BitServer-6" threads
-   6 "BitServer-7" threads
-   6 "BitServer-8" threads
-   5 "BitServer-9" threads
-   5 "BitServer-10" threads
(Other thread names have only 1 or 2 occurrences.)
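
For anyone who wants to reproduce a tally like the one above without
post-processing jstack output by hand, here is a minimal sketch that
counts live threads by name from inside the JVM. The class and method
names (ThreadTally, tally, liveThreadNames) are made up for
illustration and are not part of Drill; the only JDK call relied on is
Thread.getAllStackTraces().

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Drill: tallies threads by name.
public class ThreadTally {

    /** Counts occurrences of each name; a TreeMap keeps the report sorted by name. */
    static Map<String, Integer> tally(Collection<String> names) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : names) {
            Integer old = counts.get(name);
            counts.put(name, old == null ? 1 : old + 1);
        }
        return counts;
    }

    /** Returns the names of all threads currently alive in this JVM. */
    static List<String> liveThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Print one "count  name" line per distinct thread name.
        for (Map.Entry<String, Integer> e : tally(liveThreadNames()).entrySet()) {
            System.out.printf("%4d  \"%s\"%n", e.getValue(), e.getKey());
        }
    }
}
```

Calling tally(liveThreadNames()) from a test's teardown (for example,
in a JUnit @After method) would show whether the counts for names such
as "BitServer-1" keep growing from one test to the next.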

Regarding the 4 "UserServer-1" threads:  Three test methods had
timeout failures and one more got hung, which accounts for all four.
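
The timeout traces in the output further down all block in
BasicServer.close on a Future.get() call with no timeout. As a
generic illustration only (this is not Drill's code, and a plain
java.util.concurrent ExecutorService stands in for the Netty event
loop group here), a close that waits for a bounded time and then
reports failure could be sketched like this. BoundedClose and
closeWithin are hypothetical names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not Drill's BasicServer: bounded wait on shutdown.
public class BoundedClose {

    /**
     * Requests an orderly shutdown and waits at most timeoutMs for it.
     * Returns true if the pool terminated in time; otherwise interrupts
     * the remaining tasks and returns false so the caller can log it.
     */
    static boolean closeWithin(ExecutorService pool, long timeoutMs)
            throws InterruptedException {
        pool.shutdown(); // stop accepting new tasks, let running ones finish
        if (pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
            return true;
        }
        pool.shutdownNow(); // interrupt whatever is still running
        return false;
    }
}
```

With a bound like this, a stuck shutdown would show up as a "false"
return at the close call instead of as a test-level timeout, which
might make it easier to see which component failed to stop.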


Here's the tail end of the output from the test run, including all
the timeout errors and the hang (except for repeated query-results
data lines).



dbarclay@dev-linux2 ~/work/git/incubator-drill $ time mvn install

<TRIMMED>

Running org.apache.drill.exec.physical.impl.TestDistributedFragmentRun
Running 
org.apache.drill.exec.physical.impl.TestDistributedFragmentRun#oneBitOneExchangeOneEntryRun
Running 
org.apache.drill.exec.physical.impl.TestDistributedFragmentRun#twoBitOneExchangeTwoEntryRun
Running 
org.apache.drill.exec.physical.impl.TestDistributedFragmentRun#oneBitOneExchangeTwoEntryRun
Running 
org.apache.drill.exec.physical.impl.TestDistributedFragmentRun#oneBitOneExchangeTwoEntryRunLogical
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 48.117 sec - in 
org.apache.drill.exec.physical.impl.TestDistributedFragmentRun
Running org.apache.drill.exec.physical.impl.TestBroadcastExchange
Running 
org.apache.drill.exec.physical.impl.TestBroadcastExchange#TestSingleBroadcastExchangeWithTwoScans
00:44:34.017 [globalEventExecutor-1-523] ERROR 
o.a.z.server.NIOServerCnxnFactory - Thread 
Thread[globalEventExecutor-1-523,5,main] died
java.lang.AssertionError: null
        at 
io.netty.util.concurrent.AbstractScheduledEventExecutor.pollScheduledTask(AbstractScheduledEventExecutor.java:83)
 ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
        at 
io.netty.util.concurrent.GlobalEventExecutor.fetchFromScheduledTaskQueue(GlobalEventExecutor.java:110)
 ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
        at 
io.netty.util.concurrent.GlobalEventExecutor.takeTask(GlobalEventExecutor.java:95)
 ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
        at 
io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run(GlobalEventExecutor.java:226)
 ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
        at 
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
 ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
        at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_72]
Running 
org.apache.drill.exec.physical.impl.TestBroadcastExchange#TestMultipleSendLocationBroadcastExchange
10000
Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 111.599 sec <<< 
FAILURE! - in org.apache.drill.exec.physical.impl.TestBroadcastExchange
TestSingleBroadcastExchangeWithTwoScans(org.apache.drill.exec.physical.impl.TestBroadcastExchange)
  Time elapsed: 50.063 sec  <<< ERROR!
java.lang.Exception: test timed out after 50000 milliseconds
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at 
io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:254)
        at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:32)
        at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:31)
        at org.apache.drill.exec.rpc.BasicServer.close(BasicServer.java:218)
        at com.google.common.io.Closeables.close(Closeables.java:77)
        at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
        at 
org.apache.drill.exec.rpc.data.DataConnectionCreator.close(DataConnectionCreator.java:70)
        at com.google.common.io.Closeables.close(Closeables.java:77)
        at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
        at 
org.apache.drill.exec.service.ServiceEngine.close(ServiceEngine.java:88)
        at com.google.common.io.Closeables.close(Closeables.java:77)
        at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
        at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:288)
        at 
org.apache.drill.exec.physical.impl.TestBroadcastExchange.TestSingleBroadcastExchangeWithTwoScans(TestBroadcastExchange.java:62)

TestMultipleSendLocationBroadcastExchange(org.apache.drill.exec.physical.impl.TestBroadcastExchange)
  Time elapsed: 50.014 sec  <<< ERROR!
java.lang.Exception: test timed out after 50000 milliseconds
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at 
io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:254)
        at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:32)
        at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:31)
        at org.apache.drill.exec.rpc.BasicServer.close(BasicServer.java:218)
        at org.apache.drill.exec.rpc.user.UserServer.close(UserServer.java:283)
        at com.google.common.io.Closeables.close(Closeables.java:77)
        at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
        at 
org.apache.drill.exec.service.ServiceEngine.close(ServiceEngine.java:87)
        at com.google.common.io.Closeables.close(Closeables.java:77)
        at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
        at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:288)
        at 
org.apache.drill.exec.physical.impl.TestBroadcastExchange.TestMultipleSendLocationBroadcastExchange(TestBroadcastExchange.java:88)

Running org.apache.drill.exec.physical.impl.partitionsender.TestPartitionSender
Running 
org.apache.drill.exec.physical.impl.partitionsender.TestPartitionSender#testPartitionSenderCostToThreads
Running 
org.apache.drill.exec.physical.impl.partitionsender.TestPartitionSender#testAlgorithm
ok      summary
true    planner.slice_target updated.
Total rows returned : 1.  Returned in 38ms.
Jul 10, 2015 12:47:20 AM org.apache.calcite.sql.validate.SqlValidatorException 
<init>
SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Table 
'dfs./home/dbarclay/work/git/incubator-drill/exec/java-exec/target/junit5218680774082947123/junit6147831434075535799'
 not found
Jul 10, 2015 12:47:20 AM org.apache.calcite.runtime.CalciteException <init>
SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1, column 
64 to line 1, column 66: Table 
'dfs./home/dbarclay/work/git/incubator-drill/exec/java-exec/target/junit5218680774082947123/junit6147831434075535799'
 not found
Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 63.904 sec <<< 
FAILURE! - in org.apache.drill.exec.physical.impl.partitionsender.TestPartitionSender
testPartitionSenderCostToThreads(org.apache.drill.exec.physical.impl.partitionsender.TestPartitionSender)
  Time elapsed: 50.023 sec  <<< ERROR!
java.lang.Exception: test timed out after 50000 milliseconds
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at 
io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:254)
        at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:32)
        at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:31)
        at org.apache.drill.exec.rpc.BasicServer.close(BasicServer.java:218)
        at org.apache.drill.exec.rpc.user.UserServer.close(UserServer.java:283)
        at com.google.common.io.Closeables.close(Closeables.java:77)
        at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
        at 
org.apache.drill.exec.service.ServiceEngine.close(ServiceEngine.java:87)
        at com.google.common.io.Closeables.close(Closeables.java:77)
        at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
        at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:288)
        at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:239)
        at 
org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:126)
        at 
org.apache.drill.exec.physical.impl.partitionsender.TestPartitionSender.testPartitionSenderCostToThreads(TestPartitionSender.java:154)

Running org.apache.drill.exec.physical.impl.xsort.TestSimpleExternalSort
Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0 sec - in 
org.apache.drill.exec.physical.impl.xsort.TestSimpleExternalSort
Running org.apache.drill.exec.physical.impl.svremover.TestSVRemover
Running 
org.apache.drill.exec.physical.impl.svremover.TestSVRemover#testSelectionVectorRemoval
blue    red     green
2147483647      9223372036854775807     2147483647
2147483647      9223372036854775807     2147483647

<TRIMMED>

2147483647      9223372036854775807     2147483647
Total rows returned : 50.  Returned in 181ms.
Running 
org.apache.drill.exec.physical.impl.svremover.TestSVRemover#testSVRWithNoFilter
blue    red     green
true    -9223372036854775808    -2147483648
false   9223372036854775807     null

<TRIMMED>

true    -9223372036854775808    -2147483648
false   9223372036854775807     null
Total rows returned : 100.  Returned in 54ms.
  C-c C-c
real    708m38.962s
user    11m3.332s
sys     1m8.068s
[130]dbarclay@dev-linux2 ~/work/git/incubator-drill $





Daniel
--
Daniel Barclay
MapR Technologies
