Chun Chang created DRILL-1804:
---------------------------------
Summary: random failures while running large number of queries
Key: DRILL-1804
URL: https://issues.apache.org/jira/browse/DRILL-1804
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 0.7.0
Reporter: Chun Chang
#Tue Dec 02 14:38:34 EST 2014
git.commit.id.abbrev=757e9a2
While running the Mondrian regression tests (over 6000 queries), I sometimes
get one or two random failures. Here is the stack trace when it happens:
2014-12-02 17:49:32,271 [2b8193d3-f0ca-aa7c-094a-d8234d76d068:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - Error aeae057b-ed0a-43aa-902d-fe3a41531511: Query failed: Unexpected exception during fragment initialization.
org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization.
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:194) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
    at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
    at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper. Failure while accessing Zookeeper
    at org.apache.drill.exec.store.sys.zk.ZkAbstractStore.put(ZkAbstractStore.java:111) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
    at org.apache.drill.exec.work.foreman.QueryStatus.updateQueryStateInStore(QueryStatus.java:132) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
    at org.apache.drill.exec.work.foreman.Foreman.recordNewState(Foreman.java:502) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
    at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:396) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
    at org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:311) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
    at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:510) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:185) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
    ... 4 common frames omitted
Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper
    at org.apache.drill.exec.store.sys.zk.ZkEStore.createNodeInZK(ZkEStore.java:53) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
    at org.apache.drill.exec.store.sys.zk.ZkAbstractStore.put(ZkAbstractStore.java:106) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
    ... 10 common frames omitted
Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /drill/running/2b8193d3-f0ca-aa7c-094a-d8234d76d068
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:119) ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
    at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:676) ~[curator-framework-2.5.0.jar:na]
    at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:660) ~[curator-framework-2.5.0.jar:na]
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107) ~[curator-client-2.5.0.jar:na]
    at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:656) ~[curator-framework-2.5.0.jar:na]
    at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:441) ~[curator-framework-2.5.0.jar:na]
    at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:431) ~[curator-framework-2.5.0.jar:na]
    at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:44) ~[curator-framework-2.5.0.jar:na]
    at org.apache.drill.exec.store.sys.zk.ZkEStore.createNodeInZK(ZkEStore.java:51) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
    ... 11 common frames omitted
2014-12-02 17:49:32,287 [2b8193d3-f0ca-aa7c-094a-d8234d76d068:frag:0:0] WARN o.a.d.e.p.impl.SendingAccountor - Failure while waiting for send complete.
java.lang.InterruptedException: null
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1301) ~[na:1.7.0_45]
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:472) ~[na:1.7.0_45]
    at org.apache.drill.exec.physical.impl.SendingAccountor.waitForSendComplete(SendingAccountor.java:44) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
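
One possible reading of the innermost cause in the trace (not confirmed in this report) is that the store's put path issues an unconditional create, so a second write under the same /drill/running/<query id> path fails with NodeExists. The Java sketch below only illustrates that failure mode. It uses an in-memory map as a stand-in for ZooKeeper, and NodeExistsSketch, createStrict, and putTolerant are hypothetical names, not Drill, Curator, or ZooKeeper APIs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class NodeExistsSketch {

    /** Hypothetical stand-in for ZooKeeper's "node exists" error. */
    static class NodeExistsException extends RuntimeException {
        NodeExistsException(String path) {
            super("NodeExists for " + path);
        }
    }

    /** In-memory stand-in for the znode tree, keyed by path (assumption). */
    static final ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();

    /** Strict create: throws if the path is already present. */
    static void createStrict(String path, String data) {
        if (nodes.putIfAbsent(path, data) != null) {
            throw new NodeExistsException(path);
        }
    }

    /** Tolerant put: create if absent, otherwise overwrite the stored value. */
    static void putTolerant(String path, String data) {
        try {
            createStrict(path, data);
        } catch (NodeExistsException e) {
            // The node was created by an earlier write; update it instead.
            nodes.put(path, data);
        }
    }
}
```

With the strict variant, a second write to the same path throws, which matches the shape of the reported error. The tolerant variant shows one way such a write could be made idempotent; whether that is appropriate for Drill's query state store is a separate question that this report does not answer.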