[jira] [Updated] (DRILL-5840) A query that includes sort completes, and then loses Drill connection. Drill becomes unresponsive, and cannot restart because it cannot communicate with Zookeeper

2017-10-04 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-5840:
--
Description: 
Query is:
{noformat}
ALTER SESSION SET `exec.sort.disable_managed` = false;
select count(*) from (select * from 
dfs.`/drill/testdata/resource-manager/250wide.tbl` order by columns[0])d where 
d.columns[0] = 'ljdfhwuehnoiueyf';
{noformat}

Query tries to complete, but cannot.  It takes 20 hours from the time the query 
tries to complete, to the time Drill finally loses its connection.

>From the drillbit.log:
{noformat}
2017-10-03 16:28:14,892 [262bec7f-3539-0dd7-6fea-f2959f9df3b6:frag:0:0] DEBUG 
o.a.drill.exec.work.foreman.Foreman - 262bec7f-3539-0dd7-6fea-f2959f9df3b6: 
State change requested RUNNING --> COMPLETED
2017-10-04 01:47:27,698 [UserServer-1] DEBUG 
o.a.d.e.r.u.UserServerRequestHandler - Received query to run.  Returning query 
handle.
2017-10-04 03:30:02,916 [262bec7f-3539-0dd7-6fea-f2959f9df3b6:frag:0:0] WARN  
o.a.d.exec.work.foreman.QueryManager - Failure while trying to delete the 
estore profile for this query.
org.apache.drill.common.exceptions.DrillRuntimeException: unable to delete node 
at /running/262bec7f-3539-0dd7-6fea-f2959f9df3b6
at 
org.apache.drill.exec.coord.zk.ZookeeperClient.delete(ZookeeperClient.java:343) 
~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.coord.zk.ZkEphemeralStore.remove(ZkEphemeralStore.java:108)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.QueryManager.updateEphemeralState(QueryManager.java:293)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.recordNewState(Foreman.java:1043) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:964) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.access$2600(Foreman.java:113) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:1025)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:1018)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.common.EventProcessor.processEvents(EventProcessor.java:107) 
[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:65) 
[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.addEvent(Foreman.java:1020)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.addToEventQueue(Foreman.java:1038) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.QueryManager.nodeComplete(QueryManager.java:498)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.QueryManager.access$100(QueryManager.java:66)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.QueryManager$NodeTracker.fragmentComplete(QueryManager.java:462)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.QueryManager.fragmentDone(QueryManager.java:147)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.QueryManager.access$400(QueryManager.java:66)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:525)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.rpc.control.WorkEventBus.statusUpdate(WorkEventBus.java:71)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentStatusReporter.sendStatus(FragmentStatusReporter.java:124)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentStatusReporter.stateChanged(FragmentStatusReporter.java:94)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:304)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 

[jira] [Updated] (DRILL-5840) A query that includes sort completes, and then loses Drill connection. Drill becomes unresponsive, and cannot restart because it cannot communicate with Zookeeper

2017-10-04 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-5840:
--
Description: 
Query is:
{noformat}
ALTER SESSION SET `exec.sort.disable_managed` = false;
select count(*) from (select * from 
dfs.`/drill/testdata/resource-manager/250wide.tbl` order by columns[0])d where 
d.columns[0] = 'ljdfhwuehnoiueyf';
{noformat}

Query tries to complete, but cannot.

>From the drillbit.log:
{noformat}
2017-10-03 16:28:14,892 [262bec7f-3539-0dd7-6fea-f2959f9df3b6:frag:0:0] DEBUG 
o.a.drill.exec.work.foreman.Foreman - 262bec7f-3539-0dd7-6fea-f2959f9df3b6: 
State change requested RUNNING --> COMPLETED
2017-10-04 01:47:27,698 [UserServer-1] DEBUG 
o.a.d.e.r.u.UserServerRequestHandler - Received query to run.  Returning query 
handle.
2017-10-04 03:30:02,916 [262bec7f-3539-0dd7-6fea-f2959f9df3b6:frag:0:0] WARN  
o.a.d.exec.work.foreman.QueryManager - Failure while trying to delete the 
estore profile for this query.
org.apache.drill.common.exceptions.DrillRuntimeException: unable to delete node 
at /running/262bec7f-3539-0dd7-6fea-f2959f9df3b6
at 
org.apache.drill.exec.coord.zk.ZookeeperClient.delete(ZookeeperClient.java:343) 
~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.coord.zk.ZkEphemeralStore.remove(ZkEphemeralStore.java:108)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.QueryManager.updateEphemeralState(QueryManager.java:293)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.recordNewState(Foreman.java:1043) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:964) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.access$2600(Foreman.java:113) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:1025)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:1018)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.common.EventProcessor.processEvents(EventProcessor.java:107) 
[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:65) 
[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.addEvent(Foreman.java:1020)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.addToEventQueue(Foreman.java:1038) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.QueryManager.nodeComplete(QueryManager.java:498)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.QueryManager.access$100(QueryManager.java:66)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.QueryManager$NodeTracker.fragmentComplete(QueryManager.java:462)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.QueryManager.fragmentDone(QueryManager.java:147)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.QueryManager.access$400(QueryManager.java:66)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:525)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.rpc.control.WorkEventBus.statusUpdate(WorkEventBus.java:71)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentStatusReporter.sendStatus(FragmentStatusReporter.java:124)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentStatusReporter.stateChanged(FragmentStatusReporter.java:94)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:304)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at