[ 
https://issues.apache.org/jira/browse/DRILL-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107676#comment-14107676
 ] 

Hao Zhu commented on DRILL-1327:
--------------------------------

Please check attached drillbit.log for details on where/why the SQL hung.

Here are my findings based on the logs:
1. The previous SQL was cancelled by me:
{code}
2014-08-22 15:58:42,656 [9ee42fbd-316d-4f5e-b808-e57b3b3f35a4:frag:0:0] ERROR 
o.a.d.e.p.i.ScreenCreator$ScreenRoot - Failure while sending data to user.
org.apache.drill.exec.rpc.RpcException: java.lang.InterruptedException
        at 
org.apache.drill.exec.rpc.RemoteConnection.blockOnNotWritable(RemoteConnection.java:56)
 
[drill-java-exec-0.4.0-incubating-SNAPSHOT-rebuffed.jar:0.4.0-incubating-SNAPSHOT]
        at org.apache.drill.exec.rpc.RpcBus.send(RpcBus.java:91) 
[drill-java-exec-0.4.0-incubating-SNAPSHOT-rebuffed.jar:0.4.0-incubating-SNAPSHOT]
        at 
org.apache.drill.exec.rpc.user.UserServer$UserClientConnection.sendResult(UserServer.java:133)
 
[drill-java-exec-0.4.0-incubating-SNAPSHOT-rebuffed.jar:0.4.0-incubating-SNAPSHOT]
        at 
org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:159)
 
[drill-java-exec-0.4.0-incubating-SNAPSHOT-rebuffed.jar:0.4.0-incubating-SNAPSHOT]
        at 
org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:49) 
[drill-java-exec-0.4.0-incubating-SNAPSHOT-rebuffed.jar:0.4.0-incubating-SNAPSHOT]
        at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:116)
 
[drill-java-exec-0.4.0-incubating-SNAPSHOT-rebuffed.jar:0.4.0-incubating-SNAPSHOT]
        at 
org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:250)
 
[drill-java-exec-0.4.0-incubating-SNAPSHOT-rebuffed.jar:0.4.0-incubating-SNAPSHOT]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_67]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_67]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_67]
Caused by: java.lang.InterruptedException: null
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:996)
 ~[na:1.7.0_67]
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
 ~[na:1.7.0_67]
        at 
org.apache.drill.exec.rpc.ResettableBarrier.await(ResettableBarrier.java:67) 
~[drill-java-exec-0.4.0-incubating-SNAPSHOT-rebuffed.jar:0.4.0-incubating-SNAPSHOT]
        at 
org.apache.drill.exec.rpc.RemoteConnection$WriteManager.waitForWritable(RemoteConnection.java:80)
 
~[drill-java-exec-0.4.0-incubating-SNAPSHOT-rebuffed.jar:0.4.0-incubating-SNAPSHOT]
        at 
org.apache.drill.exec.rpc.RemoteConnection.blockOnNotWritable(RemoteConnection.java:53)
 
[drill-java-exec-0.4.0-incubating-SNAPSHOT-rebuffed.jar:0.4.0-incubating-SNAPSHOT]
        ... 9 common frames omitted
{code}

2. New query come in:
{code}
2014-08-22 15:58:49,084 [UserServer-1] DEBUG o.a.drill.exec.rpc.user.UserServer 
- Received query to run.  Returning query handle.
2014-08-22 15:58:49,088 [WorkManager Event Thread] DEBUG 
o.apache.drill.exec.work.WorkManager - Starting pending task 
org.apache.drill.exec.work.WorkManager$RunnableWrapper@61e0411a
2014-08-22 15:58:49,092 [689f7a9a-8d5e-40a7-bd97-1ced039cab0f:foreman] DEBUG 
o.a.d.e.s.h.HBaseStoragePluginConfig - Initializing HBase StoragePlugin 
configuration with zookeeper quorum 'localhost', port '2181'.
2014-08-22 15:58:49,093 [689f7a9a-8d5e-40a7-bd97-1ced039cab0f:foreman] DEBUG 
o.a.drill.exec.store.SchemaFactory - Took 4 ms to register schemas.
2014-08-22 15:58:49,101 [689f7a9a-8d5e-40a7-bd97-1ced039cab0f:foreman] DEBUG 
o.a.d.e.p.s.h.DefaultSqlHandler - Optiq Logical :
SortRel(sort0=[$1], dir0=[DESC]): rowcount = 100.0, cumulative cost = 
{5726.20422318571 rows, 501.0 cpu, 0.0 io, 0.0 network}, id = 5678
  ProjectRel(TABLE_SCHEMA=[$1], TABLE_NAME=[$2], TABLE_TYPE=[$3]): rowcount = 
100.0, cumulative cost = {200.0 rows, 401.0 cpu, 0.0 io, 0.0 network}, id = 5677
    EnumerableTableAccessRel(table=[[INFORMATION_SCHEMA, TABLES]]): rowcount = 
100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network}, id = 5676

2014-08-22 15:58:49,109 [689f7a9a-8d5e-40a7-bd97-1ced039cab0f:foreman] DEBUG 
o.a.d.e.p.s.h.DefaultSqlHandler - Drill Logical :
DrillScreenRel: rowcount = 20.0, cumulative cost = {760.9757456529578 rows, 
114.0 cpu, 0.0 io, 0.0 network}, id = 5712
  DrillSortRel(sort0=[$1], dir0=[DESC]): rowcount = 20.0, cumulative cost = 
{758.9757456529578 rows, 112.0 cpu, 0.0 io, 0.0 network}, id = 5711
    DrillProjectRel(TABLE_SCHEMA=[$1], TABLE_NAME=[$2], TABLE_TYPE=[$3]): 
rowcount = 20.0, cumulative cost = {40.0 rows, 92.0 cpu, 0.0 io, 0.0 network}, 
id = 5710
      DrillScanRel(table=[[INFORMATION_SCHEMA, TABLES]], 
groupscan=[TABLEScolumns=null]): rowcount = 20.0, cumulative cost = {20.0 rows, 
80.0 cpu, 0.0 io, 0.0 network}, id = 5709

2014-08-22 15:58:49,129 [689f7a9a-8d5e-40a7-bd97-1ced039cab0f:foreman] DEBUG 
o.a.d.e.p.s.h.DefaultSqlHandler - Drill Physical :
00-00    Screen: rowcount = 20.0, cumulative cost = {102.0 rows, 
471.754247590989 cpu, 0.0 io, 0.0 network}, id = 5818
00-01      Project(TABLE_SCHEMA=[$0], TABLE_NAME=[$1], TABLE_TYPE=[$2]): 
rowcount = 20.0, cumulative cost = {100.0 rows, 469.754247590989 cpu, 0.0 io, 
0.0 network}, id = 5817
00-02        SelectionVectorRemover: rowcount = 20.0, cumulative cost = {80.0 
rows, 457.754247590989 cpu, 0.0 io, 0.0 network}, id = 5816
00-03          Sort(sort0=[$1], dir0=[DESC]): rowcount = 20.0, cumulative cost 
= {60.0 rows, 437.754247590989 cpu, 0.0 io, 0.0 network}, id = 5815
00-04            Project(TABLE_SCHEMA=[$1], TABLE_NAME=[$2], TABLE_TYPE=[$3]): 
rowcount = 20.0, cumulative cost = {40.0 rows, 92.0 cpu, 0.0 io, 0.0 network}, 
id = 5814
00-05              Scan(groupscan=[TABLEScolumns=null]): rowcount = 20.0, 
cumulative cost = {20.0 rows, 80.0 cpu, 0.0 io, 0.0 network}, id = 5813

2014-08-22 15:58:49,131 [689f7a9a-8d5e-40a7-bd97-1ced039cab0f:foreman] DEBUG 
o.a.d.e.p.s.h.DefaultSqlHandler - Drill Plan :
{
  "head" : {
    "version" : 1,
    "generator" : {
      "type" : "DefaultSqlHandler",
      "info" : ""
    },
    "type" : "APACHE_DRILL_PHYSICAL",
    "options" : [ ],
    "queue" : 0,
    "resultMode" : "EXEC"
  },
  "graph" : [ {
    "pop" : "info-schema",
    "@id" : 5,
    "cost" : 20.0
  }, {
    "pop" : "project",
    "@id" : 4,
    "exprs" : [ {
      "ref" : "`TABLE_SCHEMA`",
      "expr" : "`TABLE_SCHEMA`"
    }, {
      "ref" : "`TABLE_NAME`",
      "expr" : "`TABLE_NAME`"
    }, {
      "ref" : "`TABLE_TYPE`",
      "expr" : "`TABLE_TYPE`"
    } ],
    "child" : 5,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : 20.0
  }, {
    "pop" : "external-sort",
    "@id" : 3,
    "child" : 4,
    "orderings" : [ {
      "order" : "DESC",
      "expr" : "`TABLE_NAME`",
      "nullDirection" : "UNSPECIFIED"
    } ],
    "reverse" : false,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : 20.0
  }, {
    "pop" : "selection-vector-remover",
    "@id" : 2,
    "child" : 3,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : 20.0
  }, {
    "pop" : "project",
    "@id" : 1,
    "exprs" : [ {
      "ref" : "`TABLE_SCHEMA`",
      "expr" : "`TABLE_SCHEMA`"
    }, {
      "ref" : "`TABLE_NAME`",
      "expr" : "`TABLE_NAME`"
    }, {
      "ref" : "`TABLE_TYPE`",
      "expr" : "`TABLE_TYPE`"
    } ],
    "child" : 2,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : 20.0
  }, {
    "pop" : "screen",
    "@id" : 0,
    "child" : 1,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : 20.0
  } ]
}
{code}

3. The new query finished compiling:
{code}
2014-08-22 15:58:49,257 [689f7a9a-8d5e-40a7-bd97-1ced039cab0f:frag:0:0] DEBUG 
o.a.d.exec.compile.ClassTransformer - Done compiling (bytecode size=906 B, 
time:3 millis).
{code}

After above line, no more logs in drillbit.log.

Anyway, It is easy to reproduce every time.
Hopefully you can get the root cause.


> Any subsequent query may hung in sqlline after you cancel one query in the 
> same session
> ---------------------------------------------------------------------------------------
>
>                 Key: DRILL-1327
>                 URL: https://issues.apache.org/jira/browse/DRILL-1327
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 0.4.0
>         Environment: MapR 3.1.1
> Drill 0.4.0
>            Reporter: Hao Zhu
>            Priority: Critical
>         Attachments: drillbit.log
>
>
> This is another issue related to DRILL-1326.
> If we cancel a query in sqlline, subsequent query may hung.
> I can reproduce it almost every time.
> How to reproduce:
> 1. Logon sqlline, run below SQL:
> {code}
> select *
>  from cp.`employee.json` a, cp.`employee.json` b
>  where a.position_id=b.position_id ;
> {code}
> 2. When it is printing the results for about 3 seconds, type "ctrl-c" to stop 
> it.
> 3. Then any subsequent query will hung.
> ===============================
> For example,
> {code}
> | 110         | Josie Underwood | Josie      | Underwood  | 15          | 
> Store Permanent Checker | 7          | 15            | 1914-02-02 | 
> 1996-01-01 00:00:00.0 | 8200.0     | 101       |
> | 110         | Josie Underwood | Josie      | Underwood  | 15          | 
> Store Permanent Checker | 7          | 15            | 1914-02-02 | 
> 1996-01-01 00:00:00.0 | 8200.0     | 101       |
> | 110         | Josie Underwood | Josie      | Underwood  | 15          | 
> Store Permanent Checker | 7          | 15            | 1914-02-02 | 
> 1996-01-01 00:00:00.0 | 8200.0     | 101       |
> | 110         | Josie Underwood | Josie      | Underwood  | 15          | 
> Store Permanent Checker | 7          | 15            | 1914-02-02 | 
> 1996-01-01 00:00:00.0 | 8200.0     | 101       |
> +-------------+------------+------------+------------+-------------+----------------+------------+---------------+------------+------------+------------+---------------+-----------------+--+
> 11,744 rows selected (2.879 seconds)
> 0: jdbc:drill:zk=node1.xxx.com:5181,node3.>
> Command canceled.
> 0: jdbc:drill:zk=node1.xxx.com:5181,node3.>
> Command canceled.
> 0: jdbc:drill:zk=node1.xxx.com:5181,node3.>
> 0: jdbc:drill:zk=node1.xxx.com:5181,node3.> SELECT TABLE_SCHEMA, TABLE_NAME, 
> TABLE_TYPE
> . . . . . . . . . . . . . . . . . . . . . . .> FROM 
> INFORMATION_SCHEMA.`TABLES`
> . . . . . . . . . . . . . . . . . . . . . . .> ORDER BY TABLE_NAME DESC;
> {code}
> In the end, we have "ctrl-z" and "kill -9 %1" to exit the sqlline.
> After that, the SQL become "Running Queries" forever.
> We need to fix this because it looks like the sqlline is not stable.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to