[jira] [Commented] (DRILL-7277) Bug in planner with redundant order-by

Ted Dunning (JIRA) Sat, 25 May 2019 00:01:10 -0700


    [ 
https://issues.apache.org/jira/browse/DRILL-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848071#comment-16848071
 ]


Ted Dunning commented on DRILL-7277:
------------------------------------

This query:
{{select row_number() over (order by department_id desc) r, department_id
from (select department_id
         from  cp.`employee.json`
         order by department_id desc) ;}}

blows beets as below but putting department_id first in the output doesn't.

{{java.sql.SQLException: [MapR][DrillJDBCDriver](500165) Query execution error. 
Details: SYSTEM ERROR: CannotPlanException: Node 
[rel#26937:Subset#4.LOGICAL.ANY([]).[1 DESC]] could not be implemented; planner 
state:

Root: rel#26937:Subset#4.LOGICAL.ANY([]).[1 DESC]
Original rel:
LogicalProject(subset=[rel#26937:Subset#4.LOGICAL.ANY([]).[1 DESC]], r=[$1], 
department_id=[$0]): rowcount = 100.0, cumulative cost = {100.0 rows, 200.0 
cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 26935
  LogicalWindow(subset=[rel#26934:Subset#3.NONE.ANY([]).[1 DESC]], 
window#0=[window(partition {} order by [0 DESC] rows between UNBOUNDED 
PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])]): rowcount = 100.0, cumulative 
cost = {100.0 rows, 200.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 26933
    LogicalSort(subset=[rel#26932:Subset#2.NONE.ANY([]).[0 DESC]], sort0=[$0], 
dir0=[DESC]): rowcount = 100.0, cumulative cost = {100.0 rows, 
1842.0680743952366 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 26931
      LogicalProject(subset=[rel#26930:Subset#1.NONE.ANY([]).[]], 
department_id=[$1]): rowcount = 100.0, cumulative cost = {100.0 rows, 100.0 
cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 26929
        EnumerableTableScan(subset=[rel#26928:Subset#0.ENUMERABLE.ANY([]).[]], 
table=[[cp, employee.json]]): rowcount = 100.0, cumulative cost = {100.0 rows, 
101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 26880

Sets:
Set#0, type: RecordType(DYNAMIC_STAR **, ANY department_id)
        rel#26928:Subset#0.ENUMERABLE.ANY([]).[], best=rel#26880, 
importance=0.5904900000000001
                rel#26880:EnumerableTableScan.ENUMERABLE.ANY([]).[](table=[cp, 
employee.json]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 
io, 0.0 network, 0.0 memory}
        rel#26952:Subset#0.LOGICAL.ANY([]).[], best=rel#26954, 
importance=0.3247695
                rel#26954:DrillScanRel.LOGICAL.ANY([]).[](table=[cp, 
employee.json],groupscan=EasyGroupScan [selectionRoot=classpath:/employee.json, 
numFiles=1, columns=[`**`, `department_id`], 
files=[classpath:/employee.json]]), rowcount=463.0, cumulative cost={463.0 
rows, 4630000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}
Set#1, type: RecordType(ANY department_id)
        rel#26930:Subset#1.NONE.ANY([]).[], best=null, importance=0.6561
                
rel#26929:LogicalProject.NONE.ANY([]).[](input=rel#26928:Subset#0.ENUMERABLE.ANY([]).[],department_id=$1),
 rowcount=100.0, cumulative cost={inf}
                rel#26931:LogicalSort.NONE.ANY([]).[0 
DESC](input=rel#26930:Subset#1.NONE.ANY([]).[],sort0=$0,dir0=DESC), 
rowcount=100.0, cumulative cost={inf}
        rel#26943:Subset#1.LOGICAL.ANY([]).[], best=rel#26950, importance=0.405
                rel#26944:DrillSortRel.LOGICAL.ANY([]).[0 
DESC](input=rel#26943:Subset#1.LOGICAL.ANY([]).[],sort0=$0,dir0=DESC), 
rowcount=463.0, cumulative cost={926.0 rows, 11830.070504167705 cpu, 0.0 io, 
0.0 network, 0.0 memory}
                rel#26950:DrillScanRel.LOGICAL.ANY([]).[](table=[cp, 
employee.json],groupscan=EasyGroupScan [selectionRoot=classpath:/employee.json, 
numFiles=1, columns=[`department_id`], files=[classpath:/employee.json]]), 
rowcount=463.0, cumulative cost={463.0 rows, 463.0 cpu, 0.0 io, 0.0 network, 
0.0 memory}
                
rel#26953:DrillProjectRel.LOGICAL.ANY([]).[](input=rel#26952:Subset#0.LOGICAL.ANY([]).[],department_id=$1),
 rowcount=463.0, cumulative cost={926.0 rows, 4630463.0 cpu, 0.0 io, 0.0 
network, 0.0 memory}
        rel#26946:Subset#1.NONE.ANY([]).[0 DESC], best=null, 
importance=0.7290000000000001
                rel#26931:LogicalSort.NONE.ANY([]).[0 
DESC](input=rel#26930:Subset#1.NONE.ANY([]).[],sort0=$0,dir0=DESC), 
rowcount=100.0, cumulative cost={inf}
        rel#26947:Subset#1.LOGICAL.ANY([]).[1 DESC], best=null, importance=0.81
        rel#26948:Subset#1.LOGICAL.ANY([]).[0 DESC], best=rel#26944, 
importance=0.405
                rel#26944:DrillSortRel.LOGICAL.ANY([]).[0 
DESC](input=rel#26943:Subset#1.LOGICAL.ANY([]).[],sort0=$0,dir0=DESC), 
rowcount=463.0, cumulative cost={926.0 rows, 11830.070504167705 cpu, 0.0 io, 
0.0 network, 0.0 memory}
Set#3, type: RecordType(ANY department_id, BIGINT w0$o0)
        rel#26934:Subset#3.NONE.ANY([]).[1 DESC], best=null, importance=0.81
                rel#26933:LogicalWindow.NONE.ANY([]).[[1 
DESC]](input=rel#26946:Subset#1.NONE.ANY([]).[0 DESC],window#0=window(partition 
{} order by [0 DESC] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs 
[ROW_NUMBER()])), rowcount=100.0, cumulative cost={inf}
        rel#26939:Subset#3.LOGICAL.ANY([]).[1 DESC], best=null, importance=0.9
                rel#26942:DrillWindowRel.LOGICAL.ANY([]).[1 
DESC](input=rel#26947:Subset#1.LOGICAL.ANY([]).[1 
DESC],window#0=window(partition {} order by [0 DESC] rows between UNBOUNDED 
PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])), rowcount=100.0, cumulative 
cost={inf}
Set#4, type: RecordType(BIGINT r, ANY department_id)
        rel#26936:Subset#4.NONE.ANY([]).[1 DESC], best=null, importance=0.9
                rel#26935:LogicalProject.NONE.ANY([]).[[1 
DESC]](input=rel#26934:Subset#3.NONE.ANY([]).[1 DESC],r=$1,department_id=$0), 
rowcount=100.0, cumulative cost={inf}
        rel#26937:Subset#4.LOGICAL.ANY([]).[1 DESC], best=null, importance=1.0
                rel#26938:AbstractConverter.LOGICAL.ANY([]).[1 
DESC](input=rel#26936:Subset#4.NONE.ANY([]).[1 
DESC],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[1 DESC]), 
rowcount=100.0, cumulative cost={inf}
                rel#26940:DrillProjectRel.LOGICAL.ANY([]).[[1 
DESC]](input=rel#26939:Subset#3.LOGICAL.ANY([]).[1 
DESC],r=$1,department_id=$0), rowcount=100.0, cumulative cost={inf}




[Error Id: e9bec34a-b9b6-4541-bfb7-942bf01f67ee on 
mdn-1.mdn.tdunning-dsr-demo-z6io48.svc.cluster.local:31010].
        at 
com.mapr.drill.drill.dataengine.DRQryResultListener.checkAndThrowException(Unknown
 Source)
        at 
com.mapr.drill.drill.dataengine.DRQryResultListener.getNextBatch(Unknown Source)
        at 
com.mapr.drill.drill.dataengine.DRJDBCResultSet.doLoadRecordBatchData(Unknown 
Source)
        at 
com.mapr.drill.drill.dataengine.DRJDBCResultSet.doMoveToNextRow(Unknown Source)
        at com.mapr.drill.drill.dataengine.DRJDBCQueryExecutor.execute(Unknown 
Source)
        at com.mapr.drill.jdbc.common.SStatement.executeNoParams(Unknown Source)
        at com.mapr.drill.jdbc.common.SStatement.execute(Unknown Source)
        at 
org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291)
        at 
org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291)
        at 
org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:718)
        at 
org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:801)
        at 
org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:103)
        at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:633)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
        at 
org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Caused by: com.mapr.drill.support.exceptions.GeneralException: 
[MapR][DrillJDBCDriver](500165) Query execution error. Details: SYSTEM ERROR: 
CannotPlanException: Node [rel#26937:Subset#4.LOGICAL.ANY([]).[1 DESC]] could 
not be implemented; planner state:

Root: rel#26937:Subset#4.LOGICAL.ANY([]).[1 DESC]
Original rel:
LogicalProject(subset=[rel#26937:Subset#4.LOGICAL.ANY([]).[1 DESC]], r=[$1], 
department_id=[$0]): rowcount = 100.0, cumulative cost = {100.0 rows, 200.0 
cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 26935
  LogicalWindow(subset=[rel#26934:Subset#3.NONE.ANY([]).[1 DESC]], 
window#0=[window(partition {} order by [0 DESC] rows between UNBOUNDED 
PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])]): rowcount = 100.0, cumulative 
cost = {100.0 rows, 200.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 26933
    LogicalSort(subset=[rel#26932:Subset#2.NONE.ANY([]).[0 DESC]], sort0=[$0], 
dir0=[DESC]): rowcount = 100.0, cumulative cost = {100.0 rows, 
1842.0680743952366 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 26931
      LogicalProject(subset=[rel#26930:Subset#1.NONE.ANY([]).[]], 
department_id=[$1]): rowcount = 100.0, cumulative cost = {100.0 rows, 100.0 
cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 26929
        EnumerableTableScan(subset=[rel#26928:Subset#0.ENUMERABLE.ANY([]).[]], 
table=[[cp, employee.json]]): rowcount = 100.0, cumulative cost = {100.0 rows, 
101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 26880

Sets:
Set#0, type: RecordType(DYNAMIC_STAR **, ANY department_id)
        rel#26928:Subset#0.ENUMERABLE.ANY([]).[], best=rel#26880, 
importance=0.5904900000000001
                rel#26880:EnumerableTableScan.ENUMERABLE.ANY([]).[](table=[cp, 
employee.json]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 
io, 0.0 network, 0.0 memory}
        rel#26952:Subset#0.LOGICAL.ANY([]).[], best=rel#26954, 
importance=0.3247695
                rel#26954:DrillScanRel.LOGICAL.ANY([]).[](table=[cp, 
employee.json],groupscan=EasyGroupScan [selectionRoot=classpath:/employee.json, 
numFiles=1, columns=[`**`, `department_id`], 
files=[classpath:/employee.json]]), rowcount=463.0, cumulative cost={463.0 
rows, 4630000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}
Set#1, type: RecordType(ANY department_id)
        rel#26930:Subset#1.NONE.ANY([]).[], best=null, importance=0.6561
                
rel#26929:LogicalProject.NONE.ANY([]).[](input=rel#26928:Subset#0.ENUMERABLE.ANY([]).[],department_id=$1),
 rowcount=100.0, cumulative cost={inf}
                rel#26931:LogicalSort.NONE.ANY([]).[0 
DESC](input=rel#26930:Subset#1.NONE.ANY([]).[],sort0=$0,dir0=DESC), 
rowcount=100.0, cumulative cost={inf}
        rel#26943:Subset#1.LOGICAL.ANY([]).[], best=rel#26950, importance=0.405
                rel#26944:DrillSortRel.LOGICAL.ANY([]).[0 
DESC](input=rel#26943:Subset#1.LOGICAL.ANY([]).[],sort0=$0,dir0=DESC), 
rowcount=463.0, cumulative cost={926.0 rows, 11830.070504167705 cpu, 0.0 io, 
0.0 network, 0.0 memory}
                rel#26950:DrillScanRel.LOGICAL.ANY([]).[](table=[cp, 
employee.json],groupscan=EasyGroupScan [selectionRoot=classpath:/employee.json, 
numFiles=1, columns=[`department_id`], files=[classpath:/employee.json]]), 
rowcount=463.0, cumulative cost={463.0 rows, 463.0 cpu, 0.0 io, 0.0 network, 
0.0 memory}
                
rel#26953:DrillProjectRel.LOGICAL.ANY([]).[](input=rel#26952:Subset#0.LOGICAL.ANY([]).[],department_id=$1),
 rowcount=463.0, cumulative cost={926.0 rows, 4630463.0 cpu, 0.0 io, 0.0 
network, 0.0 memory}
        rel#26946:Subset#1.NONE.ANY([]).[0 DESC], best=null, 
importance=0.7290000000000001
                rel#26931:LogicalSort.NONE.ANY([]).[0 
DESC](input=rel#26930:Subset#1.NONE.ANY([]).[],sort0=$0,dir0=DESC), 
rowcount=100.0, cumulative cost={inf}
        rel#26947:Subset#1.LOGICAL.ANY([]).[1 DESC], best=null, importance=0.81
        rel#26948:Subset#1.LOGICAL.ANY([]).[0 DESC], best=rel#26944, 
importance=0.405
                rel#26944:DrillSortRel.LOGICAL.ANY([]).[0 
DESC](input=rel#26943:Subset#1.LOGICAL.ANY([]).[],sort0=$0,dir0=DESC), 
rowcount=463.0, cumulative cost={926.0 rows, 11830.070504167705 cpu, 0.0 io, 
0.0 network, 0.0 memory}
Set#3, type: RecordType(ANY department_id, BIGINT w0$o0)
        rel#26934:Subset#3.NONE.ANY([]).[1 DESC], best=null, importance=0.81
                rel#26933:LogicalWindow.NONE.ANY([]).[[1 
DESC]](input=rel#26946:Subset#1.NONE.ANY([]).[0 DESC],window#0=window(partition 
{} order by [0 DESC] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs 
[ROW_NUMBER()])), rowcount=100.0, cumulative cost={inf}
        rel#26939:Subset#3.LOGICAL.ANY([]).[1 DESC], best=null, importance=0.9
                rel#26942:DrillWindowRel.LOGICAL.ANY([]).[1 
DESC](input=rel#26947:Subset#1.LOGICAL.ANY([]).[1 
DESC],window#0=window(partition {} order by [0 DESC] rows between UNBOUNDED 
PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])), rowcount=100.0, cumulative 
cost={inf}
Set#4, type: RecordType(BIGINT r, ANY department_id)
        rel#26936:Subset#4.NONE.ANY([]).[1 DESC], best=null, importance=0.9
                rel#26935:LogicalProject.NONE.ANY([]).[[1 
DESC]](input=rel#26934:Subset#3.NONE.ANY([]).[1 DESC],r=$1,department_id=$0), 
rowcount=100.0, cumulative cost={inf}
        rel#26937:Subset#4.LOGICAL.ANY([]).[1 DESC], best=null, importance=1.0
                rel#26938:AbstractConverter.LOGICAL.ANY([]).[1 
DESC](input=rel#26936:Subset#4.NONE.ANY([]).[1 
DESC],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[1 DESC]), 
rowcount=100.0, cumulative cost={inf}
                rel#26940:DrillProjectRel.LOGICAL.ANY([]).[[1 
DESC]](input=rel#26939:Subset#3.LOGICAL.ANY([]).[1 
DESC],r=$1,department_id=$0), rowcount=100.0, cumulative cost={inf}




[Error Id: e9bec34a-b9b6-4541-bfb7-942bf01f67ee on 
mdn-1.mdn.tdunning-dsr-demo-z6io48.svc.cluster.local:31010].
        ... 21 more}}

> Bug in planner with redundant order-by
> --------------------------------------
>
>                 Key: DRILL-7277
>                 URL: https://issues.apache.org/jira/browse/DRILL-7277
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.14.0
>            Reporter: Ted Dunning
>            Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (DRILL-7277) Bug in planner with redundant order-by

Reply via email to