[jira] [Commented] (BEAM-3481) Query with subquery and aggregates cannot be implemented.

Anton Kedin (JIRA) Fri, 02 Feb 2018 16:00:23 -0800

    [ 
https://issues.apache.org/jira/browse/BEAM-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351080#comment-16351080
 ]


Anton Kedin commented on BEAM-3481:
-----------------------------------

[~AustinHaas], is it a full stack trace? Is it the same in the second case? Can 
you post your pipeline configuration?

I am trying to reproduce on the latest master, and this succeeds:
{code:sql|title=SQL}
"SELECT COUNT(`?p`) as `n`, 500 as `f` FROM " +
            "(SELECT p as `?p`, 500 FROM `contains` GROUP BY p) as t1";
{code}
{code:sql|title=Plan}
Feb 02, 2018 3:53:02 PM 
org.apache.beam.sdk.extensions.sql.impl.planner.BeamQueryPlanner 
validateAndConvert
INFO: SQL:
SELECT COUNT(`t1`.`?p`) AS `n`, 500 AS `f`
FROM (SELECT `contains`.`p` AS `?p`, 500
FROM `contains` AS `contains`
GROUP BY `contains`.`p`) AS `t1`
Feb 02, 2018 3:53:02 PM 
org.apache.beam.sdk.extensions.sql.impl.planner.BeamQueryPlanner 
convertToBeamRel
INFO: SQLPlan>
LogicalProject(n=[$0], f=[500])
  LogicalAggregate(group=[{}], n=[COUNT()])
    LogicalProject(?p=[$0])
      LogicalProject(?p=[$0], EXPR$1=[500])
        LogicalAggregate(group=[{0}])
          LogicalProject(?p=[$1])
            LogicalTableScan(table=[[contains]])
{code}

> Query with subquery and aggregates cannot be implemented.
> ---------------------------------------------------------
>
>                 Key: BEAM-3481
>                 URL: https://issues.apache.org/jira/browse/BEAM-3481
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql
>    Affects Versions: 2.2.0
>            Reporter: Austin Haas
>            Priority: Major
>
> This query results in the error below:
> {noformat}
> "SELECT (COUNT(`p`))
>  FROM (SELECT `p`
>        FROM `contains`
>        GROUP BY `p`) AS `t1`"{noformat}
> This works correctly:
> {noformat}
> "SELECT (COUNT(`p`))
>  FROM (SELECT `p`, CURRENT_TIME
>        FROM `contains`
>        GROUP BY `p`) AS `t1`"{noformat}
> Error:
>  
> {noformat}
> [nREPL-worker-5] INFO 
> org.apache.beam.sdk.extensions.sql.impl.planner.BeamQueryPlanner - SQL:
> SELECT COUNT(`t1`.`p`)
> FROM (SELECT `contains`.`p`
> FROM `contains` AS `contains`
> GROUP BY `contains`.`p`) AS `t1`
> [nREPL-worker-5] INFO 
> org.apache.beam.sdk.extensions.sql.impl.planner.BeamQueryPlanner - SQLPlan>
> LogicalAggregate(group=[{}], EXPR$0=[COUNT()])
>  LogicalAggregate(group=[{0}])
>  LogicalProject(p=[$0])
>  LogicalTableScan(table=[[contains]])
> CannotPlanException Node [rel#157:Subset#3.BEAM_LOGICAL.[]] could not be 
> implemented; planner state:
> Root: rel#157:Subset#3.BEAM_LOGICAL.[]
> Original rel:
> LogicalAggregate(subset=[rel#157:Subset#3.BEAM_LOGICAL.[]], group=[{}], 
> EXPR$0=[COUNT()]): rowcount = 1.0, cumulative cost = {1.125 rows, 0.0 cpu, 
> 0.0 io}, id = 155
>  LogicalAggregate(subset=[rel#154:Subset#2.NONE.[]], group=[{0}]): rowcount = 
> 10.0, cumulative cost = {10.0 rows, 0.0 cpu, 0.0 io}, id = 153
>  LogicalProject(subset=[rel#152:Subset#1.NONE.[]], p=[$0]): rowcount = 100.0, 
> cumulative cost = {100.0 rows, 100.0 cpu, 0.0 io}, id = 151
>  LogicalTableScan(subset=[rel#150:Subset#0.NONE.[]], table=[[contains]]): 
> rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 146
> Sets:
> Set#0, type: RecordType(VARCHAR p, VARCHAR s, BIGINT c)
>  rel#150:Subset#0.NONE.[], best=null, importance=0.6561
>  rel#146:LogicalTableScan.NONE.[](table=[contains]), rowcount=100.0, 
> cumulative cost={inf}
>  rel#162:Subset#0.BEAM_LOGICAL.[], best=rel#164, importance=0.32805
>  rel#164:BeamIOSourceRel.BEAM_LOGICAL.[](table=[contains]), rowcount=100.0, 
> cumulative cost={100.0 rows, 101.0 cpu, 0.0 io}
> Set#1, type: RecordType(VARCHAR p)
>  rel#152:Subset#1.NONE.[], best=null, importance=0.7290000000000001
>  rel#151:LogicalProject.NONE.[](input=rel#150:Subset#0.NONE.[],p=$0), 
> rowcount=100.0, cumulative cost={inf}
>  rel#159:Subset#1.BEAM_LOGICAL.[], best=rel#163, 
> importance=0.36450000000000005
>  
> rel#163:BeamProjectRel.BEAM_LOGICAL.[](input=rel#162:Subset#0.BEAM_LOGICAL.[],p=$0),
>  rowcount=100.0, cumulative cost={200.0 rows, 201.0 cpu, 0.0 io}
> Set#2, type: RecordType(VARCHAR p)
>  rel#154:Subset#2.NONE.[], best=null, importance=0.81
>  rel#153:LogicalAggregate.NONE.[](input=rel#152:Subset#1.NONE.[],group={0}), 
> rowcount=10.0, cumulative cost={inf}
>  rel#161:Subset#2.BEAM_LOGICAL.[], best=rel#160, importance=0.405
>  
> rel#160:BeamAggregationRel.BEAM_LOGICAL.[](group={0},window=org.apache.beam.sdk.transforms.windowing.GlobalWindows,trigger=Repeatedly.forever(AfterWatermark.pastEndOfWindow())),
>  rowcount=10.0, cumulative cost={210.0 rows, 201.0 cpu, 0.0 io}
> Set#3, type: RecordType(BIGINT EXPR$0)
>  rel#156:Subset#3.NONE.[], best=null, importance=0.9
>  
> rel#155:LogicalAggregate.NONE.[](input=rel#154:Subset#2.NONE.[],group={},EXPR$0=COUNT()),
>  rowcount=1.0, cumulative cost={inf}
>  rel#157:Subset#3.BEAM_LOGICAL.[], best=null,
>  importance=1.0
>  
> rel#158:AbstractConverter.BEAM_LOGICAL.[](input=rel#156:Subset#3.NONE.[],convention=BEAM_LOGICAL,sort=[]),
>  rowcount=1.0, cumulative cost={inf}
> org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit
>  (RelSubset.java:441)
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (BEAM-3481) Query with subquery and aggregates cannot be implemented.

Reply via email to