[ 
https://issues.apache.org/jira/browse/BEAM-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933041#comment-16933041
 ] 

sridhar Reddy commented on BEAM-7049:
-------------------------------------

I ran the tests with the following steps:
 # re-clone a new repo of Beam
 # add UnionMergeRule
 # change {{BeamCostModel.FACTORY}} to {{null}} at 
[https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/CalciteQueryPlanner.java#L116]

I then ran this query:

{{SELECT 1
UNION ALL
SELECT 2
UNION ALL
SELECT 3
UNION ALL
SELECT 4
UNION ALL
SELECT 5}}

It didn't work as expected: the SQLPlan was generated, but the BeamPlan was 
not. Here is the condensed stack trace:

-------------

java.lang.RuntimeException: Error while applying rule BeamUnionRule, args 
[rel#45:LogicalUnion.NONE(input#0=RelSubset#42,input#1=RelSubset#44,all=true)]
 at org.apache.beam.repackaged.sql.org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:235)
 at org.apache.beam.repackaged.sql.org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:631)

Caused by: java.lang.RuntimeException: Error occurred while applying rule 
BeamUnionRule
 at org.apache.beam.repackaged.sql.org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:143)
 at org.apache.beam.repackaged.sql.org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:236)

Caused by: java.lang.ClassCastException: 
org.apache.beam.sdk.extensions.sql.impl.planner.BeamCostModel cannot be cast to 
org.apache.beam.repackaged.sql.org.apache.calcite.plan.volcano.VolcanoCost
 at org.apache.beam.repackaged.sql.org.apache.calcite.plan.volcano.VolcanoCost.isLt(VolcanoCost.java:112)
 at org.apache.beam.repackaged.sql.org.apache.calcite.plan.volcano.VolcanoPlanner.getCost(VolcanoPlanner.java:930)
 at org.apache.beam.repackaged.sql.org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(RelSubset.java:347)

---------
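
One possible reading of the innermost cause: a cost comparison that assumes both sides share one concrete cost class receives a cost of a different class. The sketch below reproduces only that pattern with simplified stand-in classes. {{PlannerCost}} and {{CustomCost}} are hypothetical names, not the Calcite or Beam classes, and the assumption that the two costs come from different cost factories is not confirmed by the trace.

```java
// Simplified stand-ins for illustration only; NOT the Calcite or Beam classes.
class CostCastSketch {

  interface Cost {
    boolean isLt(Cost other);
  }

  // Hypothetical cost type whose comparison assumes the other side has the same class.
  static final class PlannerCost implements Cost {
    final double rows;

    PlannerCost(double rows) {
      this.rows = rows;
    }

    @Override
    public boolean isLt(Cost other) {
      // Unchecked downcast: throws ClassCastException for any other Cost implementation.
      PlannerCost that = (PlannerCost) other;
      return this.rows < that.rows;
    }
  }

  // Hypothetical second cost type, as if produced by a different cost factory.
  static final class CustomCost implements Cost {
    final double cpu;

    CustomCost(double cpu) {
      this.cpu = cpu;
    }

    @Override
    public boolean isLt(Cost other) {
      CustomCost that = (CustomCost) other;
      return this.cpu < that.cpu;
    }
  }

  // Returns true when comparing two different cost types raises ClassCastException.
  static boolean mixedComparisonFails() {
    Cost fromPlanner = new PlannerCost(10);
    Cost fromRelNode = new CustomCost(5);
    try {
      fromPlanner.isLt(fromRelNode);
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("mixed comparison fails: " + mixedComparisonFails());
  }
}
```

If this reading is right, the planner and the rel nodes would need to agree on one cost type for the comparison to succeed.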

The same result is observed for the following query, which uses UNION instead 
of UNION ALL:

{{SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5}}

However, with a fresh clone and no modifications, the "union all" query works 
as expected and sends 5 inputs, but the "union" query sends only 3 inputs. 
This can also be seen in the shortened BEAMPlan output:

INFO: BEAMPlan>
BeamUnionRel(all=[true])
  BeamCalcRel(expr#0=[{inputs}], expr#1=[1], EXPR$0=[$t1])
    BeamValuesRel(tuples=[[{ 0 }]])
  BeamCalcRel(expr#0=[{inputs}], expr#1=[2], EXPR$0=[$t1])
    BeamValuesRel(tuples=[[{ 0 }]])
  BeamCalcRel(expr#0=[{inputs}], expr#1=[3], EXPR$0=[$t1])
    BeamValuesRel(tuples=[[{ 0 }]])
  BeamCalcRel(expr#0=[{inputs}], expr#1=[4], EXPR$0=[$t1])
    BeamValuesRel(tuples=[[{ 0 }]])
  BeamCalcRel(expr#0=[{inputs}], expr#1=[5], EXPR$0=[$t1])
    BeamValuesRel(tuples=[[{ 0 }]])

vs

INFO: BEAMPlan>
BeamUnionRel(all=[false])
  BeamCalcRel(expr#0=[{inputs}], expr#1=[1], EXPR$0=[$t1])
    BeamValuesRel(tuples=[[{ 0 }]])
  BeamCalcRel(expr#0=[{inputs}], expr#1=[2], EXPR$0=[$t1])
    BeamValuesRel(tuples=[[{ 0 }]])
  BeamCalcRel(expr#0=[{inputs}], expr#1=[3], EXPR$0=[$t1])
    BeamValuesRel(tuples=[[{ 0 }]])


> Merge multiple input to one BeamUnionRel
> ----------------------------------------
>
>                 Key: BEAM-7049
>                 URL: https://issues.apache.org/jira/browse/BEAM-7049
>             Project: Beam
>          Issue Type: Improvement
>          Components: dsl-sql
>            Reporter: Rui Wang
>            Assignee: sridhar Reddy
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> BeamUnionRel assumes exactly two inputs and rejects more. So `a UNION b UNION c` 
> has to be planned as UNION(a, UNION(b, c)), which costs two shuffles. If 
> BeamUnionRel could handle multiple inputs, only one shuffle would be needed.
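
The rewrite from UNION(a, UNION(b, c)) to a single multi-input union can be sketched as a recursive flattening of the plan tree. This is a minimal illustration on a hypothetical {{Node}} type, not the Calcite UnionMergeRule or any Beam code. It merges a child union into its parent only when both have the same {{all}} flag, which is a deliberately conservative assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical plan tree for illustration; not Calcite or Beam classes.
class UnionFlattenSketch {

  static final class Node {
    final String label;      // leaf name, or "UNION" for union nodes
    final boolean union;     // true when this node is a union
    final boolean all;       // UNION ALL (true) vs UNION DISTINCT (false); ignored for leaves
    final List<Node> inputs; // empty for leaves

    private Node(String label, boolean union, boolean all, List<Node> inputs) {
      this.label = label;
      this.union = union;
      this.all = all;
      this.inputs = inputs;
    }

    static Node leaf(String label) {
      return new Node(label, false, false, new ArrayList<Node>());
    }

    static Node union(boolean all, Node... inputs) {
      return new Node("UNION", true, all, new ArrayList<Node>(Arrays.asList(inputs)));
    }
  }

  // Pulls the inputs of nested unions with a matching 'all' flag up into the parent.
  static Node flatten(Node node) {
    if (!node.union) {
      return node;
    }
    List<Node> merged = new ArrayList<Node>();
    for (Node input : node.inputs) {
      Node flat = flatten(input);
      if (flat.union && flat.all == node.all) {
        merged.addAll(flat.inputs); // same kind of union: absorb its inputs
      } else {
        merged.add(flat);           // leaf or different kind of union: keep as one input
      }
    }
    return new Node("UNION", true, node.all, merged);
  }

  public static void main(String[] args) {
    Node nested = Node.union(true,
        Node.leaf("a"),
        Node.union(true, Node.leaf("b"), Node.leaf("c")));
    System.out.println("inputs after flattening: " + flatten(nested).inputs.size());
  }
}
```

With a flattened tree, a union operator that accepts any number of inputs would see a, b and c together rather than as two stacked binary unions.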



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
