[
https://issues.apache.org/jira/browse/BEAM-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919139#comment-16919139
]
sridhar Reddy commented on BEAM-7049:
-------------------------------------
After several attempts to generalize union to multiple operands, I have been
unable to find a place where this can be done. Any attempt to pull inputs from
lower operands up into upper operands actually limits the output. I am not sure
I am looking in the right places.
[~amaliujia] Can you please confirm that UnionMergeRule (or a variation of it)
is the right place to make these changes?
[https://github.com/Qihoo360/Quicksql/blob/7863cff886ad233b5102d6e2f11f8d21f374aa82/analysis/src/main/java/org/apache/calcite/rel/rules/UnionMergeRule.java#L128]
I tried several different changes at the above location, without any luck.
In fact, any change made there limits the union result. For example, select 1
union 2 union 3 union 4 results in [1,2,3], skipping 4.
Simply hardcoding the inputs in BeamUnionRel did not work beyond 3 inputs
either.
[https://github.com/apache/beam/blob/9e152b7a99b2e081d224584905a49e14742e0d5d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamUnionRel.java#L76]
Wherever I looked, most of the operations are done on 2 or 3 operands only.
Only in CalciteQueryPlanner can I find everything in one place:
[https://github.com/apache/beam/blob/9e152b7a99b2e081d224584905a49e14742e0d5d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/CalciteQueryPlanner.java#L169]
Is there a place where you can access beamRelNode, or a similar structure where
all the inputs are present and can be manipulated? CalciteQueryPlanner doesn't
seem like a good place to do this.
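
As a rough illustration of the rewrite being discussed, here is a minimal,
self-contained sketch that flattens nested binary unions into one n-ary union.
It deliberately uses no Calcite or Beam classes: Node, Scan, Union and flatten
are hypothetical stand-ins invented for this example, and the merge condition
(only merge a child union whose ALL/DISTINCT flag matches its parent) is a
simplifying assumption, not a description of what UnionMergeRule does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnionFlattenSketch {
  /** Hypothetical stand-in for a relational node; not a Calcite or Beam type. */
  interface Node {}

  /** A leaf input, identified only by a name. */
  static final class Scan implements Node {
    final String name;
    Scan(String name) { this.name = name; }
    @Override public String toString() { return name; }
  }

  /** A union over any number of inputs; all == true means UNION ALL. */
  static final class Union implements Node {
    final boolean all;
    final List<Node> inputs;
    Union(boolean all, List<Node> inputs) { this.all = all; this.inputs = inputs; }
    @Override public String toString() { return (all ? "UNION_ALL" : "UNION") + inputs; }
  }

  /**
   * Recursively pulls the inputs of child unions up into their parent,
   * but only when parent and child agree on the ALL/DISTINCT flag.
   */
  static Node flatten(Node node) {
    if (!(node instanceof Union)) {
      return node;
    }
    Union union = (Union) node;
    List<Node> merged = new ArrayList<>();
    for (Node input : union.inputs) {
      Node flat = flatten(input);
      if (flat instanceof Union && ((Union) flat).all == union.all) {
        merged.addAll(((Union) flat).inputs); // keep every input of the child
      } else {
        merged.add(flat);
      }
    }
    return new Union(union.all, merged);
  }

  public static void main(String[] args) {
    // a UNION (b UNION (c UNION d)), built as nested binary unions
    Node nested = new Union(false, Arrays.<Node>asList(
        new Scan("a"),
        new Union(false, Arrays.<Node>asList(
            new Scan("b"),
            new Union(false, Arrays.<Node>asList(new Scan("c"), new Scan("d")))))));
    System.out.println(flatten(nested)); // prints UNION[a, b, c, d]
  }
}
```

The point of the sketch is that all inputs of every merged child are carried
over, so a four-way union keeps all four operands. Whether the same shape can
be reached inside the planner is exactly the open question above.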
> Merge multiple input to one BeamUnionRel
> ----------------------------------------
>
> Key: BEAM-7049
> URL: https://issues.apache.org/jira/browse/BEAM-7049
> Project: Beam
> Issue Type: Improvement
> Components: dsl-sql
> Reporter: Rui Wang
> Assignee: sridhar Reddy
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> BeamUnionRel assumes there are exactly two inputs and rejects more. So
> `a UNION b UNION c` has to be created as UNION(a, UNION(b, c)) and incurs two
> shuffles. If BeamUnionRel can handle multiple inputs, we will have only one
> shuffle.
--
This message was sent by Atlassian Jira
(v8.3.2#803003)