[
https://issues.apache.org/jira/browse/BEAM-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912882#comment-16912882
]
sridhar Reddy commented on BEAM-7049:
-------------------------------------
Just including UnionMergeRule.INSTANCE in BeamRuleSets makes a union with 3
operands work without any hardcoding. 4 operands may need a new merge rule
altogether.
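To make the merging idea concrete, here is a minimal toy sketch of flattening nested unions into one n-ary union. PlanNode and its methods are hypothetical names made up for illustration; this is not Calcite's UnionMergeRule or any Beam class, and it assumes all unions are of the same kind (for example all UNION ALL). The toy version flattens any nesting depth, which is not a claim about what the real rule does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy plan node: either a leaf table scan or a UNION over child nodes.
// Illustrative only; these are not Calcite or Beam types.
final class PlanNode {
  final String table;          // non-null for a leaf scan, null for a union
  final List<PlanNode> inputs; // children of a union, empty for a leaf

  private PlanNode(String table, List<PlanNode> inputs) {
    this.table = table;
    this.inputs = inputs;
  }

  static PlanNode scan(String table) {
    return new PlanNode(table, new ArrayList<>());
  }

  static PlanNode union(PlanNode... inputs) {
    return new PlanNode(null, Arrays.asList(inputs));
  }

  boolean isUnion() {
    return table == null;
  }

  // Recursively pull the inputs of nested unions up into a single n-ary union.
  static PlanNode flatten(PlanNode node) {
    if (!node.isUnion()) {
      return node;
    }
    List<PlanNode> flat = new ArrayList<>();
    for (PlanNode input : node.inputs) {
      PlanNode flattened = flatten(input);
      if (flattened.isUnion()) {
        flat.addAll(flattened.inputs);
      } else {
        flat.add(flattened);
      }
    }
    return new PlanNode(null, flat);
  }

  @Override
  public String toString() {
    if (!isUnion()) {
      return table;
    }
    StringBuilder sb = new StringBuilder("UNION(");
    for (int i = 0; i < inputs.size(); i++) {
      if (i > 0) {
        sb.append(", ");
      }
      sb.append(inputs.get(i));
    }
    return sb.append(")").toString();
  }
}
```

With this model, flattening UNION(a, UNION(b, c)) yields a single node with three inputs, which is the shape a multi-input BeamUnionRel would need to accept.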
[~amaliujia] are we limiting the number of operands to some small number, or
are we generalizing to n operands?
If we are generalizing to n operands, then one of the sticking points I am
facing is creating the KeyedPCollectionTuple [1], which takes a TupleTag as
one of its parameters, and I am not sure whether there is a way to generate
these on the fly. If we just create a new TupleTag object without assigning it
to a variable, then the following call will be affected and some changes will
need to be cascaded down the line:
[2] BeamSetOperatorsTransforms.SetOperatorFilteringDoFn()
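One possible direction, sketched here only as a stand-in model without any Beam dependency: generate the tags in a loop, keep them in a list, and hand the whole list to the filtering step instead of two named tag variables. InputTag, NaryUnionSketch, makeTags and unionForRow are all hypothetical names invented for this sketch; the real code would use TupleTag, KeyedPCollectionTuple and the existing DoFn, and whether their signatures allow this directly is an assumption that would need to be checked.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a tag that identifies one union input.
// This is NOT Beam's TupleTag; it only carries a generated id.
final class InputTag {
  final String id;

  InputTag(String id) {
    this.id = id;
  }
}

final class NaryUnionSketch {
  // Create one tag per input in a loop, so no per-input variable is needed.
  static List<InputTag> makeTags(int n) {
    List<InputTag> tags = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      tags.add(new InputTag("input" + i));
    }
    return tags;
  }

  // Stand-in for the filtering step after grouping. countsByTag holds, for
  // one distinct row, how many times each input (by tag id) produced it.
  // UNION ALL emits every occurrence; UNION (distinct) emits the row once.
  static List<String> unionForRow(
      String row, Map<String, Integer> countsByTag, List<InputTag> tags, boolean all) {
    int total = 0;
    for (InputTag tag : tags) {
      total += countsByTag.getOrDefault(tag.id, 0);
    }
    int copies = all ? total : Math.min(total, 1);
    List<String> out = new ArrayList<>();
    for (int i = 0; i < copies; i++) {
      out.add(row);
    }
    return out;
  }
}
```

The point of the sketch is only the shape of the change: the filtering step takes a list of tags and loops over it, so the number of inputs no longer appears in its signature.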
What do you suggest?
[1] https://github.com/apache/beam/blob/3561100b30b64e4ac857afbf6e5016dfaf2ecc22/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamSetOperatorRelBase.java#L86
[2] https://github.com/apache/beam/blob/3561100b30b64e4ac857afbf6e5016dfaf2ecc22/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamSetOperatorRelBase.java#L100
> Merge multiple input to one BeamUnionRel
> ----------------------------------------
>
> Key: BEAM-7049
> URL: https://issues.apache.org/jira/browse/BEAM-7049
> Project: Beam
> Issue Type: Improvement
> Components: dsl-sql
> Reporter: Rui Wang
> Assignee: sridhar Reddy
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> BeamUnionRel assumes exactly two inputs and rejects more. So `a UNION b UNION c`
> will have to be created as UNION(a, UNION(b, c)) and will need two shuffles. If
> BeamUnionRel can handle multiple inputs, we will have only one shuffle.