[jira] [Work logged] (BEAM-7049) Merge multiple input to one BeamUnionRel
[ https://issues.apache.org/jira/browse/BEAM-7049?focusedWorklogId=346376&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-346376 ]

ASF GitHub Bot logged work on BEAM-7049:

Author: ASF GitHub Bot
Created on: 20/Nov/19 00:28
Start Date: 20/Nov/19 00:28
Worklog Time Spent: 10m
Work Description: stale[bot] commented on issue #9358: (WIP-BEAM-7049)Changes made to make a simple case of threeway union work
URL: https://github.com/apache/beam/pull/9358#issuecomment-555778972

This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 346376)
Time Spent: 1h 20m (was: 1h 10m)

> Merge multiple input to one BeamUnionRel
>
> Key: BEAM-7049
> URL: https://issues.apache.org/jira/browse/BEAM-7049
> Project: Beam
> Issue Type: Improvement
> Components: dsl-sql
> Reporter: Rui Wang
> Assignee: sridhar Reddy
> Priority: Major
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> BeamUnionRel assumes exactly two inputs and rejects more. So `a UNION b UNION c`
> will have to be created as UNION(a, UNION(b, c)) and have two shuffles. If
> BeamUnionRel can handle multiple inputs, we will have only one shuffle

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7049) Merge multiple input to one BeamUnionRel
[ https://issues.apache.org/jira/browse/BEAM-7049?focusedWorklogId=346378&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-346378 ]

ASF GitHub Bot logged work on BEAM-7049:

Author: ASF GitHub Bot
Created on: 20/Nov/19 00:28
Start Date: 20/Nov/19 00:28
Worklog Time Spent: 10m
Work Description: stale[bot] commented on pull request #9358: (WIP-BEAM-7049)Changes made to make a simple case of threeway union work
URL: https://github.com/apache/beam/pull/9358

Issue Time Tracking
---
Worklog Id: (was: 346378)
Time Spent: 1.5h (was: 1h 20m)
[jira] [Work logged] (BEAM-7049) Merge multiple input to one BeamUnionRel
[ https://issues.apache.org/jira/browse/BEAM-7049?focusedWorklogId=342170&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342170 ]

ASF GitHub Bot logged work on BEAM-7049:

Author: ASF GitHub Bot
Created on: 12/Nov/19 21:24
Start Date: 12/Nov/19 21:24
Worklog Time Spent: 10m
Work Description: stale[bot] commented on issue #9358: (WIP-BEAM-7049)Changes made to make a simple case of threeway union work
URL: https://github.com/apache/beam/pull/9358#issuecomment-553121577

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the d...@beam.apache.org list. Thank you for your contributions.

Issue Time Tracking
---
Worklog Id: (was: 342170)
Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (BEAM-7049) Merge multiple input to one BeamUnionRel
[ https://issues.apache.org/jira/browse/BEAM-7049?focusedWorklogId=312302&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-312302 ]

ASF GitHub Bot logged work on BEAM-7049:

Author: ASF GitHub Bot
Created on: 13/Sep/19 20:03
Start Date: 13/Sep/19 20:03
Worklog Time Spent: 10m
Work Description: amaliujia commented on pull request #9358: (WIP-BEAM-7049)Changes made to make a simple case of threeway union work
URL: https://github.com/apache/beam/pull/9358#discussion_r324347562

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/planner/BeamRuleSets.java ##

@@ -141,6 +142,7 @@ private static final List BEAM_CONVERTERS =
 ImmutableList.of(
+ UnionMergeRule.INSTANCE, //Added for three way union

Review comment:
If it still does not work, you could
1. make a fresh clone of the Beam repo
2. add UnionMergeRule
3. change `BeamCostModel.FACTORY` to `null` at https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/CalciteQueryPlanner.java#L116

Then quickly test
```
SELECT 1
UNION ALL
SELECT 2
UNION ALL
SELECT 3
UNION ALL
SELECT 4
UNION ALL
SELECT 5
```

Issue Time Tracking
---
Worklog Id: (was: 312302)
Time Spent: 1h (was: 50m)
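The effect that adding a union-merging rule is meant to have on the test query above can be sketched in plain Java. This is only an illustration under stated assumptions: `Node`, `leaf`, `union`, `flatten`, and `countUnions` are hypothetical toy names, not Calcite or Beam classes, and the sketch conservatively merges only unions that share the same ALL flag. It shows a chain of binary unions collapsing into a single n-ary union, so that one union node (and, per the issue description, one shuffle) remains.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UnionFlattenSketch {
  /** Toy plan node (hypothetical, not a Calcite RelNode): a leaf has no inputs. */
  static final class Node {
    final String name;
    final boolean all; // true for UNION ALL, false for UNION; unused for leaves
    final List<Node> inputs;

    Node(String name, boolean all, List<Node> inputs) {
      this.name = name;
      this.all = all;
      this.inputs = inputs;
    }

    boolean isUnion() {
      return !inputs.isEmpty();
    }
  }

  static Node leaf(String name) {
    return new Node(name, false, new ArrayList<>());
  }

  static Node union(boolean all, Node... inputs) {
    return new Node(all ? "UNION ALL" : "UNION", all, Arrays.asList(inputs));
  }

  /** Inlines child unions into their parent when both use the same ALL flag. */
  static Node flatten(Node node) {
    if (!node.isUnion()) {
      return node;
    }
    List<Node> merged = new ArrayList<>();
    for (Node input : node.inputs) {
      Node flat = flatten(input);
      if (flat.isUnion() && flat.all == node.all) {
        merged.addAll(flat.inputs); // absorb the child's inputs directly
      } else {
        merged.add(flat);
      }
    }
    return new Node(node.name, node.all, merged);
  }

  /** Counts union nodes, a stand-in for the number of shuffles in the plan. */
  static int countUnions(Node node) {
    int count = node.isUnion() ? 1 : 0;
    for (Node input : node.inputs) {
      count += countUnions(input);
    }
    return count;
  }

  public static void main(String[] args) {
    // SELECT 1 UNION ALL SELECT 2 ... UNION ALL SELECT 5 as nested binary unions.
    Node plan = leaf("SELECT 1");
    for (int i = 2; i <= 5; i++) {
      plan = union(true, plan, leaf("SELECT " + i));
    }
    Node flat = flatten(plan);
    System.out.println("union nodes before: " + countUnions(plan));
    System.out.println("union nodes after:  " + countUnions(flat));
    System.out.println("inputs of merged union: " + flat.inputs.size());
  }
}
```

Whether the real planner reaches this shape depends on where the rule is registered and on the cost model, which is what the review thread is debugging.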
[jira] [Work logged] (BEAM-7049) Merge multiple input to one BeamUnionRel
[ https://issues.apache.org/jira/browse/BEAM-7049?focusedWorklogId=312300&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-312300 ]

ASF GitHub Bot logged work on BEAM-7049:

Author: ASF GitHub Bot
Created on: 13/Sep/19 20:02
Start Date: 13/Sep/19 20:02
Worklog Time Spent: 10m
Work Description: amaliujia commented on pull request #9358: (WIP-BEAM-7049)Changes made to make a simple case of threeway union work
URL: https://github.com/apache/beam/pull/9358#discussion_r324347562

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/planner/BeamRuleSets.java ##

@@ -141,6 +142,7 @@ private static final List BEAM_CONVERTERS =
 ImmutableList.of(
+ UnionMergeRule.INSTANCE, //Added for three way union

Review comment:
If it does not work, you could
1. make a fresh clone of the Beam repo
2. add UnionMergeRule
3. change `BeamCostModel.FACTORY` to `null` at https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/CalciteQueryPlanner.java#L116

Issue Time Tracking
---
Worklog Id: (was: 312300)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (BEAM-7049) Merge multiple input to one BeamUnionRel
[ https://issues.apache.org/jira/browse/BEAM-7049?focusedWorklogId=312301&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-312301 ]

ASF GitHub Bot logged work on BEAM-7049:

Author: ASF GitHub Bot
Created on: 13/Sep/19 20:02
Start Date: 13/Sep/19 20:02
Worklog Time Spent: 10m
Work Description: amaliujia commented on pull request #9358: (WIP-BEAM-7049)Changes made to make a simple case of threeway union work
URL: https://github.com/apache/beam/pull/9358#discussion_r324347562

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/planner/BeamRuleSets.java ##

@@ -141,6 +142,7 @@ private static final List BEAM_CONVERTERS =
 ImmutableList.of(
+ UnionMergeRule.INSTANCE, //Added for three way union

Review comment:
If it still does not work, you could
1. make a fresh clone of the Beam repo
2. add UnionMergeRule
3. change `BeamCostModel.FACTORY` to `null` at https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/CalciteQueryPlanner.java#L116

Issue Time Tracking
---
Worklog Id: (was: 312301)
Time Spent: 50m (was: 40m)
[jira] [Work logged] (BEAM-7049) Merge multiple input to one BeamUnionRel
[ https://issues.apache.org/jira/browse/BEAM-7049?focusedWorklogId=312284&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-312284 ]

ASF GitHub Bot logged work on BEAM-7049:

Author: ASF GitHub Bot
Created on: 13/Sep/19 19:34
Start Date: 13/Sep/19 19:34
Worklog Time Spent: 10m
Work Description: amaliujia commented on pull request #9358: (WIP-BEAM-7049)Changes made to make a simple case of threeway union work
URL: https://github.com/apache/beam/pull/9358#discussion_r324338919

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/planner/BeamRuleSets.java ##

@@ -141,6 +142,7 @@ private static final List BEAM_CONVERTERS =
 ImmutableList.of(
+ UnionMergeRule.INSTANCE, //Added for three way union

Review comment:
Ah. The only difference between us might just be where the rule is added: I added it to LOGICAL_OPTIMIZATIONS above.

Issue Time Tracking
---
Worklog Id: (was: 312284)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (BEAM-7049) Merge multiple input to one BeamUnionRel
[ https://issues.apache.org/jira/browse/BEAM-7049?focusedWorklogId=296566&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296566 ]

ASF GitHub Bot logged work on BEAM-7049:

Author: ASF GitHub Bot
Created on: 16/Aug/19 20:51
Start Date: 16/Aug/19 20:51
Worklog Time Spent: 10m
Work Description: amaliujia commented on pull request #9358: (WIP-BEAM-7049)Changes made to make a simple case of threeway union work
URL: https://github.com/apache/beam/pull/9358#discussion_r314885625

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamSetOperatorRelBase.java ##

@@ -88,15 +92,20 @@ public BeamSetOperatorRelBase(BeamRelNode beamRelNode, OpType opType, boolean al
 leftRows.apply(
 "CreateLeftIndex",
 MapElements.via(new BeamSetOperatorsTransforms.BeamSqlRow2KvFn(
-.and(
-rightTag,
-rightRows.apply(
-"CreateRightIndex",
-MapElements.via(new BeamSetOperatorsTransforms.BeamSqlRow2KvFn(
-.apply(CoGroupByKey.create());
+.and(

Review comment:
Note that to make this general, it would need a for loop that creates one tag per input.

Issue Time Tracking
---
Worklog Id: (was: 296566)
Time Spent: 20m (was: 10m)
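The "one tag per input" idea from the review comment above can be illustrated with a plain-Java simulation. This is a sketch under stated assumptions, not the Beam implementation: it uses in-memory lists instead of PCollections, the input's index stands in for a tuple tag, and `coGroup` and `union` are hypothetical helper names. The point is that a single loop over the inputs followed by one grouping step handles any number of inputs, instead of hard-coding a left and a right tag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TaggedUnionSketch {
  /**
   * Groups rows from all inputs in a single pass. The input index plays the
   * role of a tag: counts[tag] is how often the row occurs in that input.
   */
  static Map<String, int[]> coGroup(List<List<String>> inputs) {
    Map<String, int[]> grouped = new LinkedHashMap<>();
    for (int tag = 0; tag < inputs.size(); tag++) { // one tag per input
      for (String row : inputs.get(tag)) {
        grouped.computeIfAbsent(row, k -> new int[inputs.size()])[tag]++;
      }
    }
    return grouped;
  }

  /** Emits each row once for UNION, or once per occurrence for UNION ALL. */
  static List<String> union(List<List<String>> inputs, boolean all) {
    List<String> out = new ArrayList<>();
    for (Map.Entry<String, int[]> entry : coGroup(inputs).entrySet()) {
      int copies = 1;
      if (all) {
        copies = 0;
        for (int count : entry.getValue()) {
          copies += count;
        }
      }
      for (int i = 0; i < copies; i++) {
        out.add(entry.getKey());
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<List<String>> inputs =
        Arrays.asList(
            Arrays.asList("a", "b"),
            Arrays.asList("b", "c"),
            Arrays.asList("c", "d", "a"));
    System.out.println("UNION:     " + union(inputs, false));
    System.out.println("UNION ALL: " + union(inputs, true));
  }
}
```

In a real pipeline the grouping step would be the single shuffle; the per-tag counts are what would let the same structure serve other set operators as well.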
[jira] [Work logged] (BEAM-7049) Merge multiple input to one BeamUnionRel
[ https://issues.apache.org/jira/browse/BEAM-7049?focusedWorklogId=296035&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296035 ]

ASF GitHub Bot logged work on BEAM-7049:

Author: ASF GitHub Bot
Created on: 16/Aug/19 03:31
Start Date: 16/Aug/19 03:31
Worklog Time Spent: 10m
Work Description: sridharinuog commented on pull request #9358: (WIP-BEAM-7049)Changes made to make a simple case of threeway union work
URL: https://github.com/apache/beam/pull/9358

Code changes to make a 3-way union work. This is still at a very early stage and may break the existing 2-way union. More changes to come.

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

- [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
- [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).