[ https://issues.apache.org/jira/browse/FLINK-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15299736#comment-15299736 ]
ASF GitHub Bot commented on FLINK-3941:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2025#discussion_r64541694

    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/dataset/DataSetUnion.scala ---
    @@ -69,16 +73,23 @@ class DataSetUnion(
           rows + metadata.getRowCount(child)
         }
    -    planner.getCostFactory.makeCost(rowCnt, 0, 0)
    +    planner.getCostFactory.makeCost(
    +      rowCnt,
    +      if (all) 0 else rowCnt,
    +      if (all) 0 else rowCnt)
       }

       override def translateToPlan(
           tableEnv: BatchTableEnvironment,
           expectedType: Option[TypeInformation[Any]]): DataSet[Any] = {
    -    val leftDataSet = left.asInstanceOf[DataSetRel].translateToPlan(tableEnv)
    -    val rightDataSet = right.asInstanceOf[DataSetRel].translateToPlan(tableEnv)
    -    leftDataSet.union(rightDataSet).asInstanceOf[DataSet[Any]]
    +    val leftDataSet = left.asInstanceOf[DataSetRel].translateToPlan(tableEnv, expectedType)
    +    val rightDataSet = right.asInstanceOf[DataSetRel].translateToPlan(tableEnv, expectedType)
    +    if (all) {
    +      leftDataSet.union(rightDataSet).asInstanceOf[DataSet[Any]]
    +    } else {
    +      leftDataSet.union(rightDataSet).distinct().asInstanceOf[DataSet[Any]]
    --- End diff --

    Oh, yes. Completely forgot about that rule... 😊
    So, we already supported the non-all union for SQL. Only the Table API was missing the `union()` method.

    I think there are two ways to continue:
    - remove the `UnionToDistinctRule` from `FlinkRuleSets`, or
    - revert the changes to `DataSetUnion` (except for pushing down the `expectedType`) and `DataSetUnionRule`.

    I am fine either way.

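For readers following the thread, the difference between the two union variants and the cost shape proposed in the diff can be sketched with plain Scala collections. This is only an illustration, not the Flink implementation: `UnionSketch`, `Row`, `unionAll`, `union`, and `cost` are hypothetical names, and `Seq` stands in for `DataSet`. The cost triple merely mirrors the three arguments passed to `makeCost` in the diff.

```scala
// Sketch only: plain Scala collections stand in for Flink DataSets.
object UnionSketch {
  // Hypothetical stand-in for a table row: a sequence of field values.
  type Row = Seq[Any]

  // UNION ALL: concatenate both inputs and keep duplicates.
  def unionAll(left: Seq[Row], right: Seq[Row]): Seq[Row] =
    left ++ right

  // UNION: concatenate, then eliminate duplicates over all fields.
  def union(left: Seq[Row], right: Seq[Row]): Seq[Row] =
    (left ++ right).distinct

  // Three-part cost mirroring the argument shape of the makeCost call
  // in the diff: the last two components are zero for UNION ALL and
  // equal to the row count when duplicates must be eliminated.
  def cost(rowCnt: Double, all: Boolean): (Double, Double, Double) =
    if (all) (rowCnt, 0.0, 0.0) else (rowCnt, rowCnt, rowCnt)

  def main(args: Array[String]): Unit = {
    val left: Seq[Row] = Seq(Seq(1, "a"), Seq(2, "b"))
    val right: Seq[Row] = Seq(Seq(2, "b"), Seq(3, "c"))

    println(unionAll(left, right).size) // row (2, "b") appears twice
    println(union(left, right).size)    // row (2, "b") appears once
  }
}
```

In this sketch the duplicate elimination is a separate step applied after the concatenation, which is the same decomposition the comment attributes to `UnionToDistinctRule`: a non-all union becomes a union all followed by a distinct.
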
> Add support for UNION (with duplicate elimination)
> --------------------------------------------------
>
>                 Key: FLINK-3941
>                 URL: https://issues.apache.org/jira/browse/FLINK-3941
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API
>    Affects Versions: 1.1.0
>            Reporter: Fabian Hueske
>            Assignee: Yijie Shen
>            Priority: Minor
>
> Currently, only UNION ALL is supported by Table API and SQL.
> UNION (with duplicate elimination) can be supported by applying a
> {{DataSet.distinct()}} after the union on all fields. This issue includes:
> - Extending {{DataSetUnion}}
> - Relaxing {{DataSetUnionRule}} to translate non-all unions.
> - Extending the Table API with a union() method.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)