[ https://issues.apache.org/jira/browse/FLINK-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15299795#comment-15299795 ]
ASF GitHub Bot commented on FLINK-3941:
---------------------------------------

Github user yjshen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2025#discussion_r64547780

    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/dataset/DataSetUnion.scala ---
    @@ -69,16 +73,23 @@ class DataSetUnion(
           rows + metadata.getRowCount(child)
         }
    -    planner.getCostFactory.makeCost(rowCnt, 0, 0)
    +    planner.getCostFactory.makeCost(
    +      rowCnt,
    +      if (all) 0 else rowCnt,
    +      if (all) 0 else rowCnt)
       }
       override def translateToPlan(
           tableEnv: BatchTableEnvironment,
           expectedType: Option[TypeInformation[Any]]): DataSet[Any] = {
    -    val leftDataSet = left.asInstanceOf[DataSetRel].translateToPlan(tableEnv)
    -    val rightDataSet = right.asInstanceOf[DataSetRel].translateToPlan(tableEnv)
    -    leftDataSet.union(rightDataSet).asInstanceOf[DataSet[Any]]
    +    val leftDataSet = left.asInstanceOf[DataSetRel].translateToPlan(tableEnv, expectedType)
    +    val rightDataSet = right.asInstanceOf[DataSetRel].translateToPlan(tableEnv, expectedType)
    +    if (all) {
    +      leftDataSet.union(rightDataSet).asInstanceOf[DataSet[Any]]
    +    } else {
    +      leftDataSet.union(rightDataSet).distinct().asInstanceOf[DataSet[Any]]
    --- End diff --

    I have a question here about computeSelfCost: when is this method called? Is it possible that the union's computeSelfCost is used first, before UnionToDistinctRule is applied? I was trying to understand this last night but couldn't figure it out.

> Add support for UNION (with duplicate elimination)
> --------------------------------------------------
>
>                 Key: FLINK-3941
>                 URL: https://issues.apache.org/jira/browse/FLINK-3941
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API
>    Affects Versions: 1.1.0
>            Reporter: Fabian Hueske
>            Assignee: Yijie Shen
>            Priority: Minor
>
> Currently, only UNION ALL is supported by Table API and SQL.
> UNION (with duplicate elimination) can be supported by applying a
> {{DataSet.distinct()}} after the union on all fields. This issue includes:
> - Extending {{DataSetUnion}}
> - Relaxing {{DataSetUnionRule}} to translate non-all unions.
> - Extending the Table API with a union() method.
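
For illustration only, the approach described in the issue (union, then distinct over all fields) and the cost expression from the diff can be sketched with plain Scala collections. This is a minimal sketch, not Flink code: `Cost`, `UnionSketch`, `union`, and `selfCost` are hypothetical names, `Seq` stands in for `DataSet`, and the `Cost` triple only mirrors the three arguments passed to `makeCost` in the diff.

```scala
// Hypothetical stand-in for the three values passed to makeCost in the diff
// (row count, CPU, IO). Not a Calcite or Flink type.
final case class Cost(rows: Double, cpu: Double, io: Double)

object UnionSketch {
  // UNION ALL keeps duplicates; UNION removes them by applying distinct
  // to the combined result, as the issue description proposes.
  def union[T](left: Seq[T], right: Seq[T], all: Boolean): Seq[T] = {
    val combined = left ++ right
    if (all) combined else combined.distinct
  }

  // Mirrors the cost expression in the diff: a non-all union is charged
  // rowCnt for CPU and IO, while UNION ALL is charged 0 for both.
  def selfCost(inputRowCounts: Seq[Double], all: Boolean): Cost = {
    val rowCnt = inputRowCounts.sum
    val extra = if (all) 0.0 else rowCnt
    Cost(rowCnt, extra, extra)
  }
}
```

With inputs `Seq(1, 2, 2)` and `Seq(2, 3)`, `UnionSketch.union(..., all = true)` keeps all five elements, while `all = false` yields `Seq(1, 2, 3)`. The sketch says nothing about when the planner actually calls `computeSelfCost`, which is the open question in the review comment.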