[
https://issues.apache.org/jira/browse/FLINK-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15356206#comment-15356206
]
ASF GitHub Bot commented on FLINK-3943:
---------------------------------------
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/2169#discussion_r69048813
--- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/api/scala/batch/table/SetOperationsITCase.scala ---
@@ -139,4 +154,105 @@ class UnionITCase(
// Must fail. Tables are bound to different TableEnvironments.
ds1.unionAll(ds2).select('c)
}
+
+ @Test
+ def testSetMinusAll(): Unit = {
+ val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment
+ val tEnv = TableEnvironment.getTableEnvironment(env, config)
+
+ val ds1 = CollectionDataSets.getSmall3TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c)
+ val ds2 = CollectionDataSets.getOneElement3TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c)
+
+ val minusDs = ds1.minusAll(ds2).select('c)
+
+ val results = minusDs.toDataSet[Row].collect()
+ val expected = "Hello\n" + "Hello world\n"
+ TestBaseUtils.compareResultAsText(results.asJava, expected)
+ }
+
+ @Test
+ def testSetMinusAllWithDuplicates(): Unit = {
+ val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment
+ val tEnv = TableEnvironment.getTableEnvironment(env, config)
+
+ val ds1 = CollectionDataSets.getSmall3TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c)
+ val ds2 = CollectionDataSets.getSmall3TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c)
+ val ds3 = CollectionDataSets.getOneElement3TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c)
+
+ val minusDs = ds1.unionAll(ds2).minusAll(ds3).select('c)
+
+ val results = minusDs.toDataSet[Row].collect()
+ val expected = "Hello\n" + "Hello world\n" +
+ "Hello\n" + "Hello world\n"
+ TestBaseUtils.compareResultAsText(results.asJava, expected)
+ }
+
+ @Test
+ def testSetMinus(): Unit = {
--- End diff --
Can you combine this and the next test by using test data that covers both
cases for different records, i.e., some records with duplicates in the first
data set, some in the second, some in neither, and some in both?
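One way to lay out such combined test data is sketched below in plain Scala
collections, without Flink. The object name, the record labels, and the two
helper functions are hypothetical and exist only for illustration. The sketch
assumes SQL-standard semantics (EXCEPT ALL keeps a record max(0, m - n) times,
EXCEPT keeps distinct records absent from the second input); the behaviour of
`minusAll` in this pull request may differ.

```scala
// Hypothetical test data: each label says where the record is duplicated.
object MinusTestData {
  val first: Seq[String] = Seq(
    "dupInFirst", "dupInFirst",            // twice here, once in second
    "dupInSecond",                         // once here, twice in second
    "noDup",                               // once here, absent from second
    "dupInBoth", "dupInBoth", "dupInBoth") // three times here, twice in second

  val second: Seq[String] = Seq(
    "dupInFirst",
    "dupInSecond", "dupInSecond",
    "dupInBoth", "dupInBoth")

  // Assumed EXCEPT ALL semantics: multiset difference.
  // Seq.diff removes one occurrence per occurrence in the argument.
  def exceptAll[T](l: Seq[T], r: Seq[T]): Seq[T] = l.diff(r)

  // Assumed EXCEPT semantics: distinct records of l that never occur in r.
  def except[T](l: Seq[T], r: Seq[T]): Seq[T] = {
    val inSecond = r.toSet
    l.distinct.filterNot(inSecond.contains)
  }
}
```

With a single pair of inputs like this, one test can check both operators,
since every record falls into exactly one of the four duplicate cases.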
> Add support for EXCEPT (set minus)
> ----------------------------------
>
> Key: FLINK-3943
> URL: https://issues.apache.org/jira/browse/FLINK-3943
> Project: Flink
> Issue Type: New Feature
> Components: Table API & SQL
> Affects Versions: 1.1.0
> Reporter: Fabian Hueske
> Assignee: Ivan Mushketyk
> Priority: Minor
>
> Currently, the Table API and SQL do not support EXCEPT.
> EXCEPT can be executed as a coGroup on all fields that forwards records of
> the first input if the second input is empty.
> In order to add support for EXCEPT to the Table API and SQL we need to:
> - Implement a {{DataSetMinus}} class that translates an EXCEPT into a DataSet
> API program using a coGroup on all fields.
> - Implement a {{DataSetMinusRule}} that translates a Calcite {{LogicalMinus}}
> into a {{DataSetMinus}}.
> - Extend the Table API (and validation phase) to provide an except() method.
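The coGroup idea from the issue description can be sketched with plain Scala
collections. This is not the Flink DataSet API and not the code of the pull
request: `CoGroupExcept`, `coGroup`, and `except` are hypothetical names, and
the sketch assumes distinct EXCEPT semantics, so it forwards a single record
per group.

```scala
object CoGroupExcept {
  // Stand-in for a coGroup on all fields: for every distinct record,
  // pair the group of equal records from each input (possibly empty).
  def coGroup[T](first: Seq[T], second: Seq[T]): Seq[(Seq[T], Seq[T])] = {
    val left = first.groupBy(identity)
    val right = second.groupBy(identity)
    (left.keySet ++ right.keySet).toSeq.map { key =>
      (left.getOrElse(key, Seq.empty[T]), right.getOrElse(key, Seq.empty[T]))
    }
  }

  // Forward a record of the first input only if the second group is empty.
  // The order of the result is not defined, as with an unordered data set.
  def except[T](first: Seq[T], second: Seq[T]): Seq[T] =
    coGroup(first, second).collect {
      case (l, r) if l.nonEmpty && r.isEmpty => l.head
    }
}
```

Forwarding `l.head` drops duplicates within the first input. A variant with
ALL semantics would instead have to emit records according to the sizes of
both groups.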