[
https://issues.apache.org/jira/browse/FLINK-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15145947#comment-15145947
]
ASF GitHub Bot commented on FLINK-3226:
---------------------------------------
Github user twalthr commented on a diff in the pull request:
https://github.com/apache/flink/pull/1632#discussion_r52827003
--- Diff:
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/rules/dataset/DataSetJoinRule.scala
---
@@ -39,18 +49,80 @@ class DataSetJoinRule
val convLeft: RelNode = RelOptRule.convert(join.getInput(0),
DataSetConvention.INSTANCE)
val convRight: RelNode = RelOptRule.convert(join.getInput(1),
DataSetConvention.INSTANCE)
- new DataSetJoin(
- rel.getCluster,
- traitSet,
- convLeft,
- convRight,
- rel.getRowType,
- join.toString,
- Array[Int](),
- Array[Int](),
- JoinType.INNER,
- null,
- null)
+ // get the equality keys
+ val joinInfo = join.analyzeCondition
+ val keyPairs = joinInfo.pairs
+
+ if (keyPairs.isEmpty) { // if no equality keys => not supported
+ throw new TableException("Joins should have at least one equality
condition")
+ }
+ else { // at least one equality expression => generate a join function
+ val conditionType = join.getCondition.getType
+ val func = getJoinFunction(join, joinInfo)
+ val leftKeys = ArrayBuffer.empty[Int]
+ val rightKeys = ArrayBuffer.empty[Int]
+
+ keyPairs.foreach(pair => {
+ leftKeys.add(pair.source)
+ rightKeys.add(pair.target)}
+ )
+
+ new DataSetJoin(
+ rel.getCluster,
+ traitSet,
+ convLeft,
+ convRight,
+ rel.getRowType,
+ join.toString,
+ leftKeys.toArray,
+ rightKeys.toArray,
+ JoinType.INNER,
+ null,
+ func)
+ }
+ }
+
+ def getJoinFunction(join: FlinkJoin, joinInfo: JoinInfo):
+ ((TableConfig, TypeInformation[Any], TypeInformation[Any],
TypeInformation[Any]) =>
+ FlatJoinFunction[Any, Any, Any]) = {
+
+ if (joinInfo.isEqui) {
+ // only equality condition => no join function necessary
+ null
--- End diff --
In general, `null` is not very welcome in Scala. Could you return a
`FlatJoinFunction` (containing only a `ConverterResultExpression`) here too? We
can then get rid of the `Tuple2RowMapper` and support any type as output type
of the join operation.
> Translate optimized logical Table API plans into physical plans representing
> DataSet programs
> ---------------------------------------------------------------------------------------------
>
> Key: FLINK-3226
> URL: https://issues.apache.org/jira/browse/FLINK-3226
> Project: Flink
> Issue Type: Sub-task
> Components: Table API
> Reporter: Fabian Hueske
> Assignee: Chengxiang Li
>
> This issue is about translating an (optimized) logical Table API (see
> FLINK-3225) query plan into a physical plan. The physical plan is a 1-to-1
> representation of the DataSet program that will be executed. This means:
> - Each Flink RelNode refers to exactly one Flink DataSet or DataStream
> operator.
> - All (join and grouping) keys of Flink operators are correctly specified.
> - The expressions which are to be executed in user-code are identified.
> - All fields are referenced with their physical execution-time index.
> - Flink type information is available.
> - Optional: Add physical execution hints for joins
> The translation should be the final part of Calcite's optimization process.
> For this task we need to:
> - implement a set of Flink DataSet RelNodes. Each RelNode corresponds to one
> Flink DataSet operator (Map, Reduce, Join, ...). The RelNodes must hold all
> relevant operator information (keys, user-code expression, strategy hints,
> parallelism).
> - implement rules to translate optimized Calcite RelNodes into Flink
> RelNodes. We start with a straight-forward mapping and later add rules that
> merge several relational operators into a single Flink operator, e.g., merge
> a join followed by a filter. Timo implemented some rules for the first SQL
> implementation which can be used as a starting point.
> - Integrate the translation rules into the Calcite optimization process
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)