[ https://issues.apache.org/jira/browse/FLINK-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095875#comment-15095875 ]
Fabian Hueske commented on FLINK-3226:
--------------------------------------
Welcome on board [~chengxiang li] :-)
I think a good way to start with this issue is to have a look at [~twalthr]'s [SQL
branch|https://github.com/twalthr/flink/tree/FlinkSQL]. It translates SQL
queries into Table API PlanNodes. For this issue we need to do something that
is similar but goes a bit further in the translation.
First, we need to design the Flink DataSet RelNodes. This representation is a
bit different from the current PlanNode representation because it should be a
1-to-1 representation of the final DataSet program. This needs to be
coordinated with FLINK-3227 in case somebody picks it up before this issue is
resolved (I think [~twalthr] said he would be interested).
Second, we need to translate the Calcite RelNodes into Flink RelNodes. With
[~twalthr]'s branch, you can easily define RelNode trees from SQL queries (the
Table API will be translated into the same representation) and work on
translating them into a Flink RelNode representation.
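To make the second step more concrete, here is a minimal Java sketch of a straightforward 1-to-1 translation. It does not use the Calcite API: all class names (LogicalScan, LogicalFilter, LogicalJoin, DataSetSource, DataSetFilter, DataSetJoin) are hypothetical stand-ins, predicates are plain strings, and keys are single field indexes. It only illustrates the idea that every physical node stands for exactly one DataSet operator and carries the information that operator needs.

```java
import java.util.Objects;

final class PlanTranslationSketch {

    /** Stand-ins for optimized logical nodes (hypothetical, not Calcite classes). */
    interface LogicalNode {}

    static final class LogicalScan implements LogicalNode {
        final String table;
        LogicalScan(String table) { this.table = table; }
    }

    static final class LogicalFilter implements LogicalNode {
        final LogicalNode input;
        final String predicate; // assumption: the expression is kept as plain text
        LogicalFilter(LogicalNode input, String predicate) {
            this.input = input;
            this.predicate = predicate;
        }
    }

    static final class LogicalJoin implements LogicalNode {
        final LogicalNode left;
        final LogicalNode right;
        final int leftKey;  // field index of the join key in the left input
        final int rightKey; // field index of the join key in the right input
        LogicalJoin(LogicalNode left, LogicalNode right, int leftKey, int rightKey) {
            this.left = left;
            this.right = right;
            this.leftKey = leftKey;
            this.rightKey = rightKey;
        }
    }

    /** Physical nodes: each one stands for exactly one DataSet operator. */
    interface DataSetNode {
        String explain();
    }

    static final class DataSetSource implements DataSetNode {
        final String table;
        DataSetSource(String table) { this.table = table; }
        public String explain() { return "Source(" + table + ")"; }
    }

    static final class DataSetFilter implements DataSetNode {
        final DataSetNode input;
        final String predicate;
        DataSetFilter(DataSetNode input, String predicate) {
            this.input = input;
            this.predicate = predicate;
        }
        public String explain() {
            return "Filter[" + predicate + "](" + input.explain() + ")";
        }
    }

    static final class DataSetJoin implements DataSetNode {
        final DataSetNode left;
        final DataSetNode right;
        final int leftKey;
        final int rightKey;
        DataSetJoin(DataSetNode left, DataSetNode right, int leftKey, int rightKey) {
            this.left = left;
            this.right = right;
            this.leftKey = leftKey;
            this.rightKey = rightKey;
        }
        public String explain() {
            return "Join[" + leftKey + "=" + rightKey + "]("
                    + left.explain() + ", " + right.explain() + ")";
        }
    }

    /** Straightforward mapping: one logical node becomes one physical node. */
    static DataSetNode translate(LogicalNode node) {
        Objects.requireNonNull(node, "node");
        if (node instanceof LogicalScan) {
            return new DataSetSource(((LogicalScan) node).table);
        }
        if (node instanceof LogicalFilter) {
            LogicalFilter f = (LogicalFilter) node;
            return new DataSetFilter(translate(f.input), f.predicate);
        }
        if (node instanceof LogicalJoin) {
            LogicalJoin j = (LogicalJoin) node;
            return new DataSetJoin(translate(j.left), translate(j.right), j.leftKey, j.rightKey);
        }
        throw new IllegalArgumentException("No translation for " + node.getClass().getName());
    }
}
```

Calling PlanTranslationSketch.translate on a join of a filtered scan with a second scan returns a DataSetJoin whose inputs are a DataSetFilter and a DataSetSource, so the physical tree mirrors the DataSet program operator by operator. In the real implementation this mapping would presumably be expressed as Calcite rules rather than a recursive method.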
As I wrote before, I would like to coordinate the work on the subissues of
FLINK-3221 on a feature branch and merge the branch to the master branch once
all subissues have been resolved. Right now, we have an open PR that moves all
Table API classes ([PR #1492|https://github.com/apache/flink/pull/1492]). I
would like to hold off on forking the feature branch until that PR is merged.
> Translate optimized logical Table API plans into physical plans representing
> DataSet programs
> ---------------------------------------------------------------------------------------------
>
> Key: FLINK-3226
> URL: https://issues.apache.org/jira/browse/FLINK-3226
> Project: Flink
> Issue Type: Sub-task
> Components: Table API
> Reporter: Fabian Hueske
>
> This issue is about translating an (optimized) logical Table API query plan
> (see FLINK-3225) into a physical plan. The physical plan is a 1-to-1
> representation of the DataSet program that will be executed. This means:
> - Each Flink RelNode refers to exactly one Flink DataSet or DataStream
> operator.
> - All (join and grouping) keys of Flink operators are correctly specified.
> - The expressions which are to be executed in user-code are identified.
> - All fields are referenced with their physical execution-time index.
> - Flink type information is available.
> - Optional: Add physical execution hints for joins
> The translation should be the final part of Calcite's optimization process.
> For this task we need to:
> - implement a set of Flink DataSet RelNodes. Each RelNode corresponds to one
> Flink DataSet operator (Map, Reduce, Join, ...). The RelNodes must hold all
> relevant operator information (keys, user-code expression, strategy hints,
> parallelism).
> - implement rules to translate optimized Calcite RelNodes into Flink
> RelNodes. We start with a straightforward mapping and later add rules that
> merge several relational operators into a single Flink operator, e.g., merge
> a join followed by a filter. Timo implemented some rules for the first SQL
> implementation which can be used as a starting point.
> - integrate the translation rules into the Calcite optimization process.
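
The merge rule mentioned in the description (a join followed by a filter becoming a single operator) could look roughly like the following Java sketch. This is an illustration under assumptions, not the planned implementation: the node classes are hypothetical, the predicate is a plain string, and the rewrite is a simple bottom-up method instead of a Calcite rule. The merged join keeps the predicate so that it can be evaluated inside the join function.

```java
final class MergeFilterIntoJoinSketch {

    /** Hypothetical physical plan nodes; not Flink or Calcite classes. */
    interface Node {}

    static final class Source implements Node {
        final String name;
        Source(String name) { this.name = name; }
    }

    static final class Filter implements Node {
        final Node input;
        final String predicate;
        Filter(Node input, String predicate) {
            this.input = input;
            this.predicate = predicate;
        }
    }

    static final class Join implements Node {
        final Node left;
        final Node right;
        final int leftKey;
        final int rightKey;
        final String postJoinPredicate; // null if no filter has been merged in
        Join(Node left, Node right, int leftKey, int rightKey, String postJoinPredicate) {
            this.left = left;
            this.right = right;
            this.leftKey = leftKey;
            this.rightKey = rightKey;
            this.postJoinPredicate = postJoinPredicate;
        }
    }

    /** Bottom-up rewrite: Filter(Join(l, r)) becomes one Join that carries the predicate. */
    static Node merge(Node node) {
        if (node instanceof Filter) {
            Filter f = (Filter) node;
            Node input = merge(f.input);
            if (input instanceof Join && ((Join) input).postJoinPredicate == null) {
                Join j = (Join) input;
                return new Join(j.left, j.right, j.leftKey, j.rightKey, f.predicate);
            }
            return new Filter(input, f.predicate);
        }
        if (node instanceof Join) {
            Join j = (Join) node;
            return new Join(merge(j.left), merge(j.right),
                    j.leftKey, j.rightKey, j.postJoinPredicate);
        }
        return node; // sources have nothing to merge
    }

    /** Counts plan nodes, i.e. the number of operators the program would contain. */
    static int countOperators(Node node) {
        if (node instanceof Filter) {
            return 1 + countOperators(((Filter) node).input);
        }
        if (node instanceof Join) {
            Join j = (Join) node;
            return 1 + countOperators(j.left) + countOperators(j.right);
        }
        return 1;
    }
}
```

In this sketch a filter directly above a join disappears from the plan, which saves one operator, while a filter below a join is left untouched because it applies to only one input.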