[
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946807#comment-14946807
]
ASF GitHub Bot commented on FLINK-2828:
---------------------------------------
GitHub user twalthr opened a pull request:
https://github.com/apache/flink/pull/1237
[FLINK-2828] [table] Add interfaces for Table API input formats
This PR implements TableSources as described in FLINK-2828. It splits
this work out of #1127 (FLINK-2167). All feedback I got from @aljoscha
has been addressed.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/twalthr/flink TableApiInputs
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/1237.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1237
----
commit 67a0ac4af4dc3a063e699e6eed24a2c7177eb31e
Author: twalthr <[email protected]>
Date: 2015-07-09T09:57:05Z
[FLINK-2828] [table] Add interfaces for Table API input formats
----
> Add interfaces for Table API input formats
> ------------------------------------------
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
> Issue Type: New Feature
> Components: Table API
> Reporter: Timo Walther
> Assignee: Timo Walther
>
> In order to support input formats for the Table API, interfaces are
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the
> plan. Although the output schema stays the same, the TableSource can react to
> field resolution and/or predicates internally and can return adapted
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional
> input formats without much implementation effort (e.g. for fromCsvFile()).
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment
> (common super class of all ExecutionEnvironments for DataSets and
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
> .filter("a===5 || a===6")
> .select("a as a4, b as b4, c as c4")
> .filter("b4===7")
> .join(getTableJava(tableSource2))
> .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> // List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> // List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, false = selection):
> // Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, false = selection):
> // Set(("a", true), ("c", true))
> {code}
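In the example above, the predicates reported to tableSource1 use the source's own field names, even though the query filters on the aliases b4 and c4. The following is a minimal sketch of that renaming step, assuming predicates are plain strings and the aliases come from a single select. The class PredicateResolver and its resolve method are hypothetical names chosen for illustration and are not part of the PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flink: rewrites aliased field names in a
// predicate string back to the field names of the underlying source.
final class PredicateResolver {

    // Matches either a single-quoted literal or an identifier.
    // Literals are matched first so their contents are never renamed.
    private static final Pattern TOKEN =
            Pattern.compile("'[^']*'|[A-Za-z_][A-Za-z0-9_]*");

    static String resolve(String predicate, Map<String, String> aliasToField) {
        Matcher m = TOKEN.matcher(predicate);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String token = m.group();
            // Keep string literals as-is; map identifiers through the alias table.
            String replacement = token.startsWith("'")
                    ? token
                    : aliasToField.getOrDefault(token, token);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Aliases introduced by select("a as a4, b as b4, c as c4").
        Map<String, String> aliases = new HashMap<>();
        aliases.put("a4", "a");
        aliases.put("b4", "b");
        aliases.put("c4", "c");

        System.out.println(resolve("b4===7", aliases));
        System.out.println(resolve("c4==='Test2'", aliases));
    }
}
```

A real implementation would presumably work on the Table API's expression tree rather than on strings. The string form is used here only because it matches how the predicates are written in the example.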
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)