[
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15045471#comment-15045471
]
ASF GitHub Bot commented on FLINK-2828:
---------------------------------------
Github user twalthr commented on the pull request:
https://github.com/apache/flink/pull/1237#issuecomment-162622155
So basically what you want to say is: throw your entire code away.
I'm disappointed that you are questioning the whole concept only now, after I
have rebased and reworked the code multiple times. I published a prototype in
mid-September (#1127) precisely to find out whether the concept is fine. I don't
have the resources or the motivation to start from scratch.
Yes, you are right, your solution is the nicer one. However, the question
is how much optimizer logic the Table API should implement itself. I actually
wanted to discuss the following under FLINK-2099, but we can also discuss it
here. If we put a more fully developed library such as Calcite on top of the
Table API, optimizer logic in the Table API will become unnecessary in the near
future. In the long term it might make sense to convert the Table API DSL into
Calcite nodes, optimize those, and use the Table API plan nodes only as the
physical plan nodes without further optimization. In that case the only missing
piece is predicate/projection pushdown from the Table API tree into the input
formats; Calcite could do the pushdown for us.
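To make that concrete, here is a minimal sketch of the kind of pushdown
contract a source would need. All interface and method names below
(FilterableTableSource, ProjectableTableSource, applyPredicates, projectFields)
are assumptions invented for this comment, not existing Flink or Calcite API;
the point is only that the source answers "which of these predicates/fields can
you handle?", while the decision of what to offer can live in the Table API
today and in Calcite later.
{code}
// Purely hypothetical sketch: these interfaces and names are invented for
// illustration and are not part of the Flink code base discussed here.
import java.util.List;

/** A source that exposes a fixed output schema to the Table API. */
interface TableSource {
    String[] fieldNames();
    Class<?>[] fieldTypes();
}

/** Optional mix-in: the optimizer (Table API or Calcite) offers predicates. */
interface FilterableTableSource extends TableSource {
    /**
     * Receives candidate predicates such as "a === 5 || a === 6" and returns
     * the ones it could NOT apply; those stay in the plan as a normal filter.
     */
    List<String> applyPredicates(List<String> predicates);
}

/** Optional mix-in: the optimizer narrows the set of fields that are read. */
interface ProjectableTableSource extends TableSource {
    /** Returns a copy of this source that only reads the given field indexes. */
    ProjectableTableSource projectFields(int[] fieldIndexes);
}
{code}
Whoever drives the optimization, the contract towards the input format stays
the same.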
> Add interfaces for Table API input formats
> ------------------------------------------
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
> Issue Type: New Feature
> Components: Table API
> Reporter: Timo Walther
> Assignee: Timo Walther
>
> In order to support input formats for the Table API, interfaces are
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the
> plan. Although the output schema stays the same, the TableSource can react to
> field resolution and/or predicates internally and can return adapted
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional
> input formats without much implementation effort (e.g. for fromCsvFile()).
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment
> (a common superclass of all ExecutionEnvironments for DataSets and
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
> .filter("a===5 || a===6")
> .select("a as a4, b as b4, c as c4")
> .filter("b4===7")
> .join(getTableJava(tableSource2))
> .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> // List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> // List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, false = selection):
> // Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, false = selection):
> // Set(("a", true), ("c", true))
> {code}
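As a purely illustrative sketch of the two TableSource flavors described in the
issue (interface and method names here are invented, not the ones from #1237):
a static source only exposes its schema and a translate step, while an adaptive
source is additionally notified about resolved fields and predicates and may
return a specialized DataSet during translation.
{code}
// Hypothetical Java sketch of the two proposed flavors; names and signatures
// are illustrative only, see PR #1237 for the actual interfaces.
import java.util.List;

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

/** Common contract: a fixed output schema. */
interface TableSource {
    String[] fieldNames();
    Class<?>[] fieldTypes();
}

/** Static flavor: wraps an input format with minimal implementation effort. */
interface StaticTableSource extends TableSource {
    /** Builds the DataSet in the translate step, ignoring the plan. */
    DataSet<?> translate(ExecutionEnvironment env);
}

/** Adaptive flavor: reacts to what the plan actually needs. */
interface AdaptiveTableSource extends TableSource {
    /** One call per resolved field; true = used in a predicate (filtering),
     *  false = used in a selection, matching the example output above. */
    void notifyResolvedField(String field, boolean filtering);

    /** Predicates that could be pushed to this source, e.g. "a===5 || a===6". */
    void notifyPredicates(List<String> predicates);

    /** May return an adapted DataSet (fewer read fields, pre-filtered rows);
     *  the declared output schema stays the same. */
    DataSet<?> translate(ExecutionEnvironment env);
}
{code}
For the query in the example above, tableSource1 would then see
notifyResolvedField calls for ("a", true), ("a", false), ("b", true),
("b", false), ("c", false), ("c", true) plus the three listed predicates, while
tableSource2 only sees ("a", true), ("c", true) and "c==='Test'".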