Kurt Young created FLINK-6066:
---------------------------------

             Summary: Separate logical and physical RelNode layer in Flink
                 Key: FLINK-6066
                 URL: https://issues.apache.org/jira/browse/FLINK-6066
             Project: Flink
          Issue Type: Sub-task
          Components: Table API & SQL
            Reporter: Kurt Young


Currently flink-table contains two layer of RelNodes to work with Calcite. One 
is actually from Calcite itself, such as TableScan, Project, Filter and so on. 
Then depends on what environment we are using, the RelNode translate to 
DataSetXXX or DataStreamXXX, like DataSetScan or DataStreamAggregate. All the 
optimization rules happened in the phase in a cost base manner. 

I suppose to further separate the second layer into two, one is more logical 
just like Calcite, and the other one is more physical. In the logical layer, we 
can do lots of optimization without real statistics involved, like partition 
pruning, projection pushdown. And we may even use rule-based optimization for 
logical optimize. In physical optimize phase, we then introduce some real 
statistics and to choose what ever physical strategy we want to use in a cost 
base manner, like join strategy selection or join reorder. 

Since the complexity for cost base optimization grows exponentially when the 
plan is complex. By separating the optimization can make it more efficient and 
easier to maintain.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to