Fabian Hueske created FLINK-8950:
------------------------------------

             Summary: "Materialize" Tables to avoid recomputation.
                 Key: FLINK-8950
                 URL: https://issues.apache.org/jira/browse/FLINK-8950
             Project: Flink
          Issue Type: New Feature
          Components: Table API & SQL
    Affects Versions: 1.5.0
            Reporter: Fabian Hueske


Currently, {{Table}} objects of the Table API / SQL are treated like virtual
views, i.e., all relational operators that have been applied to them are
recorded and only translated when a {{Table}} is emitted to a {{TableSink}} or
converted into a {{DataSet}} or {{DataStream}}.

If a {{Table}} is accessed twice, the (sub-)query that it represents is
translated twice into a {{DataSet}} or {{DataStream}} program and hence also
executed twice, which is inefficient. Currently, the only way to avoid this is
to convert the {{Table}} into a {{DataSet}} or {{DataStream}}, which causes the
optimizer to generate a plan, and then to register the result back as a new
{{Table}}.

We should offer a method to internally "materialize" a {{Table}} object, i.e.,
to optimize it, generate a plan, and register the plan as an internal table. All
queries / operations that are evaluated on the materialized {{Table}} will
start from the same {{DataSet}} or {{DataStream}}, so that the shared part is
not computed multiple times.
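
To illustrate the intended semantics, here is a minimal, self-contained sketch in plain Java. It does not use Flink: {{ToyTable}}, {{fromSource}}, {{map}}, {{emit}}, and {{materialize}} are hypothetical names invented for this example, and the real API may look quite different. The toy table records its operators lazily (like a virtual view), so every {{emit}} re-runs the whole plan, while {{materialize}} evaluates the plan at most once and lets all downstream queries start from that shared result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

final class MaterializeSketch {

    /** Toy stand-in for a Table (hypothetical, NOT Flink's class): a lazily evaluated plan. */
    static final class ToyTable<T> {
        private final Supplier<List<T>> plan;

        private ToyTable(Supplier<List<T>> plan) {
            this.plan = plan;
        }

        static <T> ToyTable<T> fromSource(Supplier<List<T>> source) {
            return new ToyTable<>(source);
        }

        /** Records an operator; nothing is computed until emit() is called. */
        <R> ToyTable<R> map(Function<T, R> fn) {
            return new ToyTable<>(
                () -> plan.get().stream().map(fn).collect(Collectors.toList()));
        }

        /** "Translates and executes" the full recorded plan, like emitting to a sink. */
        List<T> emit() {
            return plan.get();
        }

        /** Hypothetical materialize(): the upstream plan runs at most once and is then reused. */
        ToyTable<T> materialize() {
            final Supplier<List<T>> upstream = plan;
            return new ToyTable<>(new Supplier<List<T>>() {
                private List<T> cached;

                @Override
                public synchronized List<T> get() {
                    if (cached == null) {
                        cached = upstream.get();
                    }
                    return cached;
                }
            });
        }
    }

    /** Emits two queries built on one shared table and counts how often the source is read. */
    static int countSourceReads(boolean materialize) {
        AtomicInteger sourceReads = new AtomicInteger();
        ToyTable<Integer> source = ToyTable.fromSource(() -> {
            sourceReads.incrementAndGet();
            return Arrays.asList(1, 2, 3);
        });

        ToyTable<Integer> shared = source.map(x -> x * 10);
        if (materialize) {
            shared = shared.materialize();
        }

        // Two independent queries that both start from the shared table.
        shared.map(x -> x + 1).emit();
        shared.map(x -> x + 2).emit();
        return sourceReads.get();
    }

    public static void main(String[] args) {
        System.out.println("virtual view, source reads: " + countSourceReads(false));
        System.out.println("materialized, source reads: " + countSourceReads(true));
    }
}
```

In this toy model the shared sub-plan is evaluated once per emitted query without {{materialize}} and only once with it. In Flink the shared result would be a {{DataSet}} or {{DataStream}} rather than an in-memory list, so this sketch only mirrors the sharing behavior, not the actual translation.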



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
