[ https://issues.apache.org/jira/browse/FLINK-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238958#comment-15238958 ]
Fabian Hueske commented on FLINK-3738:
--------------------------------------
Hi [~yijieshen], that's a good observation. I would suggest opening a new JIRA
for this issue. FLINK-3632 is somewhat related to this as well.
In general, it would be good to validate as early as possible, ideally when the
RelNodes are constructed. This is not always possible with the current Table
API. For instance, joins are defined by {{join()}} and join predicates are
later added with {{where()}}. At the moment, we only allow equality joins for
performance reasons, but this can only be checked after optimization, when
the DataSet program is constructed.
However, I think it should be possible to move more checks to the API level.
So, it would be good if you could open a JIRA (maybe with FLINK-3632 as a
related issue or sub-issue) to refactor the query validation.
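
To illustrate what moving a check to the API level could look like, here is a
minimal sketch in Java. It is not Flink code: {{PendingJoin}}, {{Predicate}}
and the rest are hypothetical names made up for this example, and it assumes
the join predicate is known at the moment {{where()}} is called, which is a
simplification of the real Table API.

```java
/** Hypothetical sketch; none of these types exist in Flink's Table API. */
final class EagerJoinValidationSketch {

    /** A binary comparison such as "a = b", standing in for a join predicate. */
    static final class Predicate {
        final String left;
        final String op;
        final String right;

        Predicate(String left, String op, String right) {
            this.left = left;
            this.op = op;
            this.right = right;
        }

        boolean isEquality() {
            return "=".equals(op);
        }
    }

    /** A join without a predicate yet, mirroring join() before where(). */
    static final class PendingJoin {
        private final String leftTable;
        private final String rightTable;

        PendingJoin(String leftTable, String rightTable) {
            this.leftTable = leftTable;
            this.rightTable = rightTable;
        }

        /** Rejects non-equality predicates at the API call, before any optimization. */
        String where(Predicate p) {
            if (!p.isEquality()) {
                throw new IllegalArgumentException(
                        "Only equality joins are supported, got operator: " + p.op);
            }
            return leftTable + " JOIN " + rightTable + " ON " + p.left + " = " + p.right;
        }
    }

    public static void main(String[] args) {
        PendingJoin join = new PendingJoin("orders", "customers");
        System.out.println(join.where(new Predicate("orders.cid", "=", "customers.id")));
        try {
            join.where(new Predicate("orders.cid", "<", "customers.id"));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected early: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only the timing: the error surfaces where the user
wrote the offending call, not later during translation.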
> Refactor TableEnvironment and TranslationContext
> ------------------------------------------------
>
> Key: FLINK-3738
> URL: https://issues.apache.org/jira/browse/FLINK-3738
> Project: Flink
> Issue Type: Task
> Components: Table API
> Reporter: Fabian Hueske
> Assignee: Fabian Hueske
>
> Currently the Table API uses a static object called {{TranslationContext}}
> which holds the Calcite table catalog and a Calcite planner instance.
> Whenever a {{DataSet}} or {{DataStream}} is converted into a {{Table}} or
> registered as a {{Table}} on the {{TableEnvironment}}, a new entry is added
> to the catalog. The first time a {{Table}} is added, a planner instance is
> created. The planner is used to optimize the query (defined by one or more
> Table API operations and/or one or more SQL queries) when a {{Table}} is
> converted into a {{DataSet}} or {{DataStream}}. Since a planner may only be
> used to optimize a single program, the choice of a single static object is
> problematic.
> I propose to refactor the {{TableEnvironment}} to take over the
> responsibility of holding the catalog and the planner instance.
> - A {{TableEnvironment}} holds a catalog of registered tables and a single
> planner instance.
> - A {{TableEnvironment}} will only allow translating a single {{Table}}
> (possibly composed of several Table API operations and SQL queries) into a
> {{DataSet}} or {{DataStream}}.
> - A {{TableEnvironment}} is bound to an {{ExecutionEnvironment}} or a
> {{StreamExecutionEnvironment}}. This is necessary to create data sources or
> source functions to read external tables or streams.
> - {{DataSet}} and {{DataStream}} need a reference to a {{TableEnvironment}}
> to be converted into a {{Table}}. This will prohibit implicit casts as
> currently supported for the DataSet Scala API.
> - A {{Table}} needs a reference to the {{TableEnvironment}} it is bound to.
> Only tables from the same {{TableEnvironment}} can be processed together.
> - The {{TranslationContext}} will be completely removed.
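
The ownership rules proposed above can be sketched roughly as follows. This is
an illustration under stated assumptions, not the actual refactoring: {{Env}}
and {{Tbl}} are hypothetical stand-ins for {{TableEnvironment}} and {{Table}},
the "planner" is reduced to a single-use flag, and plans are plain strings.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; Env and Tbl are not Flink classes. */
final class TableEnvironmentSketch {

    /** Stand-in for a TableEnvironment: owns the catalog and a single-use planner. */
    static final class Env {
        private final Map<String, Tbl> catalog = new HashMap<>();
        private boolean plannerUsed = false;

        /** Registers a table in this environment's own catalog (no static state). */
        Tbl register(String name) {
            if (catalog.containsKey(name)) {
                throw new IllegalArgumentException("Table already registered: " + name);
            }
            Tbl table = new Tbl(this, name);
            catalog.put(name, table);
            return table;
        }

        /** Translates one table; a second call fails because the planner is single-use. */
        String translate(Tbl table) {
            if (table.env != this) {
                throw new IllegalArgumentException("Table belongs to another environment");
            }
            if (plannerUsed) {
                throw new IllegalStateException("This environment already translated a table");
            }
            plannerUsed = true;
            return "plan(" + table.expr + ")";
        }
    }

    /** Stand-in for a Table: keeps a reference to the environment it is bound to. */
    static final class Tbl {
        final Env env;
        final String expr;

        Tbl(Env env, String expr) {
            this.env = env;
            this.expr = expr;
        }

        /** Only tables from the same environment can be processed together. */
        Tbl join(Tbl other) {
            if (other.env != this.env) {
                throw new IllegalArgumentException("Tables belong to different environments");
            }
            return new Tbl(env, "(" + expr + " JOIN " + other.expr + ")");
        }
    }

    public static void main(String[] args) {
        Env env = new Env();
        Tbl joined = env.register("a").join(env.register("b"));
        System.out.println(env.translate(joined));
    }
}
```

Keeping the catalog and the planner flag as instance fields is what removes the
need for a shared static object in this sketch.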