[
https://issues.apache.org/jira/browse/FLINK-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881737#comment-15881737
]
Kurt Young commented on FLINK-5859:
-----------------------------------
Hi [~fhueske],
How about this approach:
We provide both {{FilterableTableSource}} and {{PartitionableTableSource}}:
keep {{FilterableTableSource}} as it is, and add methods like
{{getAllPartitions}} and {{applyPartitionPruning}} to
{{PartitionableTableSource}}. From a developer's point of view, these two
traits can be treated as completely independent. It will be easier for a
developer to implement each functionality separately than to mix all the
logic into {{FilterableTableSource.setPredicate()}}. Also, I think it is very
likely that in the future the framework will apply these two traits in
different optimization stages: partition pruning as early as possible during
logical optimization, and filter pushdown a little later, because it needs
some heavyweight physical-level analysis first.
BTW, this approach still supports what you suggested: you can implement only
{{FilterableTableSource}} and do all the pruning and filtering there if you
like.
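To make the separation concrete, here is a rough sketch of how the two traits could sit side by side. Only the names {{FilterableTableSource}}, {{PartitionableTableSource}}, {{setPredicate}}, {{getAllPartitions}} and {{applyPartitionPruning}} come from the proposal above. The signatures, the {{InMemorySource}} class, the {{scan}} method and the {{PruningDemo}} helper are assumptions made up for illustration and are not Flink's actual API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical, simplified stand-in: the signature is an assumption.
interface FilterableTableSource<T> {
    // Returns a copy of the source that applies the pushed-down predicate.
    FilterableTableSource<T> setPredicate(Predicate<T> predicate);
}

// Hypothetical, simplified stand-in: the signatures are assumptions.
interface PartitionableTableSource {
    // Lists every partition the source knows about.
    List<String> getAllPartitions();

    // Returns a copy of the source that reads only the given partitions.
    PartitionableTableSource applyPartitionPruning(List<String> remainingPartitions);
}

// Toy source that implements both traits independently of each other.
final class InMemorySource
        implements FilterableTableSource<Integer>, PartitionableTableSource {
    private final Map<String, List<Integer>> data; // partition name -> rows
    private final List<String> partitions;         // partitions left to read
    private final Predicate<Integer> predicate;    // pushed-down row filter

    InMemorySource(Map<String, List<Integer>> data) {
        this(data, new ArrayList<>(data.keySet()), v -> true);
    }

    private InMemorySource(Map<String, List<Integer>> data,
                           List<String> partitions,
                           Predicate<Integer> predicate) {
        this.data = data;
        this.partitions = partitions;
        this.predicate = predicate;
    }

    @Override
    public List<String> getAllPartitions() {
        return new ArrayList<>(data.keySet());
    }

    // Partition pruning touches only the partition list.
    @Override
    public InMemorySource applyPartitionPruning(List<String> remainingPartitions) {
        return new InMemorySource(data, new ArrayList<>(remainingPartitions), predicate);
    }

    // Filter pushdown touches only the row predicate.
    @Override
    public InMemorySource setPredicate(Predicate<Integer> newPredicate) {
        return new InMemorySource(data, partitions, newPredicate);
    }

    // Reads the remaining partitions and keeps rows that pass the predicate.
    List<Integer> scan() {
        List<Integer> out = new ArrayList<>();
        for (String partition : partitions) {
            for (Integer value : data.getOrDefault(partition, Collections.emptyList())) {
                if (predicate.test(value)) {
                    out.add(value);
                }
            }
        }
        return out;
    }
}

// Mimics the two optimization stages: prune partitions first, push the filter later.
final class PruningDemo {
    static List<Integer> run() {
        Map<String, List<Integer>> data = new LinkedHashMap<>();
        data.put("dt=2017-02-01", Arrays.asList(1, 5, 9));
        data.put("dt=2017-02-02", Arrays.asList(2, 6, 10));

        InMemorySource source = new InMemorySource(data);
        InMemorySource pruned =
                source.applyPartitionPruning(Arrays.asList("dt=2017-02-02"));
        InMemorySource filtered = pruned.setPredicate(v -> v > 5);
        return filtered.scan();
    }
}
```

In this sketch neither method needs to know about the other, so a source could implement just one of the two traits, and the order in which the framework calls them is up to the optimizer.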
> support partition pruning on Table API & SQL
> --------------------------------------------
>
> Key: FLINK-5859
> URL: https://issues.apache.org/jira/browse/FLINK-5859
> Project: Flink
> Issue Type: New Feature
> Components: Table API & SQL
> Reporter: godfrey he
> Assignee: godfrey he
>
> Many data sources are backed by partitionable storage, e.g. HDFS, Druid, and
> many queries need to read only a small subset of the total data. We can use
> partition information to prune or skip over files irrelevant to the user’s
> queries. Both query optimization time and execution time can be reduced
> significantly, especially for a large partitioned table.