[jira] [Commented] (FLINK-5859) support partition pruning on Table API & SQL

Kurt Young (JIRA) Thu, 23 Feb 2017 17:34:09 -0800

    [ 
https://issues.apache.org/jira/browse/FLINK-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881737#comment-15881737
 ]


Kurt Young commented on FLINK-5859:
-----------------------------------

Hi [~fhueske],

How about this approach:
We provide both {{FilterableTableSource}} and {{PartitionableTableSource}}, 
keep {{FilterableTableSource}} as it is, and add methods like 
{{getAllPartitions}} and {{applyPartitionPruning}} to 
{{PartitionableTableSource}}. From a developer's point of view, the two traits 
are completely independent. It will be easier for a developer to implement 
each functionality separately than to mix all the logic into 
{{FilterableTableSource.setPredicate()}}. Also, I think it is very likely that 
in the future the framework will apply these two traits in different 
optimization stages: partition pruning as early as possible during logical 
optimization, and filter pushdown a little later, because it needs some 
heavyweight physical-level analysis first. 
BTW, this approach still allows what you suggested: you can implement only 
{{FilterableTableSource}} and do all the pruning and filtering there if you 
like. 
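
To make the proposal concrete, here is a minimal Java sketch of the two independent traits. Only the names {{FilterableTableSource}}, {{PartitionableTableSource}}, {{setPredicate}}, {{getAllPartitions}} and {{applyPartitionPruning}} come from the comment above. Every signature, the string-based partition and row types, and the helper classes {{ToyLogSource}} and {{ToyPlanner}} are assumptions for illustration only, not the actual Flink API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical filter push-down trait; the signature is an assumption.
interface FilterableTableSource {
    // The framework hands a row-level predicate to the source.
    void setPredicate(Predicate<String> rowPredicate);
}

// Hypothetical partition pruning trait, independent of the one above.
interface PartitionableTableSource {
    // Lists every partition the source knows about.
    List<String> getAllPartitions();

    // Returns a source restricted to the partitions that survived pruning.
    PartitionableTableSource applyPartitionPruning(List<String> remainingPartitions);
}

// Toy source implementing both traits; partitions are plain strings such as "dt=2017-02-22".
final class ToyLogSource implements FilterableTableSource, PartitionableTableSource {
    private final List<String> partitions;
    private Predicate<String> rowPredicate = row -> true;

    ToyLogSource(List<String> partitions) {
        this.partitions = new ArrayList<>(partitions);
    }

    @Override
    public void setPredicate(Predicate<String> rowPredicate) {
        this.rowPredicate = rowPredicate;
    }

    @Override
    public List<String> getAllPartitions() {
        // Defensive copy so callers cannot mutate the source's state.
        return new ArrayList<>(partitions);
    }

    @Override
    public PartitionableTableSource applyPartitionPruning(List<String> remainingPartitions) {
        // Pruning yields a new source and leaves the original untouched.
        ToyLogSource pruned = new ToyLogSource(remainingPartitions);
        pruned.rowPredicate = this.rowPredicate;
        return pruned;
    }
}

// Stand-in for an early logical optimization step that only needs the partition trait.
final class ToyPlanner {
    static PartitionableTableSource prune(PartitionableTableSource source,
                                          Predicate<String> keepPartition) {
        List<String> remaining = new ArrayList<>();
        for (String partition : source.getAllPartitions()) {
            if (keepPartition.test(partition)) {
                remaining.add(partition);
            }
        }
        return source.applyPartitionPruning(remaining);
    }
}
```

In this sketch {{ToyPlanner.prune}} depends only on {{PartitionableTableSource}}, so a later stage could call {{setPredicate}} on its own schedule without either step knowing about the other.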

> support partition pruning on Table API & SQL
> --------------------------------------------
>
>                 Key: FLINK-5859
>                 URL: https://issues.apache.org/jira/browse/FLINK-5859
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>            Reporter: godfrey he
>            Assignee: godfrey he
>
> Many data sources use partitionable storage, e.g. HDFS, Druid, and many 
> queries only need to read a small subset of the total data. We can use 
> partition information to prune or skip over files irrelevant to the user’s 
> queries. Both query optimization time and execution time can be reduced 
> significantly, especially for a large partitioned table.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
