[
https://issues.apache.org/jira/browse/DRILL-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106987#comment-15106987
]
Jinfeng Ni commented on DRILL-2517:
-----------------------------------
For now, the patch applies directory-based pruning in both the Calcite logical
and the Drill logical Rel conventions. This is because filter pushdown
happens later in planning (currently in VolcanoPlanner).
We are planning to separate the filter pushdown and project pushdown logic;
this will be addressed in https://issues.apache.org/jira/browse/DRILL-3996.
The initial prototype shows quite a lot of regressions in existing workloads,
so it seems it may take considerable effort to get DRILL-3996 fixed.
> Apply Partition pruning before reading files during planning
> ------------------------------------------------------------
>
> Key: DRILL-2517
> URL: https://issues.apache.org/jira/browse/DRILL-2517
> Project: Apache Drill
> Issue Type: New Feature
> Components: Query Planning & Optimization
> Affects Versions: 0.7.0, 0.8.0
> Reporter: Adam Gilmore
> Assignee: Jinfeng Ni
> Fix For: Future
>
>
> Partition pruning still tries to read Parquet files during the planning stage
> even though they don't match the partition filter.
> For example, if there were an invalid Parquet file in a directory that should
> not be queried:
> {code}
> 0: jdbc:drill:zk=local> select sum(price) from dfs.tmp.purchases where dir0 = 1;
> Query failed: IllegalArgumentException: file:/tmp/purchases/4/0_0_0.parquet is not a Parquet file (too small)
> {code}
> The reason is that the partition pruning happens after the Parquet plugin
> tries to read the footer of each file.
> Ideally, partition pruning would happen first before the format plugin gets
> involved.
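
The ordering described above can be illustrated with a minimal sketch. This
is not Drill's planner code: the function names (list_files, check_footer,
plan_scan, make_demo_tree) are hypothetical, and check_footer is only a
stand-in for a format plugin's footer read, assuming the Parquet file layout
(4-byte "PAR1" magic at both ends plus a 4-byte footer length, so at least
12 bytes). It shows that filtering on the first-level directory name before
any file is opened keeps an invalid file in an unrelated directory from
failing the query.

```python
import os
import tempfile


def list_files(root, dir0=None):
    """List files under root; if dir0 is given, keep only that first-level directory."""
    files = []
    for entry in sorted(os.listdir(root)):
        subdir = os.path.join(root, entry)
        if not os.path.isdir(subdir):
            continue
        if dir0 is not None and entry != dir0:
            continue  # pruned: nothing under this directory is ever opened
        for dirpath, _dirs, names in os.walk(subdir):
            files.extend(os.path.join(dirpath, n) for n in sorted(names))
    return files


def check_footer(path):
    """Stand-in for a footer read (assumption: not the real Parquet reader)."""
    if os.path.getsize(path) < 12:
        raise ValueError(f"{path} is not a Parquet file (too small)")
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    if head != b"PAR1" or tail != b"PAR1":
        raise ValueError(f"{path} is not a Parquet file (bad magic)")


def plan_scan(root, dir0=None):
    """Prune by directory first, then touch only the surviving files."""
    files = list_files(root, dir0)
    for path in files:
        check_footer(path)
    return files


def make_demo_tree():
    """Build purchases/1 (well-formed stub) and purchases/4 (1-byte invalid file)."""
    root = os.path.join(tempfile.mkdtemp(), "purchases")
    for name, payload in (("1", b"PAR1" + b"\x00" * 4 + b"PAR1"), ("4", b"x")):
        os.makedirs(os.path.join(root, name))
        with open(os.path.join(root, name, "0_0_0.parquet"), "wb") as f:
            f.write(payload)
    return root
```

With the tree from make_demo_tree(), plan_scan(root, "1") returns only the
file under directory 1, while plan_scan(root) with no pruning raises on the
invalid file under directory 4, which mirrors the failure reported above.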