[
https://issues.apache.org/jira/browse/DRILL-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652831#comment-14652831
]
Aman Sinha commented on DRILL-3560:
-----------------------------------
This works for me, so I am not sure which scenario you tried. Let
me describe the test that I ran:
I have attached the table data that I used. At the top level, it has 3
directories: 1994, 1995, 1996. Each of these has 4 subdirectories:
Q1, Q2, Q3, Q4. There is 1 data file within each of these subdirectories.
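To make the layout concrete: Drill exposes each directory level below the queried root as an implicit column (dir0, dir1, ...). The following is a minimal Python sketch of that mapping, not Drill's own code; the helper name `dir_columns` is hypothetical and the path is one of the 1995 files from this test.

```python
from pathlib import PurePosixPath


def dir_columns(file_path, selection_root):
    """Map the directory levels between the root and the file to dir0, dir1, ..."""
    relative = PurePosixPath(file_path).relative_to(selection_root)
    # relative.parts ends with the file name; every part before it is a directory level.
    return {f"dir{i}": part for i, part in enumerate(relative.parts[:-1])}


root = "/Users/asinha/data/multilevel/parquet"
print(dir_columns(root + "/1995/Q1/orders_95_q1.parquet", root))
```

With this layout the year directory corresponds to dir0 and the quarter directory to dir1.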
Here's the EXPLAIN plan for the query:
{code}
0: jdbc:drill:zk=local> explain plan for select * from
dfs.`data/multilevel/parquet` where dir0 = '1995';
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 Project(*=[$0])
00-02 Project(*=[$0])
00-03 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
[path=file:/Users/asinha/data/multilevel/parquet/1995/Q1/orders_95_q1.parquet],
ReadEntryWithPath
[path=file:/Users/asinha/data/multilevel/parquet/1995/Q2/orders_95_q2.parquet],
ReadEntryWithPath
[path=file:/Users/asinha/data/multilevel/parquet/1995/Q3/orders_95_q3.parquet],
ReadEntryWithPath
[path=file:/Users/asinha/data/multilevel/parquet/1995/Q4/orders_95_q4.parquet]],
selectionRoot=file:/Users/asinha/data/multilevel/parquet, numFiles=4,
columns=[`*`]]])
{code}
Note that the Scan lists only the files under the 1995 directory, not those for the other years.
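The effect of the pruning can be sketched in a few lines of Python. This only illustrates the idea (filter the file list on the top-level directory before any file is read) and is not how Drill's planner is implemented; `prune_by_dir0` and the placeholder file name `data.parquet` are assumptions made for the example.

```python
def prune_by_dir0(relative_paths, wanted_dir0):
    """Keep only the files whose top-level directory equals wanted_dir0."""
    return [p for p in relative_paths if p.split("/")[0] == wanted_dir0]


# Hypothetical listing: 3 years x 4 quarters, one placeholder file per leaf directory.
all_files = [
    f"{year}/Q{quarter}/data.parquet"
    for year in (1994, 1995, 1996)
    for quarter in (1, 2, 3, 4)
]

pruned = prune_by_dir0(all_files, "1995")
print(len(all_files), len(pruned))
```

Of the 12 files in the layout, only the 4 under 1995 remain, which matches numFiles=4 in the plan above.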
Can you provide more details about your test case? Thanks.
> Make partition pruning work for directory queries
> -------------------------------------------------
>
> Key: DRILL-3560
> URL: https://issues.apache.org/jira/browse/DRILL-3560
> Project: Apache Drill
> Issue Type: New Feature
> Components: Query Planning & Optimization
> Affects Versions: 1.1.0
> Reporter: Stefán Baxter
> Assignee: Aman Sinha
>
> Currently queries that include directory conditions are not optimized at all
> and the directory expression (dir0 = 'something') is evaluated for every
> record of every file for every directory.
> This could be optimized to fail directories and allow for the same kind of
> partition pruning for directories as for other scenarios where data has been
> partitioned.