Aman Sinha created DRILL-2568:
---------------------------------
Summary: New partition pruning prevents the optimization for
trivial COUNT(*) queries
Key: DRILL-2568
URL: https://issues.apache.org/jira/browse/DRILL-2568
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 0.8.0
Reporter: Aman Sinha
Assignee: Aman Sinha
With the new interpreter-based partition pruning, if a query has only
partition filters and they are pushed into the Scan, we do not drop the Filter
node from the plan. This blocks the optimization for COUNT(*) queries against
Parquet files, where we read the count values directly from the Parquet files
instead of scanning and aggregating. The ConvertCountToDirectScan rule is not
applied when there is an intervening Filter between the Scan and the
Aggregate nodes. In the plan below, the Filter at 00-04 remains even though
the Scan already lists only the files under the 1995 directory.
{code}
0: jdbc:drill:zk=local> explain plan for select count(*) from
dfs.`/Users/asinha/data/multilevel/parquet` where dir0=1995;
+------------+------------+
| text | json |
+------------+------------+
| 00-00 Screen
00-01 StreamAgg(group=[{}], EXPR$0=[COUNT()])
00-02 Project($f0=[0])
00-03 SelectionVectorRemover
00-04 Filter(condition=[=($0, 1995)])
00-05 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
[path=file:/Users/asinha/data/multilevel/parquet/1995/Q1/orders_95_q1.parquet],
ReadEntryWithPath
[path=file:/Users/asinha/data/multilevel/parquet/1995/Q2/orders_95_q2.parquet],
ReadEntryWithPath
[path=file:/Users/asinha/data/multilevel/parquet/1995/Q3/orders_95_q3.parquet],
ReadEntryWithPath
[path=file:/Users/asinha/data/multilevel/parquet/1995/Q4/orders_95_q4.parquet]],
selectionRoot=/Users/asinha/data/multilevel/parquet, numFiles=4,
columns=[`dir0`]]])
{code}
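
The effect described above can be illustrated with a small toy model. This is only a sketch and not Drill's planner code: the Node class, the partition_only flag, the operator names, and the Aggregate -> Project -> Scan pattern are all assumptions made for illustration. It shows a pattern-based rule failing to match while a Filter sits between the Project and the Scan, and matching again once a Filter that was fully absorbed by partition pruning is dropped.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    """Toy plan node: an operator name and at most one input (hypothetical)."""
    op: str
    child: Optional["Node"] = None
    # Hypothetical flag: True when a Filter contains only partition
    # predicates that pruning has already applied to the Scan.
    partition_only: bool = False


def chain(*nodes: Node) -> Optional[Node]:
    """Link nodes top-down (first argument is the root) and return the root."""
    root = None
    for node in reversed(nodes):
        root = replace(node, child=root)
    return root


def ops(root: Optional[Node]) -> List[str]:
    """Operator names from the root down to the leaf."""
    names = []
    while root is not None:
        names.append(root.op)
        root = root.child
    return names


# Assumed shape for the count rule in this sketch.
COUNT_PATTERN = ["Aggregate", "Project", "Scan"]


def count_rule_matches(root: Optional[Node]) -> bool:
    """True if the operators form the assumed pattern somewhere in the chain."""
    names = ops(root)
    n = len(COUNT_PATTERN)
    return any(names[i:i + n] == COUNT_PATTERN
               for i in range(len(names) - n + 1))


def drop_pruned_filters(root: Optional[Node]) -> Optional[Node]:
    """Return a copy of the plan without Filters that pruning fully covers."""
    if root is None:
        return None
    child = drop_pruned_filters(root.child)
    if root.op == "Filter" and root.partition_only:
        return child  # the Scan already reflects this predicate
    return replace(root, child=child)


plan = chain(
    Node("Aggregate"),
    Node("Project"),
    Node("Filter", partition_only=True),  # e.g. dir0 = 1995
    Node("Scan"),
)
fixed = drop_pruned_filters(plan)
```

In this model, count_rule_matches(plan) is False because of the intervening Filter, while count_rule_matches(fixed) is True. A Filter with partition_only=False is left in place, since its predicate still has to be evaluated on the rows.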