Aman Sinha created DRILL-2568:
---------------------------------

             Summary: New partition pruning prevents the optimization for trivial COUNT(*) queries
                 Key: DRILL-2568
                 URL: https://issues.apache.org/jira/browse/DRILL-2568
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 0.8.0
            Reporter: Aman Sinha
            Assignee: Aman Sinha


With the new interpreter-based partition pruning, if the query has only 
partition filters and they are pushed into the Scan, we don't drop the Filter 
node from the plan, even though the Scan already reads only the matching 
partitions. This prevents the optimization for COUNT(*) queries against 
parquet files, where we read the count values directly from the parquet files 
instead of scanning and aggregating. The ConvertCountToDirectScan rule does 
not get applied if there is an intervening Filter between the Scan and the 
Aggregate nodes.

{code}
0: jdbc:drill:zk=local> explain plan for select count(*) from 
dfs.`/Users/asinha/data/multilevel/parquet` where dir0=1995;
+------------+------------+
|    text    |    json    |
+------------+------------+
| 00-00    Screen
00-01      StreamAgg(group=[{}], EXPR$0=[COUNT()])
00-02        Project($f0=[0])
00-03          SelectionVectorRemover
00-04            Filter(condition=[=($0, 1995)])
00-05              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
[path=file:/Users/asinha/data/multilevel/parquet/1995/Q1/orders_95_q1.parquet], 
ReadEntryWithPath 
[path=file:/Users/asinha/data/multilevel/parquet/1995/Q2/orders_95_q2.parquet], 
ReadEntryWithPath 
[path=file:/Users/asinha/data/multilevel/parquet/1995/Q3/orders_95_q3.parquet], 
ReadEntryWithPath 
[path=file:/Users/asinha/data/multilevel/parquet/1995/Q4/orders_95_q4.parquet]],
 selectionRoot=/Users/asinha/data/multilevel/parquet, numFiles=4, 
columns=[`dir0`]]])
{code}
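
The effect described above can be sketched with a toy plan tree. This is only an illustration, not Drill's planner code: the `Node` class, the `partition_only` and `pruned` flags, and both functions are hypothetical, and the pattern Aggregate -> Project -> Scan is an assumed, simplified stand-in for what ConvertCountToDirectScan matches. It models the logical shape of the plan and ignores physical operators such as SelectionVectorRemover.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical, minimal plan node: one operator with at most one input."""
    kind: str                      # e.g. "Aggregate", "Project", "Filter", "Scan"
    child: Optional["Node"] = None
    partition_only: bool = False   # assumption: Filter holds only partition conditions
    pruned: bool = False           # assumption: Scan already reads only matching partitions


def matches_count_to_direct_scan(root: Node) -> bool:
    """Toy pattern match: Aggregate -> Project -> Scan with nothing in between."""
    project = root.child
    scan = project.child if project is not None else None
    return (
        root.kind == "Aggregate"
        and project is not None and project.kind == "Project"
        and scan is not None and scan.kind == "Scan"
    )


def drop_redundant_filters(node: Optional[Node]) -> Optional[Node]:
    """Remove a Filter whose conditions are all partition filters already
    applied by the pruned Scan directly below it."""
    if node is None:
        return None
    node.child = drop_redundant_filters(node.child)
    below = node.child
    if (node.kind == "Filter" and node.partition_only
            and below is not None and below.kind == "Scan" and below.pruned):
        return below
    return node


def build_plan(partition_only: bool) -> Node:
    """Build Aggregate -> Project -> Filter -> Scan, as in the plan above."""
    scan = Node("Scan", pruned=True)
    flt = Node("Filter", scan, partition_only=partition_only)
    return Node("Aggregate", Node("Project", flt))


plan = build_plan(partition_only=True)
before = matches_count_to_direct_scan(plan)                          # Filter blocks the match
after = matches_count_to_direct_scan(drop_redundant_filters(plan))   # Filter removed
```

In this sketch the rule cannot match while the Filter sits between Project and Scan, and it matches once the redundant Filter is dropped. A Filter that also carries non-partition conditions (`partition_only=False`) is left in place, so the direct-scan shortcut stays disabled for it.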



