Khurram Faraaz created DRILL-3538:
-------------------------------------
Summary: We do not prune partitions when we count over
partitioning key and filter over partitioning key
Key: DRILL-3538
URL: https://issues.apache.org/jira/browse/DRILL-3538
Project: Apache Drill
Issue Type: Bug
Components: Execution - Flow
Affects Versions: 1.2.0
Environment: 4 node cluster on CentOS
Reporter: Khurram Faraaz
Assignee: Chris Westin
We do not prune partitions when a query does a count over the partitioning key
and the predicate also involves the partitioning key. The CTAS used was:
{code}
create table t3214 partition by (key2) as select cast(key1 as double) key1,
cast(key2 as char(1)) key2 from `twoKeyJsn.json`;
{code}
Case 1) count over the partitioning key (key2). We do not prune partitions.
{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for select count(key2) from t3214
where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 Project(EXPR$0=[$0])
00-02 Project(EXPR$0=[$0])
00-03 Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@e2471d7])
{code}
Case 2) count(*). We do not prune partitions.
{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for select count(*) from t3214 where
key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 Project(EXPR$0=[$0])
00-02 Project(EXPR$0=[$0])
00-03 Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@211930a2])
{code}
Case 3) count over the non-partitioning key (key1). We do not prune partitions.
{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for select count(key1) from t3214
where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 Project(EXPR$0=[$0])
00-02 Project(EXPR$0=[$0])
00-03 Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@23fea3b0])
{code}
Case 4) avg(key1). We do prune here; the scan reads a single file (numFiles=1).
{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for select avg(key1) from t3214
where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 Project(EXPR$0=[CAST(/(CastHigh(CASE(=($1, 0), null, $0)), $1)):ANY NOT NULL])
00-02 StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[$SUM0($1)])
00-03 StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[COUNT($0)])
00-04 Project(key1=[$1])
00-05 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/t3214/0_0_15.parquet]], selectionRoot=maprfs:/tmp/t3214, numFiles=1, columns=[`key2`, `key1`]]])
{code}
Case 5) min(key1). We do prune here; the scan reads a single file (numFiles=1).
{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for select min(key1) from t3214
where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 Project(EXPR$0=[$0])
00-02 StreamAgg(group=[{}], EXPR$0=[MIN($0)])
00-03 StreamAgg(group=[{}], EXPR$0=[MIN($0)])
00-04 Project(key1=[$1])
00-05 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/t3214/0_0_15.parquet]], selectionRoot=maprfs:/tmp/t3214, numFiles=1, columns=[`key2`, `key1`]]])
{code}
Commit id I am testing on: 17e580a7
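The difference between the count plans and the avg/min plans in this report comes down to which scan operator appears at the bottom of the plan. A minimal sketch of checking that mechanically, assuming the plan text looks like the dumps in this report: `scan_summary` is a hypothetical helper written for illustration, not part of Drill. It only reports the scan operator and, for a ParquetGroupScan, the `numFiles` value; it does not know how many files the table holds in total.

```python
import re


def scan_summary(plan_text):
    """Report which scan operator a Drill text plan contains.

    Hypothetical helper for reading the plan dumps in this report;
    it is not a Drill API.
    """
    # Collapse whitespace so operators wrapped across lines match as one string.
    flat = " ".join(plan_text.split())
    if "ParquetGroupScan" in flat:
        match = re.search(r"numFiles=(\d+)", flat)
        num_files = int(match.group(1)) if match else None
        return {"scan": "ParquetGroupScan", "num_files": num_files}
    if "PojoRecordReader" in flat:
        return {"scan": "PojoRecordReader", "num_files": None}
    return {"scan": "unknown", "num_files": None}


# Plan text copied from case 1 and case 5 of this report.
CASE_1 = """00-00 Screen
00-01 Project(EXPR$0=[$0])
00-02 Project(EXPR$0=[$0])
00-03
Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@e2471d7])"""

CASE_5 = """00-00 Screen
00-01 Project(EXPR$0=[$0])
00-02 StreamAgg(group=[{}], EXPR$0=[MIN($0)])
00-03 StreamAgg(group=[{}], EXPR$0=[MIN($0)])
00-04 Project(key1=[$1])
00-05 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
[path=/tmp/t3214/0_0_15.parquet]], selectionRoot=maprfs:/tmp/t3214, numFiles=1,
columns=[`key2`, `key1`]]])"""

for name, plan in (("case 1", CASE_1), ("case 5", CASE_5)):
    print(name, scan_summary(plan))
```

Run against the plans above, the three count cases report a PojoRecordReader scan with no file list, while the avg and min cases report a ParquetGroupScan restricted to one file.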