Rahul Challapalli created DRILL-3410:
----------------------------------------
Summary: Partition Pruning : We are doing a prune when we shouldn't
Key: DRILL-3410
URL: https://issues.apache.org/jira/browse/DRILL-3410
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Reporter: Rahul Challapalli
Assignee: Jinfeng Ni
Priority: Critical
Fix For: 1.1.0
git.commit.id.abbrev=60bc945
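The plan below does not look right. Given the filters in the query, it should
scan all the files: the `columns[0]>29700` branch of the OR is not tied to any
directory, so no partition can be ruled out. Hive also returned more rows than
Drill for the same query.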
{code}
explain plan for select * from `existing_partition_pruning/lineitempart` where
(dir0=1993 and columns[0] >29600) or (dir0=1994 or columns[0]>29700);
| 00-00 Screen
00-01 Project(*=[$0])
00-02 Project(T70¦¦*=[$0])
00-03 SelectionVectorRemover
00-04 Filter(condition=[OR(AND(=($1, 1993), >(ITEM($2, 0), 29600)),
      =($1, 1994), >(ITEM($2, 0), 29700))])
00-05 Project(T70¦¦*=[$0], dir0=[$1], columns=[$2])
00-06 Scan(groupscan=[ParquetGroupScan
      [entries=[ReadEntryWithPath
      [path=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart/0_0_3.parquet],
      ReadEntryWithPath
      [path=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart/0_0_4.parquet]],
      selectionRoot=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart,
      numFiles=2, columns=[`*`]]])
|
{code}
I have attached the data set used. Let me know if you need anything more.
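
To illustrate why no directory should be pruned for this filter, here is a
minimal Python sketch. It is not Drill's implementation; it only assumes that
a pruner can decide the `dir0` comparisons per directory, while the
`columns[0]` comparisons stay unknown until the data is read. The function
names and the list of directory values are hypothetical, chosen for
illustration.

```python
from typing import Optional

# Three-valued logic: True, False, or None (unknown without reading row data).
def and3(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def or3(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def may_match(dir0: int) -> Optional[bool]:
    """Evaluate the WHERE clause from the report for a single directory."""
    col_gt_29600 = None  # columns[0] > 29600: depends on row data
    col_gt_29700 = None  # columns[0] > 29700: depends on row data
    left = and3(dir0 == 1993, col_gt_29600)
    right = or3(dir0 == 1994, col_gt_29700)
    return or3(left, right)

def can_prune(dir0: int) -> bool:
    # A directory is safe to skip only if the filter is definitely False.
    return may_match(dir0) is False

# Hypothetical directory values; the report does not list the actual ones.
hypothetical_dirs = [1991, 1992, 1993, 1994, 1995]
prunable = [d for d in hypothetical_dirs if can_prune(d)]
```

Under these assumptions `prunable` is empty for any set of directory values,
because the `columns[0]>29700` branch can be satisfied by rows in any
directory. A plan that scans only a subset of the files is therefore
inconsistent with the filter.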