Rahul Challapalli created DRILL-3966:
----------------------------------------
Summary: Metadata Cache + Partition Pruning not happening when the
partition column is of type boolean
Key: DRILL-3966
URL: https://issues.apache.org/jira/browse/DRILL-3966
Project: Apache Drill
Issue Type: Bug
Components: Metadata, Query Planning & Optimization
Reporter: Rahul Challapalli
git.commit.id.abbrev=19b4b79
I have partitioned parquet files whose partition column is of type boolean.
The plan below suggests that pruning does not take place when the partition
column is of type boolean and a metadata cache exists: the scan reports
numFiles=2 with usedMetadataFile=true. However, if I remove the metadata
cache, partition pruning seems to work fine.
Query:
{code}
explain plan for select * from fewtypes_boolpartition where bool_col = false;
00-00 Screen
00-01 Project(*=[$0])
00-02 Project(T11¦¦*=[$0])
00-03 SelectionVectorRemover
00-04 Filter(condition=[=($1, false)])
00-05 Project(T11¦¦*=[$0], bool_col=[$1])
00-06 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath
[path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_2.parquet],
ReadEntryWithPath
[path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_1.parquet]],
selectionRoot=/drill/testdata/metadata_caching/fewtypes_boolpartition,
numFiles=2, usedMetadataFile=true, columns=[`*`]]])
{code}
Error from the log:
{code}
WARN o.a.d.e.p.l.partition.PruneScanRule - Exception while trying to prune partition.
java.lang.UnsupportedOperationException: Unsupported type: BIT
    at org.apache.drill.exec.store.parquet.ParquetGroupScan.populatePruningVector(ParquetGroupScan.java:451) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.ParquetPartitionDescriptor.populatePartitionVectors(ParquetPartitionDescriptor.java:96) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:212) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
    at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
    at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
    at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:303) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.logicalPlanningVolcanoAndLopt(DefaultSqlHandler.java:545) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:213) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:248) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan(ExplainHandler.java:61) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
{code}
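To illustrate the failure mode the log points at, here is a minimal Java sketch. It is not Drill's code: the class PruningSketch, the enum MinorType, and the methods toVectorValue and prune are all hypothetical names made up for this example, and the assignment of true/false to the two file names is assumed. It only shows how a type switch with no BIT branch, combined with a caller that catches the exception and logs a warning, would leave the file list unpruned, which would be consistent with the WARN above and the numFiles=2 in the plan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the Drill code base.
class PruningSketch {
    // Stand-in for a column's minor type.
    enum MinorType { INT, VARCHAR, BIT }

    // Converts a raw partition value to a comparable object.
    // There is deliberately no BIT branch, so boolean columns hit the default case.
    static Object toVectorValue(MinorType type, String raw) {
        switch (type) {
            case INT:
                return Integer.valueOf(raw);
            case VARCHAR:
                return raw;
            default:
                throw new UnsupportedOperationException("Unsupported type: " + type);
        }
    }

    // Keeps only the files whose partition value equals 'wanted'.
    // If the type is unsupported, logs a warning and returns every file (no pruning).
    static List<String> prune(List<String> files, List<String> values,
                              MinorType type, Object wanted) {
        try {
            List<String> kept = new ArrayList<>();
            for (int i = 0; i < files.size(); i++) {
                if (toVectorValue(type, values.get(i)).equals(wanted)) {
                    kept.add(files.get(i));
                }
            }
            return kept;
        } catch (UnsupportedOperationException e) {
            System.err.println("WARN Exception while trying to prune partition: "
                    + e.getMessage());
            return files;
        }
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("0_0_1.parquet", "0_0_2.parquet");
        // Assumed partition values; the report does not say which file holds which value.
        System.out.println(prune(files, Arrays.asList("true", "false"),
                MinorType.BIT, Boolean.FALSE));
        // A supported type prunes as expected.
        System.out.println(prune(files, Arrays.asList("1", "2"), MinorType.INT, 1));
    }
}
```

In this sketch, adding a BIT case that returns Boolean.valueOf(raw) would let the boolean filter prune like the other types. Whether the actual fix in ParquetGroupScan.populatePruningVector looks anything like this is not something this report establishes.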
I have attached the required data sets. Let me know if you need anything else.