[
https://issues.apache.org/jira/browse/DRILL-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935591#comment-15935591
]
Robert Hou commented on DRILL-5374:
-----------------------------------
This is the Scan step from the explain plan:
{code}
00-06 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath
[path=/drill/testdata/filter/orders_parts_metadata/0_0_1.parquet],
ReadEntryWithPath
[path=/drill/testdata/filter/orders_parts_metadata/0_0_4.parquet],
ReadEntryWithPath
[path=/drill/testdata/filter/orders_parts_metadata/0_0_2.parquet]],
selectionRoot=/drill/testdata/filter/orders_parts_metadata, numFiles=3,
usedMetadataFile=true,
cacheFileRoot=/drill/testdata/filter/orders_parts_metadata,
columns=[`float_id`]]])
{code}
Partition /drill/testdata/filter/orders_parts_metadata/0_0_4.parquet should not
be scanned: its float_id column contains only null values, so no row in it can
satisfy the filter on float_id.
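The pruning decision expected here can be sketched as follows. This is only an illustration of the reasoning, not Drill's actual implementation: the `ColumnStats` class, the `can_prune_less_than` helper, and the row counts and min/max values are all hypothetical and are not taken from the attached files.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnStats:
    """Hypothetical per-file statistics for one column (not Drill's real structure)."""
    num_rows: int
    null_count: int
    min_value: Optional[float] = None  # None when the column has no non-null values
    max_value: Optional[float] = None


def can_prune_less_than(stats: ColumnStats, literal: float) -> bool:
    """Return True if no row can satisfy `column < literal`."""
    # All-null column: a comparison against NULL is never true, so the file can be skipped.
    if stats.null_count == stats.num_rows:
        return True
    # Without a minimum value nothing can be proven, so the file must be kept.
    if stats.min_value is None:
        return False
    # If the smallest non-null value is already >= literal, no row can match.
    return stats.min_value >= literal


# Made-up statistics, for illustration only.
files = {
    "0_0_1.parquet": ColumnStats(num_rows=100, null_count=0, min_value=1.0, max_value=500.0),
    "0_0_4.parquet": ColumnStats(num_rows=100, null_count=100),
}

kept = [name for name, stats in files.items() if not can_prune_less_than(stats, 1100.0)]
print(kept)  # ['0_0_1.parquet']
```

Under these assumed statistics, the all-null file is dropped before the min/max comparison is even considered, which is the behavior the Scan step above fails to show for 0_0_4.parquet.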
> Parquet filter pushdown does not prune partition with nulls when predicate
> uses float column
> --------------------------------------------------------------------------------------------
>
> Key: DRILL-5374
> URL: https://issues.apache.org/jira/browse/DRILL-5374
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 1.9.0
> Reporter: Robert Hou
> Assignee: Jinfeng Ni
> Attachments: 0_0_1.parquet, 0_0_2.parquet, 0_0_3.parquet,
> 0_0_4.parquet, 0_0_5.parquet, drill.parquet_metadata
>
>
> Drill does not prune enough partitions for this query when filter pushdown is
> used with metadata caching. The float column is being compared with a double
> value.
> {code}
> 0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> select count(*) from
> orders_parts_metadata where float_id < 1100.0;
> {code}
> To reproduce the problem, put the attached files into a directory. Then
> create the metadata:
> {code}
> refresh table metadata dfs.`path_to_directory`;
> {code}
> For example, if you put the files in
> /drill/testdata/filter/orders_parts_metadata, then run this SQL command:
> {code}
> refresh table metadata dfs.`/drill/testdata/filter/orders_parts_metadata`;
> {code}