[
https://issues.apache.org/jira/browse/DRILL-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15713469#comment-15713469
]
Robert Hou commented on DRILL-5093:
-----------------------------------
The two tables have the same data.
> Explain plan shows all partitions when query scans all partitions, and filter
> pushdown is used with metadata caching.
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: DRILL-5093
> URL: https://issues.apache.org/jira/browse/DRILL-5093
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 1.9.0
> Reporter: Robert Hou
> Assignee: Jinfeng Ni
> Attachments: 0_0_1.parquet, 0_0_2.parquet, 0_0_3.parquet,
> 0_0_4.parquet, 0_0_5.parquet, drill.parquet_metadata
>
>
> This query scans all the partitions because none of them can be pruned.
> When metadata caching is used, the explain plan lists every partition file,
> when it should show only the parent directory.
> 0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> explain plan for select *
> from orders_parts_metadata;
> +------+------+
> | text | json |
> +------+------+
> | 00-00 Screen
> 00-01 Project(*=[$0])
> 00-02 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
> [path=/drill/testdata/filter/orders_parts_metadata/0_0_1.parquet],
> ReadEntryWithPath
> [path=/drill/testdata/filter/orders_parts_metadata/0_0_3.parquet],
> ReadEntryWithPath
> [path=/drill/testdata/filter/orders_parts_metadata/0_0_4.parquet],
> ReadEntryWithPath
> [path=/drill/testdata/filter/orders_parts_metadata/0_0_5.parquet],
> ReadEntryWithPath
> [path=/drill/testdata/filter/orders_parts_metadata/0_0_2.parquet]],
> selectionRoot=/drill/testdata/filter/orders_parts_metadata, numFiles=5,
> usedMetadataFile=true,
> cacheFileRoot=/drill/testdata/filter/orders_parts_metadata, columns=[`*`]]])
> To reproduce the problem, put the attached files into a directory. Then
> create the metadata:
> refresh table metadata dfs.`path_to_directory`;
> For example, if you put the files in
> /drill/testdata/filter/orders_parts_metadata, then run this SQL command:
> refresh table metadata dfs.`/drill/testdata/filter/orders_parts_metadata`;
> Here is the same query with a table that does not have metadata caching.
> 0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> explain plan for select *
> from orders_parts;
> +------+------+
> | text | json |
> +------+------+
> | 00-00 Screen
> 00-01 Project(*=[$0])
> 00-02 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
> [path=maprfs:///drill/testdata/filter/orders_parts]],
> selectionRoot=maprfs:/drill/testdata/filter/orders_parts, numFiles=1,
> usedMetadataFile=false, columns=[`*`]]])
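
The difference between the two plans can be checked mechanically. The following is a minimal sketch, not part of Drill: summarize_scan is a hypothetical helper that assumes the plan text is formatted as in the output quoted above. It counts the ReadEntryWithPath entries and reads the numFiles and usedMetadataFile fields from the Scan line.

```python
import re

def summarize_scan(plan_text):
    """Summarize the ParquetGroupScan fields found in a Drill text plan."""
    # Mail quoting wraps the plan and prefixes each line with "> "; undo both.
    flat = " ".join(line.lstrip("> ").strip() for line in plan_text.splitlines())
    paths = re.findall(r"\[path=([^\]]+)\]", flat)
    num_files = re.search(r"numFiles=(\d+)", flat)
    used_cache = re.search(r"usedMetadataFile=(true|false)", flat)
    return {
        "paths": paths,
        "numFiles": int(num_files.group(1)) if num_files else None,
        "usedMetadataFile": used_cache.group(1) == "true" if used_cache else None,
    }

# Scan line copied from the orders_parts plan (no metadata caching).
plan = """
> 00-02 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
> [path=maprfs:///drill/testdata/filter/orders_parts]],
> selectionRoot=maprfs:/drill/testdata/filter/orders_parts, numFiles=1,
> usedMetadataFile=false, columns=[`*`]]])
"""
summary = summarize_scan(plan)
print(len(summary["paths"]), summary["numFiles"], summary["usedMetadataFile"])
```

For the orders_parts plan this reports a single entry, the table directory. Run against the orders_parts_metadata plan, the same helper reports five entries, one per parquet file, which is the behavior this issue describes.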