[
https://issues.apache.org/jira/browse/DRILL-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jinfeng Ni resolved DRILL-2761.
-------------------------------
Resolution: Fixed
commit id: e462d14e63e4b396935f611cba5183c6f5d62a8f
> ParquetGroupScan copy constructor only copies references, leading to an
> out-of-sync ParquetGroupScan instance.
> -----------------------------------------------------------------------------------------------------
>
> Key: DRILL-2761
> URL: https://issues.apache.org/jira/browse/DRILL-2761
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Jinfeng Ni
> Assignee: Jinfeng Ni
> Attachments:
> 0003-DRILL-2761-ParquetGroupScan-s-copy-constructor-shoul.patch
>
>
> ParquetGroupScan has one copy constructor, which is used by the project
> pushdown rule and the partition pruning rule to clone a modified version of
> the original ParquetGroupScan instance. However, the copy constructor only
> copies the references to several Collections. This means that if the cloned
> instance modifies those collections, it also modifies the contents of the
> collections in the original ParquetGroupScan instance, leaving the original
> instance in an invalid state. Such an invalid state can lead to incorrect
> query results.
> For instance, consider the query:
> {code}
> select O_ORDERKEY,O_CUSTKEY,O_CLERK,O_COMMENT,dir0
> from `/drill/testdata/partition_pruning/dfs/orders`
> where (dir0=1993)
> {code}
> Assume the data is partitioned by year (1993, 1994, 1995). Depending on the
> order in which the RelOptRules fire, a ParquetGroupScan could end up with its
> "rowGroupInfos" list out of sync with its "entries" list. The optimizer then
> thinks that the partition filter has been pushed down: "entries" is modified
> and the filter is removed from the plan, yet "rowGroupInfos" still holds the
> original contents. As a result, the query returns unwanted rows.
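The aliasing problem described above can be illustrated with a minimal sketch. This is not Drill's ParquetGroupScan or the committed patch: the class `ScanSketch`, its two `String` lists, and the method `shallowCopyOf` are hypothetical stand-ins, assumed only for illustration. It contrasts a copy that shares list references with a copy constructor that makes defensive copies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a group scan; NOT Drill's actual ParquetGroupScan.
final class ScanSketch {
    final List<String> entries;        // assumed: selected partition directories
    final List<String> rowGroupInfos;  // assumed: row groups derived from entries

    ScanSketch(List<String> entries, List<String> rowGroupInfos) {
        this.entries = entries;
        this.rowGroupInfos = rowGroupInfos;
    }

    // Copy constructor with defensive copies: the clone owns its own lists.
    ScanSketch(ScanSketch that) {
        this.entries = new ArrayList<>(that.entries);
        this.rowGroupInfos = new ArrayList<>(that.rowGroupInfos);
    }

    // Reference-only copy: the clone shares both lists with the original.
    static ScanSketch shallowCopyOf(ScanSketch that) {
        return new ScanSketch(that.entries, that.rowGroupInfos);
    }
}

class CopyConstructorDemo {
    static ScanSketch sample() {
        return new ScanSketch(
            new ArrayList<>(Arrays.asList("1993", "1994", "1995")),
            new ArrayList<>(Arrays.asList("1993/a", "1994/a", "1995/a")));
    }

    public static void main(String[] args) {
        // Reference-only copy: pruning the clone's entries also prunes the original's.
        ScanSketch original = sample();
        ScanSketch shared = ScanSketch.shallowCopyOf(original);
        shared.entries.removeIf(e -> !e.equals("1993"));
        // The original now has 1 entry but still 3 row groups: out of sync.
        System.out.println(original.entries.size() + " vs " + original.rowGroupInfos.size());

        // Defensive copy: the original is unaffected by changes to the clone.
        ScanSketch original2 = sample();
        ScanSketch isolated = new ScanSketch(original2);
        isolated.entries.removeIf(e -> !e.equals("1993"));
        System.out.println(original2.entries.size() + " vs " + original2.rowGroupInfos.size());
    }
}
```

Under these assumptions, the first line printed is `1 vs 3` (the original was silently changed through the shared list) and the second is `3 vs 3`. Copying each collection in the copy constructor is one straightforward way to keep the original instance consistent; if the element objects themselves were mutable, they would need copying as well.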
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)