[
https://issues.apache.org/jira/browse/DRILL-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408845#comment-15408845
]
Jinfeng Ni commented on DRILL-4825:
-----------------------------------
The cause of this problem:
EnumerableTableScan's digest contains only the table name and row type. After
dir-based partition pruning, we get two EnumerableTableScan instances, each
holding a DrillTable with a different file selection, yet both instances have
the same digest. That works fine for HepPlanner, but not for VolcanoPlanner,
which treats nodes with equal digests as identical. As a result, after
VolcanoPlanner runs Drill logical planning, both branches of the UNION ALL end
up with the same TableScan, which produces the incorrect plan and query result.
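
To illustrate the digest collision described above, here is a minimal toy
model in Python. It is only a sketch and does not use Calcite or Drill code:
ToyTableScan, register_all, and the directory paths are hypothetical names
made up for the illustration, and the row counts (10 and 30) are taken from
the single-directory queries in the report. The include_selection switch is
just one conceivable way to make the digests differ, not necessarily the fix
applied in Drill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTableScan:
    """Hypothetical stand-in for a scan node (not Calcite's EnumerableTableScan)."""
    table: str
    row_type: tuple
    file_selection: tuple  # directories left after partition pruning (assumed layout)

    def digest(self, include_selection=False):
        # By default the digest carries only table name and row type,
        # mirroring the situation described in the comment.
        parts = [f"table={self.table}", f"rowType={list(self.row_type)}"]
        if include_selection:
            parts.append(f"files={list(self.file_selection)}")
        return "ToyTableScan(" + ", ".join(parts) + ")"


def register_all(nodes, include_selection=False):
    """Digest-based deduplication: the first node seen for a digest wins."""
    canonical = {}
    return [canonical.setdefault(n.digest(include_selection), n) for n in nodes]


# Row counts per directory, taken from the queries in the report.
ROWS = {"1/one/2015-7-12": 10, "1/two/2015-8-12": 30}


def union_all_count(scans):
    """Count the rows a UNION ALL over the given scans would read."""
    return sum(ROWS[path] for scan in scans for path in scan.file_selection)


row_type = ("l_orderkey", "dir0")
scan_one = ToyTableScan("l_3level", row_type, ("1/one/2015-7-12",))
scan_two = ToyTableScan("l_3level", row_type, ("1/two/2015-8-12",))

# Same digest: the second scan is replaced by the first one.
collapsed = register_all([scan_one, scan_two])
# Distinct digests: both scans survive.
distinct = register_all([scan_one, scan_two], include_selection=True)

print(union_all_count(collapsed))
print(union_all_count(distinct))
```

In this toy model the collapsed plan reads the first directory twice, which is
consistent with the count of 20 reported in the issue, while keeping the two
scans distinct gives the sum of the two single-directory counts.
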
> Wrong data with UNION ALL when querying different sub-directories under the
> same table
> --------------------------------------------------------------------------------------
>
> Key: DRILL-4825
> URL: https://issues.apache.org/jira/browse/DRILL-4825
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 1.6.0, 1.7.0, 1.8.0
> Reporter: Rahul Challapalli
> Assignee: Jinfeng Ni
> Priority: Critical
> Fix For: 1.8.0
>
> Attachments: l_3level.tgz
>
>
> git.commit.id.abbrev=0700c6b
> The query below returns wrong results:
> {code}
> select count (*) from (
> select l_orderkey, dir0 from l_3level t1 where t1.dir0 = 1 and
> t1.dir1='one' and t1.dir2 = '2015-7-12'
> union all
> select l_orderkey, dir0 from l_3level t2 where t2.dir0 = 1 and
> t2.dir1='two' and t2.dir2 = '2015-8-12') data;
> +---------+
> | EXPR$0 |
> +---------+
> | 20 |
> +---------+
> {code}
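> The wrong result is evident from the output of the queries below, which run
> each branch on its own and return 30 and 10 rows, so the UNION ALL should
> return 40: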
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> select count (*) from (select
> l_orderkey, dir0 from l_3level t2 where t2.dir0 = 1 and t2.dir1='two' and
> t2.dir2 = '2015-8-12');
> +---------+
> | EXPR$0 |
> +---------+
> | 30 |
> +---------+
> 1 row selected (0.258 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> select count (*) from (select
> l_orderkey, dir0 from l_3level t2 where t2.dir0 = 1 and t2.dir1='one' and
> t2.dir2 = '2015-7-12');
> +---------+
> | EXPR$0 |
> +---------+
> | 10 |
> +---------+
> {code}
> I attached the data set. Let me know if you need anything more.