[ 
https://issues.apache.org/jira/browse/DRILL-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408842#comment-15408842
 ] 

ASF GitHub Bot commented on DRILL-4825:
---------------------------------------

GitHub user jinfengni opened a pull request:

    https://github.com/apache/drill/pull/559

    DRILL-4825: Fix incorrect result issue caused by partition pruning when same tables are queried multiple times with different filters in query.
    
    1) Introduce DirPrunedEnumerableTableScan, which takes the file selection as part of its digest.
    2) When directory-based pruning happens, create an instance of DirPrunedEnumerableTableScan.
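
The two points above can be illustrated with a toy model. This is a minimal sketch, not Drill or Calcite code: Scan, ToyPlanner and DigestSketch are hypothetical classes, the file paths are made up from the dir0/dir1/dir2 values in the reported query, and the sketch assumes the planner treats nodes with equal digests as interchangeable. Under that assumption, a digest built from the table name alone cannot tell the two pruned scans apart, while a digest that includes the file selection can.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical classes, not the Drill or Calcite API.
class Scan {
    final String table;
    final List<String> files;      // file selection left after directory pruning
    final boolean filesInDigest;   // true models the behaviour described in the PR

    Scan(String table, List<String> files, boolean filesInDigest) {
        this.table = table;
        this.files = files;
        this.filesInDigest = filesInDigest;
    }

    // The digest is the string the toy planner uses to decide whether two nodes are the same.
    String digest() {
        return filesInDigest
            ? "Scan(table=" + table + ", files=" + files + ")"
            : "Scan(table=" + table + ")";
    }
}

class ToyPlanner {
    private final Map<String, Scan> registered = new HashMap<>();

    // Returns the already registered node when one with the same digest exists.
    Scan register(Scan scan) {
        Scan existing = registered.putIfAbsent(scan.digest(), scan);
        return existing != null ? existing : scan;
    }
}

class DigestSketch {
    // Registers the two pruned scans of the UNION ALL and counts how many distinct scans survive.
    static int distinctScans(boolean filesInDigest) {
        ToyPlanner planner = new ToyPlanner();
        // Illustrative paths only, derived from the filters in the reported query.
        Scan left = planner.register(
            new Scan("l_3level", Arrays.asList("1/one/2015-7-12"), filesInDigest));
        Scan right = planner.register(
            new Scan("l_3level", Arrays.asList("1/two/2015-8-12"), filesInDigest));
        return left == right ? 1 : 2;
    }

    public static void main(String[] args) {
        System.out.println("table-only digest: " + distinctScans(false) + " distinct scan(s)");
        System.out.println("digest with files: " + distinctScans(true) + " distinct scan(s)");
    }
}
```

In this model the table-only digest collapses both branches onto a single scan, so both sides of the union would read the same files. Including the file selection keeps the two scans separate.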

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jinfengni/incubator-drill DRILL-4825

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/559.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #559
    
----
commit 79b1602933538d98c301afd621deafe8e8f4e79b
Author: Jinfeng Ni <[email protected]>
Date:   2016-08-05T00:54:30Z

    DRILL-4825: Fix incorrect result issue caused by partition pruning when same tables are queried multiple times with different filters in query.
    
    1) Introduce DirPrunedEnumerableTableScan, which takes the file selection as part of its digest.
    2) When directory-based pruning happens, create an instance of DirPrunedEnumerableTableScan.

----


> Wrong data with UNION ALL when querying different sub-directories under the 
> same table
> --------------------------------------------------------------------------------------
>
>                 Key: DRILL-4825
>                 URL: https://issues.apache.org/jira/browse/DRILL-4825
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.6.0, 1.7.0, 1.8.0
>            Reporter: Rahul Challapalli
>            Assignee: Jinfeng Ni
>            Priority: Critical
>             Fix For: 1.8.0
>
>         Attachments: l_3level.tgz
>
>
> git.commit.id.abbrev=0700c6b
> The query below returns wrong results:
> {code}
> select count (*) from (
>   select l_orderkey, dir0 from l_3level t1
>   where t1.dir0 = 1 and t1.dir1='one' and t1.dir2 = '2015-7-12'
>   union all
>   select l_orderkey, dir0 from l_3level t2
>   where t2.dir0 = 1 and t2.dir1='two' and t2.dir2 = '2015-8-12') data;
> +---------+
> | EXPR$0  |
> +---------+
> | 20      |
> +---------+
> {code}
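> The wrong result is evident from the output of the queries below: the two
> branches return 30 and 10 rows on their own, so the UNION ALL should return
> 40 rows, not 20.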
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> select count (*) from (select 
> l_orderkey, dir0 from l_3level t2 where t2.dir0 = 1 and t2.dir1='two' and 
> t2.dir2 = '2015-8-12');
> +---------+
> | EXPR$0  |
> +---------+
> | 30      |
> +---------+
> 1 row selected (0.258 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> select count (*) from (select 
> l_orderkey, dir0 from l_3level t2 where t2.dir0 = 1 and t2.dir1='one' and 
> t2.dir2 = '2015-7-12');
> +---------+
> | EXPR$0  |
> +---------+
> | 10      |
> +---------+
> {code}
> I attached the data set. Let me know if you need anything more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
