[jira] [Commented] (DRILL-4825) Wrong data with UNION ALL when querying different sub-directories under the same table

ASF GitHub Bot (JIRA) Thu, 04 Aug 2016 23:53:09 -0700

    [ https://issues.apache.org/jira/browse/DRILL-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408999#comment-15408999 ]


ASF GitHub Bot commented on DRILL-4825:
---------------------------------------

Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/559#discussion_r73649124
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DirPrunedEnumerableTableScan.java ---
    @@ -0,0 +1,73 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.drill.exec.planner.logical;
    +
    +import com.google.common.base.Supplier;
    +import com.google.common.collect.ImmutableList;
    +import org.apache.calcite.adapter.enumerable.EnumerableConvention;
    +import org.apache.calcite.adapter.enumerable.EnumerableTableScan;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelOptTable;
    +import org.apache.calcite.plan.RelTraitSet;
    +import org.apache.calcite.rel.RelCollation;
    +import org.apache.calcite.rel.RelCollationTraitDef;
    +import org.apache.calcite.rel.RelWriter;
    +import org.apache.calcite.schema.Table;
    +
    +import java.util.List;
    +
    +/**
    + * This class extends from EnumerableTableScan. It puts the file selection string into its digest.
    + * When directory-based partition pruning is applied, file selection could be different for the same
    + * table.
    + */
    +public class DirPrunedEnumerableTableScan extends EnumerableTableScan {
    +  private final String digestFromSelection;
    +
    +  public DirPrunedEnumerableTableScan(RelOptCluster cluster, RelTraitSet traitSet,
    --- End diff --
    
    Besides the constructor and the create() methods, I think the copy() 
method should also be overridden.  Several parts of the code make a copy of 
an EnumerableTableScan, and those copies will not see the digestFromSelection. 
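
To illustrate the concern, here is a minimal, self-contained sketch. It does not use Calcite or Drill: ToyScan, ToyPrunedScanNoCopy, ToyPrunedScan, DigestCopySketch and the selection strings are all hypothetical stand-ins. It assumes only that a planner treats nodes with equal digests as the same node, and shows how a subclass that adds the selection to its digest but inherits copy() loses that distinction after copying.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a scan node. This is NOT Calcite's EnumerableTableScan;
// it only models "a node has a digest and can be copied".
class ToyScan {
  protected final String table;

  ToyScan(String table) {
    this.table = table;
  }

  String digest() {
    return "Scan(table=" + table + ")";
  }

  ToyScan copy() {
    return new ToyScan(table);
  }
}

// Adds the file selection to the digest but does not override copy(),
// so a copy falls back to a plain ToyScan without the selection.
class ToyPrunedScanNoCopy extends ToyScan {
  protected final String digestFromSelection;

  ToyPrunedScanNoCopy(String table, String digestFromSelection) {
    super(table);
    this.digestFromSelection = digestFromSelection;
  }

  @Override
  String digest() {
    return "Scan(table=" + table + ", selection=" + digestFromSelection + ")";
  }
}

// Same digest as above, but copy() preserves the subclass and its selection.
class ToyPrunedScan extends ToyPrunedScanNoCopy {
  ToyPrunedScan(String table, String digestFromSelection) {
    super(table, digestFromSelection);
  }

  @Override
  ToyScan copy() {
    return new ToyPrunedScan(table, digestFromSelection);
  }
}

class DigestCopySketch {
  // Counts how many distinct nodes a digest-keyed registry would keep.
  static int distinctByDigest(ToyScan... scans) {
    Map<String, ToyScan> registry = new LinkedHashMap<>();
    for (ToyScan scan : scans) {
      registry.putIfAbsent(scan.digest(), scan);
    }
    return registry.size();
  }

  public static void main(String[] args) {
    // Hypothetical selection strings for two sub-directories of one table.
    ToyScan a = new ToyPrunedScanNoCopy("l_3level", "1/one/2015-7-12");
    ToyScan b = new ToyPrunedScanNoCopy("l_3level", "1/two/2015-8-12");
    System.out.println("no copy() override, after copy: "
        + distinctByDigest(a.copy(), b.copy()));

    ToyScan c = new ToyPrunedScan("l_3level", "1/one/2015-7-12");
    ToyScan d = new ToyPrunedScan("l_3level", "1/two/2015-8-12");
    System.out.println("with copy() override, after copy: "
        + distinctByDigest(c.copy(), d.copy()));
  }
}
```

In the real class the override would need to match whatever copy() signature EnumerableTableScan inherits; the toy signature above is only for illustration.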


> Wrong data with UNION ALL when querying different sub-directories under the same table
> --------------------------------------------------------------------------------------
>
>                 Key: DRILL-4825
>                 URL: https://issues.apache.org/jira/browse/DRILL-4825
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.6.0, 1.7.0, 1.8.0
>            Reporter: Rahul Challapalli
>            Assignee: Jinfeng Ni
>            Priority: Critical
>             Fix For: 1.8.0
>
>         Attachments: l_3level.tgz
>
>
> git.commit.id.abbrev=0700c6b
> The query below returns wrong results:
> {code}
> select count (*) from (
>   select l_orderkey, dir0 from l_3level t1 where t1.dir0 = 1 and t1.dir1='one' and t1.dir2 = '2015-7-12'
>   union all 
>   select l_orderkey, dir0 from l_3level t2 where t2.dir0 = 1 and t2.dir1='two' and t2.dir2 = '2015-8-12') data;
> +---------+
> | EXPR$0  |
> +---------+
> | 20      |
> +---------+
> {code}
> The result is wrong: as the queries below show, the two branches return 30 and 10 rows on their own, so the UNION ALL should return 40, not 20.
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> select count (*) from (select l_orderkey, dir0 from l_3level t2 where t2.dir0 = 1 and t2.dir1='two' and t2.dir2 = '2015-8-12');
> +---------+
> | EXPR$0  |
> +---------+
> | 30      |
> +---------+
> 1 row selected (0.258 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> select count (*) from (select l_orderkey, dir0 from l_3level t2 where t2.dir0 = 1 and t2.dir1='one' and t2.dir2 = '2015-7-12');
> +---------+
> | EXPR$0  |
> +---------+
> | 10      |
> +---------+
> {code}
> I have attached the data set. Let me know if you need anything more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
