[ 
https://issues.apache.org/jira/browse/DRILL-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230493#comment-15230493
 ] 

Aman Sinha commented on DRILL-4590:
-----------------------------------

Based on an initial look, this seems to be an issue with decorrelation of the 
correlated subquery.  In the query against the view, a group-by is being done 
on {{c_custkey}} even though this column is not referenced anywhere in the 
query:
{noformat}
00-15 HashAgg(group=[{0}])
00-17 Project(c_custkey=[$0])
{noformat}

This is the wrong column for the group-by. It should be {{c_nationkey}}, which 
is the correlation column. 
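
To illustrate what a correct decorrelation looks like, here is a minimal sketch. 
It is not Drill's planner output: it uses SQLite through Python's standard 
{{sqlite3}} module, a tiny made-up data set, and the column names from the q17 
query quoted below, where the correlation column is assumed to be {{l_partkey}} 
(the plan excerpt above names different columns). The function {{run_demo}} and 
the alias {{threshold}} are hypothetical names chosen for the example. The point 
is only that the decorrelated form must group by the correlation column and 
join back on it.

```python
import sqlite3


def run_demo():
    """Run q17 in correlated and hand-decorrelated form on toy data."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # Toy tables (assumed data, not TPC-H): two parts, three line items each.
    cur.executescript("""
        CREATE TABLE part (p_partkey INTEGER, p_brand TEXT, p_container TEXT);
        CREATE TABLE lineitem (l_partkey INTEGER, l_quantity REAL,
                               l_extendedprice REAL);
        INSERT INTO part VALUES (1, 'Brand#13', 'JUMBO CAN'),
                                (2, 'Brand#13', 'JUMBO CAN');
        INSERT INTO lineitem VALUES
            (1, 1.0, 70.0), (1, 10.0, 700.0), (1, 19.0, 1330.0),
            (2, 2.0, 140.0), (2, 50.0, 3500.0), (2, 98.0, 6860.0);
    """)

    # Original form: the subquery is correlated on p.p_partkey.
    correlated = cur.execute("""
        SELECT SUM(l.l_extendedprice) / 7.0
        FROM lineitem l, part p
        WHERE p.p_partkey = l.l_partkey
          AND p.p_brand = 'Brand#13'
          AND p.p_container = 'JUMBO CAN'
          AND l.l_quantity < (SELECT 0.2 * AVG(l2.l_quantity)
                              FROM lineitem l2
                              WHERE l2.l_partkey = p.p_partkey)
    """).fetchone()[0]

    # Hand-decorrelated form: aggregate once per correlation key
    # (GROUP BY l_partkey), then join the result back on that same key.
    decorrelated = cur.execute("""
        SELECT SUM(l.l_extendedprice) / 7.0
        FROM lineitem l
        JOIN part p ON p.p_partkey = l.l_partkey
        JOIN (SELECT l2.l_partkey AS l_partkey,
                     0.2 * AVG(l2.l_quantity) AS threshold
              FROM lineitem l2
              GROUP BY l2.l_partkey) t ON t.l_partkey = p.p_partkey
        WHERE p.p_brand = 'Brand#13'
          AND p.p_container = 'JUMBO CAN'
          AND l.l_quantity < t.threshold
    """).fetchone()[0]

    conn.close()
    return correlated, decorrelated


if __name__ == "__main__":
    print(run_demo())
```

If the inner aggregate were grouped by any column other than the correlation 
column, the join back to the outer query would no longer pair each outer row 
with the average for its own key, which is consistent with the wrong result 
reported against the views.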

> TPC-H q17 returns wrong results when applied to views
> -----------------------------------------------------
>
>                 Key: DRILL-4590
>                 URL: https://issues.apache.org/jira/browse/DRILL-4590
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 1.6.0
>         Environment: RHEL 6.4  2.6.32-358.el6.x86_64
>            Reporter: Dechang Gu
>            Priority: Critical
>
> When running TPC-H queries on views created from parquet tables, query 17 
> returned wrong results: 
> [root@ucs-node1 bugs]# /opt/mapr/drill/drill-1.6.0/bin/sqlline -u 
> "jdbc:drill:schema=dfs.tpchViews" -f /tmp/TPCH_17.sql 
> 1/1          select 
> sum(l.l_extendedprice) / 7.0 as avg_yearly 
> from 
> lineitem l, 
> part p 
> where 
> p.p_partkey = l.l_partkey 
> and p.p_brand = 'Brand#13' 
> and p.p_container = 'JUMBO CAN' 
> and l.l_quantity < ( 
> select 
> 0.2 * avg(l2.l_quantity) 
> from 
> lineitem l2 
> where 
> l2.l_partkey = p.p_partkey 
> );
> +---------------------+
> |     avg_yearly      |
> +---------------------+
> | 1139490.7042857148  |
> +---------------------+
> 1 row selected (20.364 seconds)
> The same query run directly on the parquet tables returns the correct results:
> [root@ucs-node1 bugs]# /opt/mapr/drill/drill-1.6.0/bin/sqlline -u 
> "jdbc:drill:schema=dfs.parquet" -f /tmp/17_par100.q 
> 1/1          select 
> sum(l.l_extendedprice) / 7.0 as avg_yearly 
> from 
> lineitem_par100 l, 
> part_par100 p 
> where 
> p.p_partkey = l.l_partkey 
> and p.p_brand = 'Brand#13' 
> and p.p_container = 'JUMBO CAN' 
> and l.l_quantity < ( 
> select 
> 0.2 * avg(l2.l_quantity) 
> from 
> lineitem_par100 l2 
> where 
> l2.l_partkey = p.p_partkey 
> );
> +----------------------+
> |      avg_yearly      |
> +----------------------+
> | 3.237333813714285E7  |
> +----------------------+
> 1 row selected (25.266 seconds)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
