[ 
https://issues.apache.org/jira/browse/DRILL-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553168#comment-14553168
 ] 

Aman Sinha commented on DRILL-3094:
-----------------------------------

The non-determinism occurs because the view computes a SUM over floating-point 
values. Floating-point addition is not associative, so depending on the 
parallelism, the order in which rows are aggregated can differ slightly between 
runs and produce a slightly different sum. Comparing this sum for exact equality 
with max(total_revenue) may therefore not find a match on every run. This is 
expected behavior for Drill. Postgres is deterministic because it is not doing 
distributed execution. 

One way to resolve this could be to use the round() function on both sides of 
the comparison:
{code}
    round(r.total_revenue, 2) = (select round(max(total_revenue), 2) ...)
{code}
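
To illustrate why aggregation order matters, here is a minimal Python sketch 
(not Drill code). The three values are hypothetical stand-ins for 
l_extendedprice * (1 - l_discount), and the two groupings stand in for two 
different parallel aggregation orders; none of this comes from the actual 
TPC-H data.

```python
# Hypothetical per-row revenue values (assumption, not TPC-H data).
values = [0.1, 0.2, 0.3]

# Same three values, added in two different groupings, as two different
# parallel aggregation orders might do.
sum_order_a = (values[0] + values[1]) + values[2]
sum_order_b = values[0] + (values[1] + values[2])

# Exact equality fails because floating-point addition is not associative.
exact_match = sum_order_a == sum_order_b

# Rounding both sides to 2 decimal places makes the comparison succeed here.
rounded_match = round(sum_order_a, 2) == round(sum_order_b, 2)

print(exact_match, rounded_match)
```

Rounding makes a mismatch much less likely but may not remove it entirely: 
two sums that fall on either side of a rounding boundary could still round to 
different values.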

> TPCH query 15 returns non-deterministic result
> ----------------------------------------------
>
>                 Key: DRILL-3094
>                 URL: https://issues.apache.org/jira/browse/DRILL-3094
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.0.0
>            Reporter: Abhishek Girish
>            Assignee: Aman Sinha
>
> Query 15:
> {code:sql}
> create or replace view revenue0 (supplier_no, total_revenue) as
>   select
>     l_suppkey,
>     sum(l_extendedprice * (1 - l_discount))
>   from
>     lineitem
>   where
>     l_shipdate >= date '1993-05-01'
>     and l_shipdate < date '1993-05-01' + interval '3' month
>   group by
>     l_suppkey;
> select
>   s.s_suppkey,
>   s.s_name,
>   s.s_address,
>   s.s_phone,
>   r.total_revenue
> from
>   supplier s,
>   revenue0 r
> where
>   s.s_suppkey = r.supplier_no
>   and r.total_revenue = (
>     select
>       max(total_revenue)
>     from
>       revenue0
>   )
> order by
>   s.s_suppkey;
> {code}
> Drill sometimes returns 0 rows and other times 1 row. Postgres always returns 
> 1 row. 
> This is possibly due to the non-deterministic comparison of floating point 
> values. 
> {code}total_revenue (calculated as sum(l_extendedprice * (1 - l_discount))){code} 
> is compared with {code}max(total_revenue){code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
