Abhishek Girish created DRILL-3830:
--------------------------------------

             Summary: Query with aggregate window functions returns possibly 
wrong results on large scale data
                 Key: DRILL-3830
                 URL: https://issues.apache.org/jira/browse/DRILL-3830
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.2.0
         Environment: 10 Performance Nodes
DRILL_MAX_DIRECT_MEMORY=100g
DRILL_INIT_HEAP="8g"
DRILL_MAX_HEAP="8g"
planner.memory.query_max_memory_per_node bumped up to 20 GB
TPC-DS SF 1000 dataset (Parquet)
            Reporter: Abhishek Girish
            Assignee: Deneche A. Hakim


Results returned by the following two queries differ slightly from those 
returned by Greenplum DB.

{code:sql}
SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk)
FROM store_sales ss
LIMIT 1;

SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk ORDER BY ss.ss_store_sk)
FROM store_sales ss
LIMIT 2;
{code}

Both queries return the same result:

Drill:        9.653697131700665E9
Greenplum DB: 9.628946925860903E9

I was unable to reproduce this at a smaller scale (tried SF 1). I'll attach 
plans from both systems.
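
A possible cross-check, sketched here as a suggestion and not something that has 
been run for this report: the windowed SUM for a partition should equal the plain 
GROUP BY aggregate for the same ss_store_sk. Because LIMIT without an outer 
ORDER BY does not fix which row is returned, comparing per-store totals and row 
counts on both systems makes the comparison explicit. The aliases total_paid and 
row_count are illustrative names only.

{code:sql}
-- Per-store totals without a window function; run on both systems and
-- compare row by row. A differing row_count would point at the input rows
-- being read, a differing total_paid with equal counts at the aggregation.
SELECT ss.ss_store_sk,
       SUM(ss.ss_net_paid_inc_tax) AS total_paid,
       COUNT(*)                    AS row_count
FROM store_sales ss
GROUP BY ss.ss_store_sk
ORDER BY ss.ss_store_sk;
{code}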



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
