Abhishek Girish created DRILL-3830:
--------------------------------------
Summary: Query with aggregate window functions returns possibly
wrong results on large scale data
Key: DRILL-3830
URL: https://issues.apache.org/jira/browse/DRILL-3830
Project: Apache Drill
Issue Type: Bug
Components: Execution - Relational Operators
Affects Versions: 1.2.0
Environment: 10 Performance Nodes
DRILL_MAX_DIRECT_MEMORY=100g
DRILL_INIT_HEAP="8g"
DRILL_MAX_HEAP="8g"
planner.memory.query_max_memory_per_node bumped up to 20 GB
TPC-DS SF 1000 dataset (Parquet)
Reporter: Abhishek Girish
Assignee: Deneche A. Hakim
The results Drill returns for the following two queries differ slightly from
those returned by Greenplum DB.
{code:sql}
SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk)
FROM store_sales ss LIMIT 1;

SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk ORDER BY ss.ss_store_sk)
FROM store_sales ss LIMIT 2;
{code}
Drill:
9.653697131700665E9
Greenplum DB:
9.628946925860903E9
P.S. Both queries return the same results.
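To put a number on "slightly differ", here is a minimal sketch that computes the absolute and relative gap between the two values reported above. The variable names are illustrative only; the two constants are copied verbatim from this report. The gap comes to roughly 0.26% of the Greenplum value, which may help when judging whether floating-point summation order alone could plausibly account for it.

```python
# Values copied verbatim from the report; variable names are illustrative.
drill_sum = 9.653697131700665e9
greenplum_sum = 9.628946925860903e9

# Absolute gap between the two systems' results.
abs_diff = drill_sum - greenplum_sum

# Gap relative to the Greenplum value (used here only as the reference point,
# not as a claim about which system is correct).
rel_diff = abs_diff / greenplum_sum

print(f"absolute difference: {abs_diff:.2f}")
print(f"relative difference: {rel_diff:.4%}")
```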
I was unable to reproduce this at a smaller scale (tried SF 1). I'll attach
plans from both systems.