[ https://issues.apache.org/jira/browse/DRILL-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908160#comment-14908160 ]
Aman Sinha commented on DRILL-3830:
-----------------------------------
Let's also run a query with a regular (non-window) aggregate on both systems
at scale factor 1000:
{code}
SELECT SUM(ss.ss_net_paid_inc_tax) AS sum1 FROM store_sales ss GROUP BY
ss.ss_store_sk ORDER BY sum1 DESC LIMIT 10;
{code}
This will show whether the discrepancy affects aggregation in general or is
specific to window functions. Also repeat the runs with MIN and MAX as a
sanity check.
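For the MIN and MAX runs, something along these lines should work. This is
only a sketch: it assumes the same store_sales table with the ss alias used in
the issue description, and the output names min1 and max1 are placeholders.
{code:sql}
-- Per-store MIN and MAX of the same column, ordered by store key so the
-- rows from both systems can be compared side by side.
SELECT ss.ss_store_sk,
       MIN(ss.ss_net_paid_inc_tax) AS min1,
       MAX(ss.ss_net_paid_inc_tax) AS max1
FROM store_sales ss
GROUP BY ss.ss_store_sk
ORDER BY ss.ss_store_sk
LIMIT 10;
{code}
MIN and MAX do not accumulate values, so they are insensitive to summation
order. A mismatch there would suggest that the two systems are seeing
different input rows, not that the arithmetic differs.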
> Query with aggregate window functions returns possibly wrong results on large
> scale data
> ----------------------------------------------------------------------------------------
>
> Key: DRILL-3830
> URL: https://issues.apache.org/jira/browse/DRILL-3830
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Affects Versions: 1.2.0
> Environment: 10 Performance Nodes
> DRILL_MAX_DIRECT_MEMORY=100g
> DRILL_INIT_HEAP="8g"
> DRILL_MAX_HEAP="8g"
> planner.memory.query_max_memory_per_node bumped up to 20 GB
> TPC-DS SF 1000 dataset (Parquet)
> Reporter: Abhishek Girish
> Assignee: Deneche A. Hakim
> Attachments: drill_sf1_plan.txt, gpdb_sf1000_plan.txt,
> gpdb_sf1_plan.txt
>
>
> Results returned by the following two queries slightly differ from those
> returned by Greenplum DB.
> {code:sql}
> SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) FROM
> store_sales ss LIMIT 1;
> SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk ORDER BY
> ss.ss_store_sk) FROM store_sales ss LIMIT 2;
> Drill:
> 9.653697131700665E9
> Greenplum DB:
> 9.628946925860903E9
> P.S. Both queries return same results
> {code}
> I was unable to reproduce this at a smaller scale (tried SF 1). I'll attach
> plans from both systems.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)