[ https://issues.apache.org/jira/browse/SPARK-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
JESSE CHEN updated SPARK-13865:
-------------------------------
Labels: tpcds-result-mismatch (was: )
> TPCDS query 87 returns wrong results compared to TPC official result set
> -------------------------------------------------------------------------
>
> Key: SPARK-13865
> URL: https://issues.apache.org/jira/browse/SPARK-13865
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.0
> Reporter: JESSE CHEN
> Labels: tpcds-result-mismatch
>
> Testing Spark SQL with the TPC-DS queries. Query 87 returns wrong results
> compared to the official result set. This is at the 1GB scale factor
> (validation run).
> Spark SQL returns a count of 47555; the answer set expects 47298.
> Actual results:
> [47555]
> Expected:
> +-------+
> | 1 |
> +-------+
> | 47298 |
> +-------+
> Query used:
> -- start query 87 in stream 0 using template query87.tpl and seed QUALIFICATION
> select count(*)
> from
>   (select distinct c_last_name as cln1, c_first_name as cfn1,
>           d_date as ddate1, 1 as notnull1
>    from store_sales
>    JOIN date_dim ON store_sales.ss_sold_date_sk = date_dim.d_date_sk
>    JOIN customer ON store_sales.ss_customer_sk = customer.c_customer_sk
>    where
>      d_month_seq between 1200 and 1200+11
>   ) tmp1
> left outer join
>   (select distinct c_last_name as cln2, c_first_name as cfn2,
>           d_date as ddate2, 1 as notnull2
>    from catalog_sales
>    JOIN date_dim ON catalog_sales.cs_sold_date_sk = date_dim.d_date_sk
>    JOIN customer ON catalog_sales.cs_bill_customer_sk = customer.c_customer_sk
>    where
>      d_month_seq between 1200 and 1200+11
>   ) tmp2
>   on (tmp1.cln1 = tmp2.cln2)
>   and (tmp1.cfn1 = tmp2.cfn2)
>   and (tmp1.ddate1 = tmp2.ddate2)
> left outer join
>   (select distinct c_last_name as cln3, c_first_name as cfn3,
>           d_date as ddate3, 1 as notnull3
>    from web_sales
>    JOIN date_dim ON web_sales.ws_sold_date_sk = date_dim.d_date_sk
>    JOIN customer ON web_sales.ws_bill_customer_sk = customer.c_customer_sk
>    where
>      d_month_seq between 1200 and 1200+11
>   ) tmp3
>   on (tmp1.cln1 = tmp3.cln3)
>   and (tmp1.cfn1 = tmp3.cfn3)
>   and (tmp1.ddate1 = tmp3.ddate3)
> where
>   notnull2 is null and notnull3 is null
> ;
> -- end query 87 in stream 0 using template query87.tpl
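
The query above expresses "rows in tmp1 with no match in tmp2 or tmp3" as a
left outer join followed by an IS NULL filter on a marker column. The sketch
below reproduces that pattern on a tiny, made-up data set and compares it with
a set-difference (EXCEPT) formulation. It is not a confirmed diagnosis of this
issue. It only illustrates one general SQL behavior that may be worth checking:
an equality join never matches rows whose key is NULL, whereas EXCEPT treats
NULLs as equal. The rows, the two-table simplification, and the use of Python's
sqlite3 module instead of Spark are all assumptions made for illustration.

```python
import sqlite3

# In-memory database with hypothetical rows; column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tmp1 (cln1 TEXT, cfn1 TEXT, ddate1 TEXT);
    CREATE TABLE tmp2 (cln2 TEXT, cfn2 TEXT, ddate2 TEXT, notnull2 INTEGER);
    INSERT INTO tmp1 VALUES ('Smith', 'Ann', '2000-01-01'),
                            ('Jones', 'Bob', '2000-01-02'),
                            (NULL,    'Cid', '2000-01-03');
    INSERT INTO tmp2 VALUES ('Smith', 'Ann', '2000-01-01', 1),
                            (NULL,    'Cid', '2000-01-03', 1);
""")

# Anti-join written as in the issue: left outer join plus IS NULL on the marker.
anti_join_sql = """
    SELECT count(*)
    FROM tmp1
    LEFT OUTER JOIN tmp2
      ON tmp1.cln1 = tmp2.cln2
     AND tmp1.cfn1 = tmp2.cfn2
     AND tmp1.ddate1 = tmp2.ddate2
    WHERE notnull2 IS NULL
"""

# Set-difference form: EXCEPT compares whole rows and treats NULLs as equal.
except_sql = """
    SELECT count(*)
    FROM (SELECT cln1, cfn1, ddate1 FROM tmp1
          EXCEPT
          SELECT cln2, cfn2, ddate2 FROM tmp2)
"""

anti_count = conn.execute(anti_join_sql).fetchone()[0]
except_count = conn.execute(except_sql).fetchone()[0]

# The row with a NULL last name survives the anti-join but not the EXCEPT.
print("left join + IS NULL:", anti_count)
print("EXCEPT:", except_count)
```

On this data the anti-join counts one more row than the EXCEPT form, because
the row with a NULL last name cannot match through `=`. If the reference answer
set was produced by a set-difference form of the query, counting customer rows
with NULL c_last_name or c_first_name in the 1GB data would show whether they
account for the gap between 47555 and 47298.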
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)