[
https://issues.apache.org/jira/browse/SPARK-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
JESSE CHEN updated SPARK-13865:
-------------------------------
Description:
Testing Spark SQL using TPC queries. Query 87 returns wrong results compared to
official result set. This is at 1GB SF (validation run).
SparkSQL returns count of 47555, answer set expects 47298.
Actual results:
[47555]
Expected:
+-------+
| 1 |
+-------+
| 47298 |
+-------+
Query used:
-- start query 87 in stream 0 using template query87.tpl and seed QUALIFICATION
select count(*)
from
(select distinct c_last_name as cln1, c_first_name as cfn1, d_date as
ddate1, 1 as notnull1
from store_sales
JOIN date_dim ON store_sales.ss_sold_date_sk = date_dim.d_date_sk
JOIN customer ON store_sales.ss_customer_sk = customer.c_customer_sk
where
d_month_seq between 1200 and 1200+11
) tmp1
left outer join
(select distinct c_last_name as cln2, c_first_name as cfn2, d_date as
ddate2, 1 as notnull2
from catalog_sales
JOIN date_dim ON catalog_sales.cs_sold_date_sk = date_dim.d_date_sk
JOIN customer ON catalog_sales.cs_bill_customer_sk =
customer.c_customer_sk
where
d_month_seq between 1200 and 1200+11
) tmp2
on (tmp1.cln1 = tmp2.cln2)
and (tmp1.cfn1 = tmp2.cfn2)
and (tmp1.ddate1= tmp2.ddate2)
left outer join
(select distinct c_last_name as cln3, c_first_name as cfn3 , d_date as
ddate3, 1 as notnull3
from web_sales
JOIN date_dim ON web_sales.ws_sold_date_sk = date_dim.d_date_sk
JOIN customer ON web_sales.ws_bill_customer_sk = customer.c_customer_sk
where
d_month_seq between 1200 and 1200+11
) tmp3
on (tmp1.cln1 = tmp3.cln3)
and (tmp1.cfn1 = tmp3.cfn3)
and (tmp1.ddate1= tmp3.ddate3)
where
notnull2 is null and notnull3 is null
;
-- end query 87 in stream 0 using template query87.tpl
was:
Testing Spark SQL using TPC queries. Query 74 returns wrong results compared to
official result set. This is at 1GB SF (validation run).
Spark SQL has right answer but in wrong order (and there is an 'order by' in
the query).
Actual results:
[AAAAAAAABLEIBAAA,Paula,Wakefield]
[AAAAAAAADFIEBAAA,John,Gray]
[AAAAAAAAOCLBBAAA,null,null]
[AAAAAAAAPKBCBAAA,Andrea,White]
[AAAAAAAAEJDLAAAA,Alice,Wright]
[AAAAAAAAFACEAAAA,Priscilla,Miller]
[AAAAAAAALFKKAAAA,Ignacio,Miller]
[AAAAAAAALJNCBAAA,George,Gamez]
[AAAAAAAALIOPAAAA,Derek,Allen]
[AAAAAAAAEADJAAAA,Ruth,Carroll]
[AAAAAAAAJGMMAAAA,Richard,Larson]
[AAAAAAAAPKIKAAAA,Wendy,Horvath]
[AAAAAAAAFJHFAAAA,Larissa,Roy]
[AAAAAAAAEPOGAAAA,Felisha,Mendes]
[AAAAAAAAEKJLAAAA,Aisha,Carlson]
[AAAAAAAAHNFHAAAA,Rebecca,Wilson]
[AAAAAAAAIBFCBAAA,Ruth,Grantham]
[AAAAAAAAOPDLAAAA,Ann,Pence]
[AAAAAAAANIPLAAAA,Eric,Lawrence]
[AAAAAAAAOCICAAAA,Zachary,Pennington]
[AAAAAAAAOFLCAAAA,James,Taylor]
[AAAAAAAAGEHIAAAA,Tyler,Miller]
[AAAAAAAACADPAAAA,Cristobal,Thomas]
[AAAAAAAAJIALAAAA,Santos,Gutierrez]
[AAAAAAAAPMMBBAAA,Paul,Jordan]
[AAAAAAAADIIOAAAA,David,Carroll]
[AAAAAAAADFKABAAA,Latoya,Craft]
[AAAAAAAAHMOIAAAA,Grace,Henderson]
[AAAAAAAAPPIBBAAA,Candice,Lee]
[AAAAAAAAJONHBAAA,Warren,Orozco]
[AAAAAAAAGNDAAAAA,Terry,Mcdowell]
[AAAAAAAACIJMAAAA,Elizabeth,Thomas]
[AAAAAAAADIJGBAAA,Ruth,Sanders]
[AAAAAAAANFBDBAAA,Vernice,Fernandez]
[AAAAAAAAIDKFAAAA,Michael,Mack]
[AAAAAAAAIMHBAAAA,Kathy,Knowles]
[AAAAAAAALHMCAAAA,Brooke,Nelson]
[AAAAAAAACFCGBAAA,Marcus,Sanders]
[AAAAAAAANJHCBAAA,Christopher,Schreiber]
[AAAAAAAAPDFBAAAA,Terrance,Banks]
[AAAAAAAAANFAAAAA,Philip,Banks]
[AAAAAAAAIADEBAAA,Diane,Aldridge]
[AAAAAAAAICHFAAAA,Linda,Mccoy]
[AAAAAAAACFENAAAA,Christopher,Dawson]
[AAAAAAAAKOJJAAAA,Gracie,Mendoza]
[AAAAAAAAFOJAAAAA,Don,Castillo]
[AAAAAAAAFGPGAAAA,Albert,Wadsworth]
[AAAAAAAAKJBKAAAA,Georgia,Scott]
[AAAAAAAAEKFPAAAA,Annika,Chin]
[AAAAAAAAIBAEBAAA,Sandra,Wilson]
[AAAAAAAAMFFLAAAA,Margret,Gray]
[AAAAAAAAKNAKAAAA,Gladys,Banks]
[AAAAAAAACJDIAAAA,James,Kerr]
[AAAAAAAAOBADBAAA,Elizabeth,Burnham]
[AAAAAAAAAMGDAAAA,Kenneth,Harlan]
[AAAAAAAAHJLAAAAA,Audrey,Beltran]
[AAAAAAAAAOPFBAAA,Jerry,Fields]
[AAAAAAAACNAGBAAA,Virginia,May]
[AAAAAAAAHGOABAAA,Sonia,White]
[AAAAAAAAKBCABAAA,Debra,Bell]
[AAAAAAAANJAGAAAA,Allen,Hood]
[AAAAAAAAMMOBBAAA,Margaret,Smith]
[AAAAAAAANGDBBAAA,Carlos,Jewell]
[AAAAAAAAFOGIAAAA,Michelle,Greene]
[AAAAAAAAJEKFBAAA,Norma,Burkholder]
[AAAAAAAAOCAJAAAA,Jenna,Staton]
[AAAAAAAAPFCLAAAA,Felicia,Neville]
[AAAAAAAADLHBBAAA,Henry,Bertrand]
[AAAAAAAADBEFBAAA,Bennie,Bowers]
[AAAAAAAADCKOAAAA,Robert,Gonzalez]
[AAAAAAAAKKGEAAAA,Katie,Dunbar]
[AAAAAAAAGFMDBAAA,Kathleen,Gibson]
[AAAAAAAAIJEMAAAA,Charlie,Cummings]
[AAAAAAAAKJBLAAAA,Kerry,Davis]
[AAAAAAAAJKBNAAAA,Julie,Kern]
[AAAAAAAAMDCAAAAA,Louann,Hamel]
[AAAAAAAAEOAKAAAA,Molly,Benjamin]
[AAAAAAAAIBHHAAAA,Jennifer,Ballard]
[AAAAAAAAPJENAAAA,Ashley,Norton]
[AAAAAAAAKLHHBAAA,Manuel,Castaneda]
[AAAAAAAAIMHHBAAA,Lillian,Davidson]
[AAAAAAAAGHPBBAAA,Nick,Mendez]
[AAAAAAAABNBBAAAA,Irma,Smith]
[AAAAAAAAFBAHAAAA,Michael,Williams]
[AAAAAAAAPEHEBAAA,Edith,Molina]
[AAAAAAAAFMHIAAAA,Emilio,Darling]
[AAAAAAAAKAECAAAA,Milton,Mackey]
[AAAAAAAAOCDJAAAA,Nina,Sanchez]
[AAAAAAAAFGIGAAAA,Eduardo,Miller]
[AAAAAAAAFHACBAAA,null,null]
[AAAAAAAAHMJNAAAA,Ryan,Baptiste]
[AAAAAAAAHHCABAAA,William,Stewart]
Expected results:
+------------------+---------------------+--------------------+
| CUSTOMER_ID | CUSTOMER_FIRST_NAME | CUSTOMER_LAST_NAME |
+------------------+---------------------+--------------------+
| AAAAAAAAAMGDAAAA | Kenneth | Harlan |
| AAAAAAAAANFAAAAA | Philip | Banks |
| AAAAAAAAAOPFBAAA | Jerry | Fields |
| AAAAAAAABLEIBAAA | Paula | Wakefield |
| AAAAAAAABNBBAAAA | Irma | Smith |
| AAAAAAAACADPAAAA | Cristobal | Thomas |
| AAAAAAAACFCGBAAA | Marcus | Sanders |
| AAAAAAAACFENAAAA | Christopher | Dawson |
| AAAAAAAACIJMAAAA | Elizabeth | Thomas |
| AAAAAAAACJDIAAAA | James | Kerr |
| AAAAAAAACNAGBAAA | Virginia | May |
| AAAAAAAADBEFBAAA | Bennie | Bowers |
| AAAAAAAADCKOAAAA | Robert | Gonzalez |
| AAAAAAAADFIEBAAA | John | Gray |
| AAAAAAAADFKABAAA | Latoya | Craft |
| AAAAAAAADIIOAAAA | David | Carroll |
| AAAAAAAADIJGBAAA | Ruth | Sanders |
| AAAAAAAADLHBBAAA | Henry | Bertrand |
| AAAAAAAAEADJAAAA | Ruth | Carroll |
| AAAAAAAAEJDLAAAA | Alice | Wright |
| AAAAAAAAEKFPAAAA | Annika | Chin |
| AAAAAAAAEKJLAAAA | Aisha | Carlson |
| AAAAAAAAEOAKAAAA | Molly | Benjamin |
| AAAAAAAAEPOGAAAA | Felisha | Mendes |
| AAAAAAAAFACEAAAA | Priscilla | Miller |
| AAAAAAAAFBAHAAAA | Michael | Williams |
| AAAAAAAAFGIGAAAA | Eduardo | Miller |
| AAAAAAAAFGPGAAAA | Albert | Wadsworth |
| AAAAAAAAFHACBAAA | [NULL] | [NULL] |
| AAAAAAAAFJHFAAAA | Larissa | Roy |
| AAAAAAAAFMHIAAAA | Emilio | Darling |
| AAAAAAAAFOGIAAAA | Michelle | Greene |
| AAAAAAAAFOJAAAAA | Don | Castillo |
| AAAAAAAAGEHIAAAA | Tyler | Miller |
| AAAAAAAAGFMDBAAA | Kathleen | Gibson |
| AAAAAAAAGHPBBAAA | Nick | Mendez |
| AAAAAAAAGNDAAAAA | Terry | Mcdowell |
| AAAAAAAAHGOABAAA | Sonia | White |
| AAAAAAAAHHCABAAA | William | Stewart |
| AAAAAAAAHJLAAAAA | Audrey | Beltran |
| AAAAAAAAHMJNAAAA | Ryan | Baptiste |
| AAAAAAAAHMOIAAAA | Grace | Henderson |
| AAAAAAAAHNFHAAAA | Rebecca | Wilson |
| AAAAAAAAIADEBAAA | Diane | Aldridge |
| AAAAAAAAIBAEBAAA | Sandra | Wilson |
| AAAAAAAAIBFCBAAA | Ruth | Grantham |
| AAAAAAAAIBHHAAAA | Jennifer | Ballard |
| AAAAAAAAICHFAAAA | Linda | Mccoy |
| AAAAAAAAIDKFAAAA | Michael | Mack |
| AAAAAAAAIJEMAAAA | Charlie | Cummings |
| AAAAAAAAIMHBAAAA | Kathy | Knowles |
| AAAAAAAAIMHHBAAA | Lillian | Davidson |
| AAAAAAAAJEKFBAAA | Norma | Burkholder |
| AAAAAAAAJGMMAAAA | Richard | Larson |
| AAAAAAAAJIALAAAA | Santos | Gutierrez |
| AAAAAAAAJKBNAAAA | Julie | Kern |
| AAAAAAAAJONHBAAA | Warren | Orozco |
| AAAAAAAAKAECAAAA | Milton | Mackey |
| AAAAAAAAKBCABAAA | Debra | Bell |
| AAAAAAAAKJBKAAAA | Georgia | Scott |
| AAAAAAAAKJBLAAAA | Kerry | Davis |
| AAAAAAAAKKGEAAAA | Katie | Dunbar |
| AAAAAAAAKLHHBAAA | Manuel | Castaneda |
| AAAAAAAAKNAKAAAA | Gladys | Banks |
| AAAAAAAAKOJJAAAA | Gracie | Mendoza |
| AAAAAAAALFKKAAAA | Ignacio | Miller |
| AAAAAAAALHMCAAAA | Brooke | Nelson |
| AAAAAAAALIOPAAAA | Derek | Allen |
| AAAAAAAALJNCBAAA | George | Gamez |
| AAAAAAAAMDCAAAAA | Louann | Hamel |
| AAAAAAAAMFFLAAAA | Margret | Gray |
| AAAAAAAAMMOBBAAA | Margaret | Smith |
| AAAAAAAANFBDBAAA | Vernice | Fernandez |
| AAAAAAAANGDBBAAA | Carlos | Jewell |
| AAAAAAAANIPLAAAA | Eric | Lawrence |
| AAAAAAAANJAGAAAA | Allen | Hood |
| AAAAAAAANJHCBAAA | Christopher | Schreiber |
| AAAAAAAAOBADBAAA | Elizabeth | Burnham |
| AAAAAAAAOCAJAAAA | Jenna | Staton |
| AAAAAAAAOCDJAAAA | Nina | Sanchez |
| AAAAAAAAOCICAAAA | Zachary | Pennington |
| AAAAAAAAOCLBBAAA | [NULL] | [NULL] |
| AAAAAAAAOFLCAAAA | James | Taylor |
| AAAAAAAAOPDLAAAA | Ann | Pence |
| AAAAAAAAPDFBAAAA | Terrance | Banks |
| AAAAAAAAPEHEBAAA | Edith | Molina |
| AAAAAAAAPFCLAAAA | Felicia | Neville |
| AAAAAAAAPJENAAAA | Ashley | Norton |
| AAAAAAAAPKBCBAAA | Andrea | White |
| AAAAAAAAPKIKAAAA | Wendy | Horvath |
| AAAAAAAAPMMBBAAA | Paul | Jordan |
| AAAAAAAAPPIBBAAA | Candice | Lee |
+------------------+---------------------+--------------------+
Query used:
-- start query 74 in stream 0 using template query74.tpl and seed QUALIFICATION
with year_total as (
select c_customer_id customer_id
,c_first_name customer_first_name
,c_last_name customer_last_name
,d_year as year
,sum(ss_net_paid) year_total
,'s' sale_type
from customer
,store_sales
,date_dim
where c_customer_sk = ss_customer_sk
and ss_sold_date_sk = d_date_sk
and d_year in (2001,2001+1)
group by c_customer_id
,c_first_name
,c_last_name
,d_year
union all
select c_customer_id customer_id
,c_first_name customer_first_name
,c_last_name customer_last_name
,d_year as year
,sum(ws_net_paid) year_total
,'w' sale_type
from customer
,web_sales
,date_dim
where c_customer_sk = ws_bill_customer_sk
and ws_sold_date_sk = d_date_sk
and d_year in (2001,2001+1)
group by c_customer_id
,c_first_name
,c_last_name
,d_year
)
select
t_s_secyear.customer_id, t_s_secyear.customer_first_name,
t_s_secyear.customer_last_name
from year_total t_s_firstyear
,year_total t_s_secyear
,year_total t_w_firstyear
,year_total t_w_secyear
where t_s_secyear.customer_id = t_s_firstyear.customer_id
and t_s_firstyear.customer_id = t_w_secyear.customer_id
and t_s_firstyear.customer_id = t_w_firstyear.customer_id
and t_s_firstyear.sale_type = 's'
and t_w_firstyear.sale_type = 'w'
and t_s_secyear.sale_type = 's'
and t_w_secyear.sale_type = 'w'
and t_s_firstyear.year = 2001
and t_s_secyear.year = 2001+1
and t_w_firstyear.year = 2001
and t_w_secyear.year = 2001+1
and t_s_firstyear.year_total > 0
and t_w_firstyear.year_total > 0
and case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total
/ t_w_firstyear.year_total else null end
> case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total
/ t_s_firstyear.year_total else null end
order by 1,1,1
limit 100;
-- end query 74 in stream 0 using template query74.tpl
> TPCDS query 87 returns wrong results compared to TPC official result set
> -------------------------------------------------------------------------
>
> Key: SPARK-13865
> URL: https://issues.apache.org/jira/browse/SPARK-13865
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.0
> Reporter: JESSE CHEN
>
> Testing Spark SQL using TPC queries. Query 87 returns wrong results compared
> to official result set. This is at 1GB SF (validation run).
> SparkSQL returns count of 47555, answer set expects 47298.
> Actual results:
> [47555]
> Expected:
> +-------+
> | 1 |
> +-------+
> | 47298 |
> +-------+
> Query used:
> -- start query 87 in stream 0 using template query87.tpl and seed
> QUALIFICATION
> select count(*)
> from
> (select distinct c_last_name as cln1, c_first_name as cfn1, d_date as
> ddate1, 1 as notnull1
> from store_sales
> JOIN date_dim ON store_sales.ss_sold_date_sk = date_dim.d_date_sk
> JOIN customer ON store_sales.ss_customer_sk = customer.c_customer_sk
> where
> d_month_seq between 1200 and 1200+11
> ) tmp1
> left outer join
> (select distinct c_last_name as cln2, c_first_name as cfn2, d_date as
> ddate2, 1 as notnull2
> from catalog_sales
> JOIN date_dim ON catalog_sales.cs_sold_date_sk = date_dim.d_date_sk
> JOIN customer ON catalog_sales.cs_bill_customer_sk =
> customer.c_customer_sk
> where
> d_month_seq between 1200 and 1200+11
> ) tmp2
> on (tmp1.cln1 = tmp2.cln2)
> and (tmp1.cfn1 = tmp2.cfn2)
> and (tmp1.ddate1= tmp2.ddate2)
> left outer join
> (select distinct c_last_name as cln3, c_first_name as cfn3 , d_date as
> ddate3, 1 as notnull3
> from web_sales
> JOIN date_dim ON web_sales.ws_sold_date_sk = date_dim.d_date_sk
> JOIN customer ON web_sales.ws_bill_customer_sk =
> customer.c_customer_sk
> where
> d_month_seq between 1200 and 1200+11
> ) tmp3
> on (tmp1.cln1 = tmp3.cln3)
> and (tmp1.cfn1 = tmp3.cfn3)
> and (tmp1.ddate1= tmp3.ddate3)
> where
> notnull2 is null and notnull3 is null
> ;
> -- end query 87 in stream 0 using template query87.tpl
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]