[ https://issues.apache.org/jira/browse/SPARK-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
JESSE CHEN closed SPARK-13863. ------------------------------ Resolution: Workaround fixed schema. > TPCDS query 66 returns wrong results compared to TPC official result set > ------------------------------------------------------------------------- > > Key: SPARK-13863 > URL: https://issues.apache.org/jira/browse/SPARK-13863 > Project: Spark > Issue Type: Bug > Components: SQL > Affects Versions: 1.6.0 > Reporter: JESSE CHEN > Labels: tpcds-result-mismatch > > Testing Spark SQL using TPC queries. Query 66 returns wrong results compared > to official result set. This is at 1GB SF (validation run). > Aggregations slightly off -- eg. JAN_SALES column of "Doors canno" row - > SparkSQL returns 6355232.185385704, expected 6355232.31 > Actual results: > {noformat} > [null,null,Fairview,Williamson County,TN,United > States,DHL,BARIAN,2001,9597806.850651741,1.1121820530080795E7,8670867.81564045,8994785.945689201,1.088724806326294E7,1.4187671518377304E7,9732598.460139751,1.9798897020946026E7,2.1007842467959404E7,2.149551364927292E7,3.479566905774999E7,3.3122997954660416E7,null,null,null,null,null,null,null,null,null,null,null,null,2.1913594697555542E7,3.2518476414670944E7,2.48856624883976E7,2.5698343830046654E7,3.373591080598068E7,3.552703167087555E7,2.5465193481492043E7,5.362323870799959E7,5.1409986978201866E7,5.415917383586836E7,9.222704311805725E7,8.343539111531019E7] > [Bad cards must make.,621234,Fairview,Williamson County,TN,United > States,DHL,BARIAN,2001,9506753.593884468,8008140.429557085,6116769.711647987,1.1973045160133362E7,7756254.925520897,5352978.574095726,1.373399613500309E7,1.6418794411203384E7,1.7212743279764652E7,1.704270732417488E7,3.43049358570323E7,3.532416421229005E7,15.30301560102066,12.890698882477594,9.846160563729589,19.273003667109915,12.485238936569628,8.61668642427125,22.107605403121994,26.429323590150222,27.707342611261865,27.433635834765774,55.22063482847413,56.86128610521969,3.0534943928382874E7,2.4481686250203133E7,2.217871080008793E7,2.569579825610423E7,2.995490355044937E7,1.8084140250833035E7,3.0805576178061485E7,4.7156887432252884E7,5.115858869637826E7,5.5759943171424866E7,8.625354428184557E7,8.345155532035494E7] > [Conventional childr,977787,Fairview,Williamson County,TN,United > States,DHL,BARIAN,2001,8860645.460736752,1.441581376543355E7,6761497.232810497,1.1820654735879421E7,8246260.600341797,6636877.482845306,1.1434492123092413E7,2.5673812070380323E7,2.3074206999911785E7,2.1834582007320404E7,2.6894900596512794E7,3.357509177109933E7,9.061938296108202,14.743306840276613,6.9151024024767125,12.08919195681618,8.43359606984118,6.787651587559771,11.694256645969329,26.257060147435304,23.598398219562938,22.330611889215547,27.505888906799534,34.337838170377935,2.3836085704864502E7,3.20733132298584E7,2.503790437837982E7,2.2659895963564873E7,2.175740087420273E7,2.4451608012176514E7,2.1933001734852314E7,5.59967034604629E7,5.737188052299309E7,6.208721474336243E7,8.284991027382469E7,8.897031933202875E7] > [Doors canno,294242,Fairview,Williamson County,TN,United > States,DHL,BARIAN,2001,6355232.185385704,1.0198920296742141E7,1.0246200903741479E7,1.2209716492156029E7,8566998.262890816,8806316.75278151,9789405.6993227,1.646658496404171E7,2.6443785668474197E7,2.701604788320923E7,3.366058958298761E7,2.7462468750599384E7,21.59865751791282,34.66167405313361,34.822360178837414,41.495491779406166,29.115484067165177,29.928823053070296,33.26991285854059,55.96272783641258,89.87087386734116,91.81574310672585,114.39763726112386,93.33293258813964,2.2645142994330406E7,2.448725452685547E7,2.4925759290207863E7,3.0503655031727314E7,2.6558160276379585E7,2.0976233452690125E7,2.9895796101181984E7,5.600219855566597E7,5.348815865275085E7,7.628723580410767E7,8.248374754962921E7,8.808826726185608E7] > [Important issues liv,138504,Fairview,Williamson County,TN,United > States,DHL,BARIAN,2001,1.1748784594717264E7,1.435130566355586E7,9896470.867572784,7990874.805492401,8879247.840401173,7362383.04259038,1.0011144724414349E7,1.7741201390372872E7,2.1346976135887742E7,1.8074978020030975E7,2.967512567988676E7,3.2545325348875403E7,84.8263197793368,103.6165429414014,71.45259969078715,57.694180713137534,64.10824120892663,53.156465102743454,72.28054586448297,128.09161750110374,154.12534032149065,130.5014874662896,214.25464737398747,234.97751219369408,2.7204167203903973E7,2.598037822457385E7,1.9943398915802002E7,2.5710421112384796E7,1.948448105346489E7,2.6346611484448195E7,2.5075158296625137E7,5.409477817043829E7,4.106673223178029E7,5.454705814340496E7,7.246596285337901E7,9.277032812079096E7] > {noformat} > Expected results: > {noformat} > +----------------------+-------------------+----------+-------------------+---------+---------------+---------------+------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+ > | W_WAREHOUSE_NAME | W_WAREHOUSE_SQ_FT | W_CITY | W_COUNTY | > W_STATE | W_COUNTRY | SHIP_CARRIERS | YEAR | JAN_SALES | > FEB_SALES | MAR_SALES | APR_SALES | MAY_SALES | JUN_SALES > | JUL_SALES | AUG_SALES | SEP_SALES | OCT_SALES | > NOV_SALES | DEC_SALES | JAN_SALES_PER_SQ_FOOT | FEB_SALES_PER_SQ_FOOT | > MAR_SALES_PER_SQ_FOOT | APR_SALES_PER_SQ_FOOT | MAY_SALES_PER_SQ_FOOT | > JUN_SALES_PER_SQ_FOOT | JUL_SALES_PER_SQ_FOOT | AUG_SALES_PER_SQ_FOOT | > SEP_SALES_PER_SQ_FOOT | OCT_SALES_PER_SQ_FOOT | NOV_SALES_PER_SQ_FOOT | > DEC_SALES_PER_SQ_FOOT | JAN_NET | FEB_NET | MAR_NET | > APR_NET | MAY_NET | JUN_NET | JUL_NET | > AUG_NET | SEP_NET | OCT_NET | NOV_NET | DEC_NET | > +----------------------+-------------------+----------+-------------------+---------+---------------+---------------+------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+ > | Bad cards must make. | 621234 | Fairview | Williamson County | > TN | United States | DHL,BARIAN | 2001 | 9506753.46000 | > 8008140.33000 | 6116769.63000 | 11973045.15000 | 7756254.92000 | > 5352978.49000 | 13733996.10000 | 16418794.37000 | 17212743.32000 | > 17042707.41000 | 34304935.61000 | 35324164.21000 | 15.30302 | > 12.89070 | 9.84616 | 19.27300 | > 12.48524 | 8.61669 | 22.10761 | > 26.42932 | 27.70734 | 27.43364 | > 55.22063 | 56.86129 | 30534943.77000 | 24481685.94000 | > 22178710.81000 | 25695798.18000 | 29954903.78000 | 18084140.05000 | > 30805576.13000 | 47156887.22000 | 51158588.86000 | 55759942.80000 | > 86253544.16000 | 83451555.63000 | > | Conventional childr | 977787 | Fairview | Williamson County | > TN | United States | DHL,BARIAN | 2001 | 8860645.55000 | > 14415813.74000 | 6761497.23000 | 11820654.76000 | 8246260.69000 | > 6636877.49000 | 11434492.25000 | 25673812.14000 | 23074206.96000 | > 21834581.94000 | 26894900.53000 | 33575091.74000 | 9.06194 | > 14.74331 | 6.91510 | 12.08919 | > 8.43360 | 6.78765 | 11.69426 | > 26.25706 | 23.59840 | 22.33061 | > 27.50589 | 34.33784 | 23836085.83000 | 32073313.37000 | > 25037904.18000 | 22659895.86000 | 21757401.03000 | 24451608.10000 | > 21933001.85000 | 55996703.43000 | 57371880.44000 | 62087214.51000 | > 82849910.15000 | 88970319.31000 | > | Doors canno | 294242 | Fairview | Williamson County | > TN | United States | DHL,BARIAN | 2001 | 6355232.31000 | > 10198920.36000 | 10246200.97000 | 12209716.50000 | 8566998.28000 | > 8806316.81000 | 9789405.60000 | 16466584.88000 | 26443785.61000 | > 27016047.80000 | 33660589.67000 | 27462468.62000 | 21.59866 | > 34.66167 | 34.82236 | 41.49549 | > 29.11548 | 29.92882 | 33.26991 | > 55.96273 | 89.87087 | 91.81574 | > 114.39764 | 93.33293 | 22645143.09000 | 24487254.60000 | > 24925759.42000 | 30503655.27000 | 26558160.29000 | 20976233.52000 | > 29895796.09000 | 56002198.38000 | 53488158.53000 | 76287235.46000 | > 82483747.59000 | 88088266.69000 | > | Important issues liv | 138504 | Fairview | Williamson County | > TN | United States | DHL,BARIAN | 2001 | 11748784.55000 | > 14351305.77000 | 9896470.93000 | 7990874.78000 | 8879247.90000 | > 7362383.09000 | 10011144.75000 | 17741201.32000 | 21346976.05000 | > 18074978.16000 | 29675125.64000 | 32545325.29000 | 84.82632 | > 103.61654 | 71.45260 | 57.69418 | > 64.10824 | 53.15647 | 72.28055 | > 128.09162 | 154.12534 | 130.50149 | > 214.25465 | 234.97751 | 27204167.15000 | 25980378.13000 | > 19943398.93000 | 25710421.13000 | 19484481.03000 | 26346611.48000 | > 25075158.43000 | 54094778.13000 | 41066732.11000 | 54547058.28000 | > 72465962.92000 | 92770328.27000 | > | [NULL] | [NULL] | Fairview | Williamson County | > TN | United States | DHL,BARIAN | 2001 | 9597806.95000 | > 11121820.57000 | 8670867.91000 | 8994786.04000 | 10887248.09000 | > 14187671.36000 | 9732598.41000 | 19798897.07000 | 21007842.34000 | > 21495513.67000 | 34795669.17000 | 33122997.94000 | [NULL] | > [NULL] | [NULL] | [NULL] | > [NULL] | [NULL] | [NULL] | > [NULL] | [NULL] | [NULL] | > [NULL] | [NULL] | 21913594.59000 | 32518476.51000 | > 24885662.72000 | 25698343.86000 | 33735910.61000 | 35527031.58000 | > 25465193.48000 | 53623238.66000 | 51409986.76000 | 54159173.90000 | > 92227043.25000 | 83435390.84000 | > +----------------------+-------------------+----------+-------------------+---------+---------------+---------------+------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+ > {noformat} > Query used: > {noformat} > -- start query 66 in stream 0 using template query66.tpl and seed > QUALIFICATION > select > w_warehouse_name > ,w_warehouse_sq_ft > ,w_city > ,w_county > ,w_state > ,w_country > ,ship_carriers > ,year > ,sum(jan_sales) as jan_sales > ,sum(feb_sales) as feb_sales > ,sum(mar_sales) as mar_sales > ,sum(apr_sales) as apr_sales > ,sum(may_sales) as may_sales > ,sum(jun_sales) as jun_sales > ,sum(jul_sales) as jul_sales > ,sum(aug_sales) as aug_sales > ,sum(sep_sales) as sep_sales > ,sum(oct_sales) as oct_sales > ,sum(nov_sales) as nov_sales > ,sum(dec_sales) as dec_sales > ,sum(jan_sales/w_warehouse_sq_ft) as jan_sales_per_sq_foot > ,sum(feb_sales/w_warehouse_sq_ft) as feb_sales_per_sq_foot > ,sum(mar_sales/w_warehouse_sq_ft) as mar_sales_per_sq_foot > ,sum(apr_sales/w_warehouse_sq_ft) as apr_sales_per_sq_foot > ,sum(may_sales/w_warehouse_sq_ft) as may_sales_per_sq_foot > ,sum(jun_sales/w_warehouse_sq_ft) as jun_sales_per_sq_foot > ,sum(jul_sales/w_warehouse_sq_ft) as jul_sales_per_sq_foot > ,sum(aug_sales/w_warehouse_sq_ft) as aug_sales_per_sq_foot > ,sum(sep_sales/w_warehouse_sq_ft) as sep_sales_per_sq_foot > ,sum(oct_sales/w_warehouse_sq_ft) as oct_sales_per_sq_foot > ,sum(nov_sales/w_warehouse_sq_ft) as nov_sales_per_sq_foot > ,sum(dec_sales/w_warehouse_sq_ft) as dec_sales_per_sq_foot > ,sum(jan_net) as jan_net > ,sum(feb_net) as feb_net > ,sum(mar_net) as mar_net > ,sum(apr_net) as apr_net > ,sum(may_net) as may_net > ,sum(jun_net) as jun_net > ,sum(jul_net) as jul_net > ,sum(aug_net) as aug_net > ,sum(sep_net) as sep_net > ,sum(oct_net) as oct_net > ,sum(nov_net) as nov_net > ,sum(dec_net) as dec_net > from ( > select > w_warehouse_name > ,w_warehouse_sq_ft > ,w_city > ,w_county > ,w_state > ,w_country > ,concat('DHL', ',', 'BARIAN') as ship_carriers > ,d_year as year > ,sum(case when d_moy = 1 > then ws_ext_sales_price* ws_quantity else 0 end) as jan_sales > ,sum(case when d_moy = 2 > then ws_ext_sales_price* ws_quantity else 0 end) as feb_sales > ,sum(case when d_moy = 3 > then ws_ext_sales_price* ws_quantity else 0 end) as mar_sales > ,sum(case when d_moy = 4 > then ws_ext_sales_price* ws_quantity else 0 end) as apr_sales > ,sum(case when d_moy = 5 > then ws_ext_sales_price* ws_quantity else 0 end) as may_sales > ,sum(case when d_moy = 6 > then ws_ext_sales_price* ws_quantity else 0 end) as jun_sales > ,sum(case when d_moy = 7 > then ws_ext_sales_price* ws_quantity else 0 end) as jul_sales > ,sum(case when d_moy = 8 > then ws_ext_sales_price* ws_quantity else 0 end) as aug_sales > ,sum(case when d_moy = 9 > then ws_ext_sales_price* ws_quantity else 0 end) as sep_sales > ,sum(case when d_moy = 10 > then ws_ext_sales_price* ws_quantity else 0 end) as oct_sales > ,sum(case when d_moy = 11 > then ws_ext_sales_price* ws_quantity else 0 end) as nov_sales > ,sum(case when d_moy = 12 > then ws_ext_sales_price* ws_quantity else 0 end) as dec_sales > ,sum(case when d_moy = 1 > then ws_net_paid * ws_quantity else 0 end) as jan_net > ,sum(case when d_moy = 2 > then ws_net_paid * ws_quantity else 0 end) as feb_net > ,sum(case when d_moy = 3 > then ws_net_paid * ws_quantity else 0 end) as mar_net > ,sum(case when d_moy = 4 > then ws_net_paid * ws_quantity else 0 end) as apr_net > ,sum(case when d_moy = 5 > then ws_net_paid * ws_quantity else 0 end) as may_net > ,sum(case when d_moy = 6 > then ws_net_paid * ws_quantity else 0 end) as jun_net > ,sum(case when d_moy = 7 > then ws_net_paid * ws_quantity else 0 end) as jul_net > ,sum(case when d_moy = 8 > then ws_net_paid * ws_quantity else 0 end) as aug_net > ,sum(case when d_moy = 9 > then ws_net_paid * ws_quantity else 0 end) as sep_net > ,sum(case when d_moy = 10 > then ws_net_paid * ws_quantity else 0 end) as oct_net > ,sum(case when d_moy = 11 > then ws_net_paid * ws_quantity else 0 end) as nov_net > ,sum(case when d_moy = 12 > then ws_net_paid * ws_quantity else 0 end) as dec_net > from > web_sales > ,warehouse > ,date_dim > ,time_dim > ,ship_mode > where > ws_warehouse_sk = w_warehouse_sk > and ws_sold_date_sk = d_date_sk > and ws_sold_time_sk = t_time_sk > and ws_ship_mode_sk = sm_ship_mode_sk > and d_year = 2001 > and t_time between 30838 and 30838+28800 > and sm_carrier in ('DHL','BARIAN') > group by > w_warehouse_name > ,w_warehouse_sq_ft > ,w_city > ,w_county > ,w_state > ,w_country > ,d_year > union all > select > w_warehouse_name > ,w_warehouse_sq_ft > ,w_city > ,w_county > ,w_state > ,w_country > ,concat('DHL', ',', 'BARIAN') as ship_carriers > ,d_year as year > ,sum(case when d_moy = 1 > then cs_sales_price* cs_quantity else 0 end) as jan_sales > ,sum(case when d_moy = 2 > then cs_sales_price* cs_quantity else 0 end) as feb_sales > ,sum(case when d_moy = 3 > then cs_sales_price* cs_quantity else 0 end) as mar_sales > ,sum(case when d_moy = 4 > then cs_sales_price* cs_quantity else 0 end) as apr_sales > ,sum(case when d_moy = 5 > then cs_sales_price* cs_quantity else 0 end) as may_sales > ,sum(case when d_moy = 6 > then cs_sales_price* cs_quantity else 0 end) as jun_sales > ,sum(case when d_moy = 7 > then cs_sales_price* cs_quantity else 0 end) as jul_sales > ,sum(case when d_moy = 8 > then cs_sales_price* cs_quantity else 0 end) as aug_sales > ,sum(case when d_moy = 9 > then cs_sales_price* cs_quantity else 0 end) as sep_sales > ,sum(case when d_moy = 10 > then cs_sales_price* cs_quantity else 0 end) as oct_sales > ,sum(case when d_moy = 11 > then cs_sales_price* cs_quantity else 0 end) as nov_sales > ,sum(case when d_moy = 12 > then cs_sales_price* cs_quantity else 0 end) as dec_sales > ,sum(case when d_moy = 1 > then cs_net_paid_inc_tax * cs_quantity else 0 end) as jan_net > ,sum(case when d_moy = 2 > then cs_net_paid_inc_tax * cs_quantity else 0 end) as feb_net > ,sum(case when d_moy = 3 > then cs_net_paid_inc_tax * cs_quantity else 0 end) as mar_net > ,sum(case when d_moy = 4 > then cs_net_paid_inc_tax * cs_quantity else 0 end) as apr_net > ,sum(case when d_moy = 5 > then cs_net_paid_inc_tax * cs_quantity else 0 end) as may_net > ,sum(case when d_moy = 6 > then cs_net_paid_inc_tax * cs_quantity else 0 end) as jun_net > ,sum(case when d_moy = 7 > then cs_net_paid_inc_tax * cs_quantity else 0 end) as jul_net > ,sum(case when d_moy = 8 > then cs_net_paid_inc_tax * cs_quantity else 0 end) as aug_net > ,sum(case when d_moy = 9 > then cs_net_paid_inc_tax * cs_quantity else 0 end) as sep_net > ,sum(case when d_moy = 10 > then cs_net_paid_inc_tax * cs_quantity else 0 end) as oct_net > ,sum(case when d_moy = 11 > then cs_net_paid_inc_tax * cs_quantity else 0 end) as nov_net > ,sum(case when d_moy = 12 > then cs_net_paid_inc_tax * cs_quantity else 0 end) as dec_net > from > catalog_sales > ,warehouse > ,date_dim > ,time_dim > ,ship_mode > where > cs_warehouse_sk = w_warehouse_sk > and cs_sold_date_sk = d_date_sk > and cs_sold_time_sk = t_time_sk > and cs_ship_mode_sk = sm_ship_mode_sk > and d_year = 2001 > and t_time between 30838 and 30838+28800 > and sm_carrier in ('DHL','BARIAN') > group by > w_warehouse_name > ,w_warehouse_sq_ft > ,w_city > ,w_county > ,w_state > ,w_country > ,d_year > ) x > group by > w_warehouse_name > ,w_warehouse_sq_ft > ,w_city > ,w_county > ,w_state > ,w_country > ,ship_carriers > ,year > order by w_warehouse_name > limit 100; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org