Roy Cecil created SPARK-13820:
---------------------------------
Summary: TPC-DS Query 10 fails to compile
Key: SPARK-13820
URL: https://issues.apache.org/jira/browse/SPARK-13820
Project: Spark
Issue Type: Bug
Affects Versions: 1.6.1
Environment: Red Hat Enterprise Linux Server release 7.1 (Maipo)
Linux bigaperf116.svl.ibm.com 3.10.0-229.el7.x86_64 #1 SMP Thu Jan 29 18:37:38 EST 2015 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Roy Cecil
TPC-DS Query 10 fails to compile with the following error:
Parsing error: KW_SELECT )=> ( KW_EXISTS subQueryExpression ) -> ^( TOK_SUBQUERY_EXPR ^( TOK_SUBQUERY_OP KW_EXISTS ) subQueryExpression ) );])
at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
at org.antlr.runtime.DFA.predict(DFA.java:144)
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpression(HiveParser_IdentifiersParser.java:8155)
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceNotExpression(HiveParser_IdentifiersParser.java:9177)
The query is pasted here for easy reproduction:
select
cd_gender,
cd_marital_status,
cd_education_status,
count(*) cnt1,
cd_purchase_estimate,
count(*) cnt2,
cd_credit_rating,
count(*) cnt3,
cd_dep_count,
count(*) cnt4,
cd_dep_employed_count,
count(*) cnt5,
cd_dep_college_count,
count(*) cnt6
from
customer c
JOIN customer_address ca ON c.c_current_addr_sk = ca.ca_address_sk
JOIN customer_demographics ON cd_demo_sk = c.c_current_cdemo_sk
LEFT SEMI JOIN (select ss_customer_sk
from store_sales
JOIN date_dim ON ss_sold_date_sk = d_date_sk
where
d_year = 2002 and
d_moy between 1 and 1+3) ss_wh1 ON c.c_customer_sk = ss_wh1.ss_customer_sk
where
ca_county in ('Rush County','Toole County','Jefferson County','Dona Ana County','La Porte County') and
exists (
select tmp.customer_sk from (
select ws_bill_customer_sk as customer_sk
from web_sales,date_dim
where
web_sales.ws_sold_date_sk = date_dim.d_date_sk and
d_year = 2002 and
d_moy between 1 and 1+3
UNION ALL
select cs_ship_customer_sk as customer_sk
from catalog_sales,date_dim
where
catalog_sales.cs_sold_date_sk = date_dim.d_date_sk and
d_year = 2002 and
d_moy between 1 and 1+3
) tmp where c.c_customer_sk = tmp.customer_sk
)
group by cd_gender,
cd_marital_status,
cd_education_status,
cd_purchase_estimate,
cd_credit_rating,
cd_dep_count,
cd_dep_employed_count,
cd_dep_college_count
order by cd_gender,
cd_marital_status,
cd_education_status,
cd_purchase_estimate,
cd_credit_rating,
cd_dep_count,
cd_dep_employed_count,
cd_dep_college_count
limit 100;
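A possible workaround, not verified on 1.6.1: the error points at the KW_EXISTS subquery rule, so the exists predicate may be the construct the parser rejects. The query already expresses the store_sales check as a LEFT SEMI JOIN, and the exists block could be written the same way. The sketch below shows only the from and where clauses; the select list, group by, order by and limit are unchanged. The alias ws_cs_wh is made up for illustration and is not part of the TPC-DS query.

```sql
-- Untested sketch: the EXISTS predicate is moved out of the WHERE clause and
-- expressed as a second LEFT SEMI JOIN. The alias ws_cs_wh is hypothetical.
from
  customer c
  JOIN customer_address ca ON c.c_current_addr_sk = ca.ca_address_sk
  JOIN customer_demographics ON cd_demo_sk = c.c_current_cdemo_sk
  -- unchanged from the original query
  LEFT SEMI JOIN (select ss_customer_sk
                  from store_sales
                  JOIN date_dim ON ss_sold_date_sk = d_date_sk
                  where d_year = 2002 and
                        d_moy between 1 and 1+3) ss_wh1
    ON c.c_customer_sk = ss_wh1.ss_customer_sk
  -- replaces: exists (select tmp.customer_sk from (...) tmp
  --                   where c.c_customer_sk = tmp.customer_sk)
  LEFT SEMI JOIN (select ws_bill_customer_sk as customer_sk
                  from web_sales
                  JOIN date_dim ON ws_sold_date_sk = d_date_sk
                  where d_year = 2002 and
                        d_moy between 1 and 1+3
                  UNION ALL
                  select cs_ship_customer_sk as customer_sk
                  from catalog_sales
                  JOIN date_dim ON cs_sold_date_sk = d_date_sk
                  where d_year = 2002 and
                        d_moy between 1 and 1+3) ws_cs_wh
    ON c.c_customer_sk = ws_cs_wh.customer_sk
where
  ca_county in ('Rush County','Toole County','Jefferson County','Dona Ana County','La Porte County')
```

A semi join keeps each customer row at most once when a match exists, which is the same filtering the correlated exists performs, so the result set should be the same if the rewrite compiles.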