Terry Siu created SPARK-4226:
--------------------------------
Summary: SparkSQL - Add support for subqueries in predicates
Key: SPARK-4226
URL: https://issues.apache.org/jira/browse/SPARK-4226
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 1.2.0
Environment: Spark 1.2 snapshot
Reporter: Terry Siu
I have a test table defined in Hive as follows:
CREATE TABLE sparkbug (
  customerid INT,
  event STRING
) STORED AS PARQUET;
and insert some sample data with ids 1, 2, 3.
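For reference, the sample rows can be loaded on the Hive side along these lines (the staging table, file path, and event values below are only illustrative, and the staging step is only needed on Hive versions without INSERT ... VALUES):

-- hypothetical staging table and local file, shown only to make the setup reproducible
CREATE TABLE sparkbug_staging (
  customerid INT,
  event STRING
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

LOAD DATA LOCAL INPATH '/tmp/sparkbug_sample.tsv' INTO TABLE sparkbug_staging;

-- copy the rows (customerid 1, 2, 3) into the Parquet-backed table
INSERT INTO TABLE sparkbug SELECT customerid, event FROM sparkbug_staging;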
In a Spark shell, I then create a HiveContext and execute the following HQL to test out subquery predicates:

val hc = new org.apache.spark.sql.hive.HiveContext(sc)
hc.hql("select customerid from sparkbug where customerid in (select customerid from sparkbug where customerid in (2,3))")
I get the following error:
java.lang.RuntimeException: Unsupported language features in query: select customerid from sparkbug where customerid in (select customerid from sparkbug where customerid in (2,3))
TOK_QUERY
  TOK_FROM
    TOK_TABREF
      TOK_TABNAME
        sparkbug
  TOK_INSERT
    TOK_DESTINATION
      TOK_DIR
        TOK_TMP_FILE
    TOK_SELECT
      TOK_SELEXPR
        TOK_TABLE_OR_COL
          customerid
    TOK_WHERE
      TOK_SUBQUERY_EXPR
        TOK_SUBQUERY_OP
          in
        TOK_QUERY
          TOK_FROM
            TOK_TABREF
              TOK_TABNAME
                sparkbug
          TOK_INSERT
            TOK_DESTINATION
              TOK_DIR
                TOK_TMP_FILE
            TOK_SELECT
              TOK_SELEXPR
                TOK_TABLE_OR_COL
                  customerid
            TOK_WHERE
              TOK_FUNCTION
                in
                TOK_TABLE_OR_COL
                  customerid
                2
                3
        TOK_TABLE_OR_COL
          customerid
scala.NotImplementedError: No parse rules for ASTNode type: 817, text: TOK_SUBQUERY_EXPR :
TOK_SUBQUERY_EXPR
  TOK_SUBQUERY_OP
    in
  TOK_QUERY
    TOK_FROM
      TOK_TABREF
        TOK_TABNAME
          sparkbug
    TOK_INSERT
      TOK_DESTINATION
        TOK_DIR
          TOK_TMP_FILE
      TOK_SELECT
        TOK_SELEXPR
          TOK_TABLE_OR_COL
            customerid
      TOK_WHERE
        TOK_FUNCTION
          in
          TOK_TABLE_OR_COL
            customerid
          2
          3
  TOK_TABLE_OR_COL
    customerid
" +
org.apache.spark.sql.hive.HiveQl$.nodeToExpr(HiveQl.scala:1098)
    at scala.sys.package$.error(package.scala:27)
    at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:252)
    at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:50)
    at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:49)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
This thread
http://apache-spark-user-list.1001560.n3.nabble.com/Subquery-in-having-clause-Spark-1-1-0-td17401.html
also brings up the lack of subquery support in SparkSQL. It would be nice to have
subquery predicate support in a near-future release (1.3, maybe?).
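In the meantime, the same result can be expressed without a subquery predicate by rewriting the IN as a LEFT SEMI JOIN, which the HiveQL parser in Spark SQL should accept. A rough sketch against the same table, passed to hc.hql in the shell (the alias names are arbitrary):

-- workaround sketch: IN-subquery rewritten as a LEFT SEMI JOIN on a derived table
SELECT a.customerid
FROM sparkbug a
LEFT SEMI JOIN (
  SELECT customerid FROM sparkbug WHERE customerid IN (2, 3)
) b
ON a.customerid = b.customerid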