Terry Siu created SPARK-4226:
--------------------------------

             Summary: SparkSQL - Add support for subqueries in predicates
                 Key: SPARK-4226
                 URL: https://issues.apache.org/jira/browse/SPARK-4226
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 1.2.0
         Environment: Spark 1.2 snapshot
            Reporter: Terry Siu

I have a test table defined in Hive as follows:

    CREATE TABLE sparkbug (
      id INT,
      event STRING
    ) STORED AS PARQUET;

and inserted some sample data with ids 1, 2, and 3.

In a Spark shell, I then create a HiveContext and execute the following HQL to test out subquery predicates:

    val hc = new org.apache.spark.sql.hive.HiveContext(sc)
    hc.hql("select customerid from sparkbug where customerid in (select customerid from sparkbug where customerid in (2,3))")

I get the following error:

    java.lang.RuntimeException:
    Unsupported language features in query: select customerid from sparkbug where customerid in (select customerid from sparkbug where customerid in (2,3))
    TOK_QUERY
      TOK_FROM
        TOK_TABREF
          TOK_TABNAME
            sparkbug
      TOK_INSERT
        TOK_DESTINATION
          TOK_DIR
            TOK_TMP_FILE
        TOK_SELECT
          TOK_SELEXPR
            TOK_TABLE_OR_COL
              customerid
        TOK_WHERE
          TOK_SUBQUERY_EXPR
            TOK_SUBQUERY_OP
              in
            TOK_QUERY
              TOK_FROM
                TOK_TABREF
                  TOK_TABNAME
                    sparkbug
              TOK_INSERT
                TOK_DESTINATION
                  TOK_DIR
                    TOK_TMP_FILE
                TOK_SELECT
                  TOK_SELEXPR
                    TOK_TABLE_OR_COL
                      customerid
                TOK_WHERE
                  TOK_FUNCTION
                    in
                    TOK_TABLE_OR_COL
                      customerid
                    2
                    3
            TOK_TABLE_OR_COL
              customerid

    scala.NotImplementedError: No parse rules for ASTNode type: 817, text: TOK_SUBQUERY_EXPR :
    TOK_SUBQUERY_EXPR
      TOK_SUBQUERY_OP
        in
      TOK_QUERY
        TOK_FROM
          TOK_TABREF
            TOK_TABNAME
              sparkbug
        TOK_INSERT
          TOK_DESTINATION
            TOK_DIR
              TOK_TMP_FILE
          TOK_SELECT
            TOK_SELEXPR
              TOK_TABLE_OR_COL
                customerid
          TOK_WHERE
            TOK_FUNCTION
              in
              TOK_TABLE_OR_COL
                customerid
              2
              3
      TOK_TABLE_OR_COL
        customerid
    " +
    org.apache.spark.sql.hive.HiveQl$.nodeToExpr(HiveQl.scala:1098)
            at scala.sys.package$.error(package.scala:27)
            at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:252)
            at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:50)
            at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:49)
            at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)

This thread
http://apache-spark-user-list.1001560.n3.nabble.com/Subquery-in-having-clause-Spark-1-1-0-td17401.html
also brings up the lack of subquery support in SparkSQL. It would be nice to have subquery predicate support in a near-future release (1.3, maybe?).
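
Until subquery predicates are supported, one possible workaround is to express the IN (subquery) filter as a LEFT SEMI JOIN against a derived table, which is standard HiveQL. The sketch below only builds the rewritten query text. The object SemiJoinWorkaround and the helper semiJoinQuery are hypothetical names introduced for illustration, and the rewrite has not been verified against the 1.2 snapshot from this report.

```scala
object SemiJoinWorkaround {
  // Hypothetical helper (not part of Spark): builds a HiveQL query that
  // filters `table` on `column` with a LEFT SEMI JOIN against a derived
  // table, instead of an IN (subquery) predicate.
  def semiJoinQuery(table: String, column: String, ids: Seq[Int]): String = {
    val idList = ids.mkString(", ")
    s"""SELECT a.$column
       |FROM $table a
       |LEFT SEMI JOIN (
       |  SELECT $column FROM $table WHERE $column IN ($idList)
       |) b ON (a.$column = b.$column)""".stripMargin
  }

  def main(args: Array[String]): Unit = {
    // Print the rewritten form of the failing query from this report.
    println(semiJoinQuery("sparkbug", "customerid", Seq(2, 3)))
  }
}
```

In a Spark shell the resulting string could be passed to the existing context, for example hc.hql(SemiJoinWorkaround.semiJoinQuery("sparkbug", "customerid", Seq(2, 3))). The inner IN (2, 3) is a plain value list rather than a subquery, so it should not hit the TOK_SUBQUERY_EXPR parse path shown in the error.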