Santiago M. Mola created SPARK-8654:
---------------------------------------
             Summary: Analysis exception when using "NULL IN (...)": invalid cast
                 Key: SPARK-8654
                 URL: https://issues.apache.org/jira/browse/SPARK-8654
             Project: Spark
          Issue Type: Bug
          Components: SQL
            Reporter: Santiago M. Mola
            Priority: Minor

The following query throws an analysis exception:

{code}
SELECT * FROM t WHERE NULL NOT IN (1, 2, 3);
{code}

The exception is:

{code}
org.apache.spark.sql.AnalysisException: invalid cast from int to null;
	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)
	at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:42)
	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:66)
	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:52)
{code}

Here is a test that can be added to AnalysisSuite to check the issue:

{code}
test("SPARK-XXXX regression test") {
  val plan = Project(
    Alias(In(Literal(null), Seq(Literal(1), Literal(2))), "a")() :: Nil,
    LocalRelation()
  )
  caseInsensitiveAnalyze(plan)
}
{code}

Note that this kind of query is a corner case, but it is still valid SQL. An expression such as "NULL IN (...)" or "NULL NOT IN (...)" always gives NULL as a result, even if the list contains NULL. So it is safe to translate these expressions to Literal(null) during analysis.
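
The claim that "NULL IN (...)" and "NULL NOT IN (...)" always evaluate to NULL can be checked with a small model of SQL three-valued logic. What follows is only a minimal sketch in plain Scala, not Spark's Catalyst code: the names NullInSketch, sqlIn, and sqlNot are hypothetical, SQL NULL is modelled as None, and the IN list is assumed to be non-empty.

```scala
// Hypothetical model of SQL three-valued logic; this is NOT Spark's Catalyst API.
// None stands for SQL NULL, Some(x) for a non-null value.
object NullInSketch {
  type SqlInt = Option[Int]
  type SqlBool = Option[Boolean]

  // value IN (list) under SQL semantics, assuming a non-empty list:
  //  - NULL  if value is NULL
  //  - TRUE  if some non-null element equals value
  //  - NULL  if nothing matches but the list contains NULL
  //  - FALSE otherwise
  def sqlIn(value: SqlInt, list: Seq[SqlInt]): SqlBool = value match {
    case None => None
    case Some(v) =>
      if (list.contains(Some(v))) Some(true)
      else if (list.contains(None)) None
      else Some(false)
  }

  // NOT under three-valued logic: NOT NULL is NULL.
  def sqlNot(b: SqlBool): SqlBool = b.map(!_)

  def main(args: Array[String]): Unit = {
    val list: Seq[SqlInt] = Seq(Some(1), Some(2), Some(3))
    println(sqlIn(None, list))          // NULL IN (1, 2, 3)       prints None
    println(sqlNot(sqlIn(None, list)))  // NULL NOT IN (1, 2, 3)   prints None
    println(sqlIn(None, list :+ None))  // NULL IN (1, 2, 3, NULL) prints None
  }
}
```

In this model the first branch of sqlIn never looks at the list when the tested value is NULL, which is why replacing the whole expression with a NULL literal does not change the result.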