Santiago M. Mola created SPARK-8654:
---------------------------------------

             Summary: Analysis exception when using "NULL IN (...)": invalid cast
                 Key: SPARK-8654
                 URL: https://issues.apache.org/jira/browse/SPARK-8654
             Project: Spark
          Issue Type: Bug
          Components: SQL
            Reporter: Santiago M. Mola
            Priority: Minor


The following query throws an analysis exception:

{code}
SELECT * FROM t WHERE NULL NOT IN (1, 2, 3);
{code}

The exception is:

{code}
org.apache.spark.sql.AnalysisException: invalid cast from int to null;
        at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)
        at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:42)
        at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:66)
        at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:52)
{code}

Here is a test that can be added to AnalysisSuite to check the issue:

{code}
  test("SPARK-XXXX regression test") {
    val plan = Project(
      Alias(In(Literal(null), Seq(Literal(1), Literal(2))), "a")() :: Nil,
      LocalRelation()
    )
    caseInsensitiveAnalyze(plan)
  }
{code}

Note that this kind of query is a corner case, but it is still valid SQL. An 
expression such as "NULL IN (...)" or "NULL NOT IN (...)" always evaluates to 
NULL, even if the list contains NULL, so it is safe to translate these 
expressions to Literal(null) during analysis.
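
The three-valued logic behind this claim can be sketched in plain Scala, 
independent of Catalyst. This is only an illustration, not the proposed fix: 
the names NullInSemantics, sqlIn, sqlNot and sqlNotIn are hypothetical, None 
stands in for SQL NULL, and the list is assumed to be non-empty.

```scala
object NullInSemantics {
  // None models SQL NULL; Some(v) models a non-null value.
  type SqlInt = Option[Int]
  type SqlBool = Option[Boolean]

  // "value IN (list)" under SQL three-valued logic (non-empty list assumed).
  def sqlIn(value: SqlInt, list: Seq[SqlInt]): SqlBool = value match {
    // A NULL left-hand side makes every comparison unknown, so the result is NULL.
    case None => None
    case Some(v) =>
      if (list.contains(Some(v))) Some(true) // a definite match wins
      else if (list.contains(None)) None     // no match, but a NULL keeps it unknown
      else Some(false)                       // no match and no NULLs
  }

  // NOT NULL is still NULL; otherwise negate the boolean.
  def sqlNot(b: SqlBool): SqlBool = b.map(!_)

  // "value NOT IN (list)" is NOT (value IN (list)).
  def sqlNotIn(value: SqlInt, list: Seq[SqlInt]): SqlBool =
    sqlNot(sqlIn(value, list))
}
```

In this model both sqlIn(None, ...) and sqlNotIn(None, ...) return None for 
any non-empty list, which is why replacing the whole expression with a null 
literal preserves the result.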



