Nattavut Sutyanyong created SPARK-18966:
-------------------------------------------

             Summary: NOT IN subquery with correlated expressions may return incorrect result
                 Key: SPARK-18966
                 URL: https://issues.apache.org/jira/browse/SPARK-18966
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.0.0
            Reporter: Nattavut Sutyanyong


{code}
Seq((1, 2)).toDF("a1", "b1").createOrReplaceTempView("t1")
Seq[(java.lang.Integer, java.lang.Integer)]((1, null)).toDF("a2", "b2").createOrReplaceTempView("t2")

// The expected result is 1 row of (1,2) as shown in the next statement.
sql("select * from t1 where a1 not in (select a2 from t2 where b2 = b1)").show
+---+---+
| a1| b1|
+---+---+
+---+---+

sql("select * from t1 where a1 not in (select a2 from t2 where b2 = 2)").show
+---+---+
| a1| b1|
+---+---+
|  1|  2|
+---+---+
{code}

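The two SQL statements above should return the same result. For the only row of t1, b1 is 2, so the correlated predicate b2 = b1 is equivalent to b2 = 2. Because b2 is NULL in t2, the predicate is never true, the subquery returns no rows in either statement, and NOT IN over an empty set evaluates to true.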
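
The expected semantics can be sketched in plain Scala, without Spark, by modelling SQL's three-valued logic with Option[Boolean] (None standing for UNKNOWN). The names NotInSemantics, sqlEq, notIn, correlated and uncorrelated are hypothetical and chosen only for illustration; this is a model of the SQL rules under those assumptions, not of how Spark plans or evaluates the query.

```scala
// Illustrative model only: SQL three-valued logic for the NOT IN case above.
// Some(true) = TRUE, Some(false) = FALSE, None = UNKNOWN (NULL).
object NotInSemantics {
  type Tri = Option[Boolean]

  // SQL equality: a NULL on either side yields UNKNOWN.
  def sqlEq(x: Option[Int], y: Option[Int]): Tri =
    for (a <- x; b <- y) yield a == b

  // x NOT IN (values):
  //   FALSE   if some value equals x,
  //   UNKNOWN if nothing matches but some comparison is UNKNOWN,
  //   TRUE    otherwise, including when values is empty.
  def notIn(x: Option[Int], values: Seq[Option[Int]]): Tri = {
    val cmps = values.map(sqlEq(x, _))
    if (cmps.contains(Some(true))) Some(false)
    else if (cmps.contains(None)) None
    else Some(true)
  }

  val t1: Seq[(Int, Int)] = Seq((1, 2))                           // (a1, b1)
  val t2: Seq[(Option[Int], Option[Int])] = Seq((Some(1), None))  // (a2, b2)

  // select * from t1 where a1 not in (select a2 from t2 where b2 = b1)
  def correlated: Seq[(Int, Int)] = t1.filter { case (a1, b1) =>
    // A WHERE clause keeps a row only when its predicate is TRUE.
    val sub = t2.filter { case (_, b2) => sqlEq(b2, Some(b1)) == Some(true) }.map(_._1)
    notIn(Some(a1), sub) == Some(true)
  }

  // select * from t1 where a1 not in (select a2 from t2 where b2 = 2)
  def uncorrelated: Seq[(Int, Int)] = t1.filter { case (a1, _) =>
    val sub = t2.filter { case (_, b2) => sqlEq(b2, Some(2)) == Some(true) }.map(_._1)
    notIn(Some(a1), sub) == Some(true)
  }
}
```

Under this model both NotInSemantics.correlated and NotInSemantics.uncorrelated evaluate to Seq((1, 2)), which matches the output of the second statement.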


