Nattavut Sutyanyong created SPARK-18966:
-------------------------------------------
Summary: NOT IN subquery with correlated expressions may return incorrect result
Key: SPARK-18966
URL: https://issues.apache.org/jira/browse/SPARK-18966
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.0.0
Reporter: Nattavut Sutyanyong
{code}
Seq((1, 2)).toDF("a1", "b1").createOrReplaceTempView("t1")
Seq[(java.lang.Integer, java.lang.Integer)]((1, null)).toDF("a2", "b2").createOrReplaceTempView("t2")
// The expected result is 1 row of (1,2) as shown in the next statement.
sql("select * from t1 where a1 not in (select a2 from t2 where b2 = b1)").show
+---+---+
| a1| b1|
+---+---+
+---+---+
sql("select * from t1 where a1 not in (select a2 from t2 where b2 = 2)").show
+---+---+
| a1| b1|
+---+---+
|  1|  2|
+---+---+
{code}
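The two SQL statements above should return the same result. The only row of t1 has b1 = 2, so the correlated predicate b2 = b1 is equivalent to b2 = 2 for that row. Because b2 is NULL in t2, the predicate is never true and the subquery returns no rows in either form. NOT IN over an empty set is true, so both statements should return the row (1,2).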
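The expected result can be checked by hand with a small model of SQL three-valued logic. The following is a minimal sketch in plain Scala, with no Spark dependency. The object and function names (NotInSemantics, sqlEq, notIn, subquery) are made up for illustration and are not part of any Spark API. None stands for NULL/UNKNOWN.

```scala
// Hypothetical plain-Scala model of SQL NULL semantics; not Spark code.
object NotInSemantics {
  type SqlInt = Option[Int]       // None models SQL NULL
  type SqlBool = Option[Boolean]  // None models UNKNOWN

  // SQL equality: UNKNOWN if either operand is NULL.
  def sqlEq(x: SqlInt, y: SqlInt): SqlBool =
    for (a <- x; b <- y) yield a == b

  // x NOT IN (values):
  //   FALSE   if any comparison is TRUE,
  //   UNKNOWN if none is TRUE but at least one is UNKNOWN,
  //   TRUE    otherwise (including the empty set).
  def notIn(x: SqlInt, values: Seq[SqlInt]): SqlBool = {
    val cmps = values.map(sqlEq(x, _))
    if (cmps.contains(Some(true))) Some(false)
    else if (cmps.contains(None)) None
    else Some(true)
  }

  // t2 from the report: one row with a2 = 1 and b2 = NULL.
  val t2: Seq[(SqlInt, SqlInt)] = Seq((Some(1), None))

  // Models "select a2 from t2 where b2 = <b>".
  // WHERE keeps a row only when the predicate is TRUE, not UNKNOWN.
  def subquery(b: SqlInt): Seq[SqlInt] =
    t2.collect { case (a2, b2) if sqlEq(b2, b).contains(true) => a2 }

  def main(args: Array[String]): Unit = {
    // The single row of t1: a1 = 1, b1 = 2.
    val a1: SqlInt = Some(1)
    val b1: SqlInt = Some(2)
    val sub = subquery(b1)
    println(s"subquery is empty: ${sub.isEmpty}")
    println(s"a1 NOT IN subquery: ${notIn(a1, sub)}")
  }
}
```

Under this model the subquery is empty for b1 = 2 and the NOT IN predicate evaluates to TRUE, so the row (1,2) qualifies. If the b2 = b1 filter were not applied before the NOT IN check, the comparison set would contain a2 = 1 and the predicate would be FALSE, which matches the empty result reported above. This is only one possible explanation and is not confirmed here.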