Github user dmateusp commented on the issue:
https://github.com/apache/spark/pull/22141
I reproduced the issue with the following code (the behavior was a bit surprising).
The tables:
```scala
scala> spark.sql("SELECT * FROM users").show
+---+-------+
| id|country|
+---+-------+
|  0|     10|
|  1|     20|
+---+-------+
scala> spark.sql("SELECT * FROM countries").show
+---+--------+
| id|    name|
+---+--------+
| 10|Portugal|
+---+--------+
```
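For reference, the two tables could be set up in spark-shell roughly like this (a minimal sketch, assuming local temp views created via `toDF`; the original setup isn't shown):
```scala
// Hypothetical setup for the reproduction above (assumes spark-shell,
// where spark.implicits._ is already in scope).
Seq((0, 10), (1, 20)).toDF("id", "country").createOrReplaceTempView("users")
Seq((10, "Portugal")).toDF("id", "name").createOrReplaceTempView("countries")
```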
Without the OR:
```scala
scala> spark.sql("SELECT * FROM users u WHERE u.country NOT IN (SELECT id
from countries)").show
+---+-------+
| id|country|
+---+-------+
|  1|     20|
+---+-------+
```
With an OR and IN:
```scala
scala> spark.sql("SELECT * FROM users u WHERE 1=0 OR u.country IN (SELECT id from countries)").show
+---+-------+
| id|country|
+---+-------+
|  0|     10|
+---+-------+
```
With an OR and NOT IN:
```scala
scala> spark.sql("SELECT * FROM users u WHERE 1=0 OR u.country NOT IN
(SELECT id from countries)").show
org.apache.spark.sql.AnalysisException: Null-aware predicate sub-queries
cannot be used in nested conditions: ((1 = 0) || NOT country#9 IN (list#62
[]));;
```
+1 to get that fixed