Github user dmateusp commented on the issue:

    https://github.com/apache/spark/pull/22141
  
    I reproduced the issue with the following code (I was a bit surprised by the behavior).
    
    The tables:
    ```scala
    scala> spark.sql("SELECT * FROM users").show
    +---+-------+
    | id|country|
    +---+-------+
    |  0|     10|
    |  1|     20|
    +---+-------+
    
    
    scala> spark.sql("SELECT * FROM countries").show
    +---+--------+
    | id|    name|
    +---+--------+
    | 10|Portugal|
    +---+--------+
    ```
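    
    For reference, the two tables above can be registered in spark-shell roughly as follows (the original comment does not show the setup, so the temp-view approach here is an assumption):
    ```scala
    // Hypothetical setup: build the two tables shown above as temp views.
    // spark-shell imports spark.implicits._ automatically, which provides toDF.
    scala> Seq((0, 10), (1, 20)).toDF("id", "country").createOrReplaceTempView("users")
    
    scala> Seq((10, "Portugal")).toDF("id", "name").createOrReplaceTempView("countries")
    ```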
    
    Without the OR:
    ```scala
    scala> spark.sql("SELECT * FROM users u WHERE u.country NOT IN (SELECT id 
from countries)").show
    +---+-------+
    | id|country|
    +---+-------+
    |  1|     20|
    +---+-------+
    ```
    
    With an OR and IN:
    scala> spark.sql("SELECT * FROM users u WHERE 1=0 OR u.country IN (SELECT 
id from countries)").show
    +---+-------+
    | id|country|
    +---+-------+
    |  0|     10|
    +---+-------+
    
    With an OR and NOT IN:
    ```scala
    scala> spark.sql("SELECT * FROM users u WHERE 1=0 OR u.country NOT IN 
(SELECT id from countries)").show
    org.apache.spark.sql.AnalysisException: Null-aware predicate sub-queries 
cannot be used in nested conditions: ((1 = 0) || NOT country#9 IN (list#62 
[]));;
    ```
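    
    As a possible workaround until this is fixed (a sketch, not part of the original repro): the NOT IN can be rewritten as a LEFT JOIN with a null check, which the analyzer accepts under OR. Note that this rewrite is only equivalent to the null-aware NOT IN when countries.id contains no NULLs.
    ```scala
    // Workaround sketch (assumption): express the anti-semijoin as a LEFT JOIN
    // plus an IS NULL check, so there is no null-aware subquery nested under OR.
    // Only equivalent to NOT IN when countries.id has no NULLs; expected to
    // return the same row as the plain NOT IN query above.
    scala> spark.sql("SELECT u.id, u.country FROM users u LEFT JOIN countries c ON u.country = c.id WHERE 1=0 OR c.id IS NULL").show
    ```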
    
    +1 to get that fixed

