GitHub user mgaido91 opened a pull request:

    https://github.com/apache/spark/pull/22029

    [SPARK-24395][SQL] IN operator should return NULL when comparing struct with NULL fields

    ## What changes were proposed in this pull request?
    
    Spark's IN operator behaves differently from other RDBMSs when structs
    containing NULL fields are involved. In this case, Spark returns `false`,
    while other RDBMSs return `NULL`. This is especially critical for NOT IN
    filters: because negating `false` yields `true`, Spark keeps rows
    containing NULLs in that scenario, whereas other RDBMSs filter them out.
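
    The difference can be illustrated with a small model of SQL three-valued
    logic. This is only a sketch in plain Python, not Spark's implementation:
    structs are modelled as tuples, `None` stands for NULL, and the helper
    names (`sql_eq`, `sql_in`, `legacy_in`, `sql_not`) are hypothetical. It
    assumes the standard SQL rule that a definite field mismatch makes a row
    comparison false, while a NULL field otherwise makes it unknown.

```python
from typing import Optional, Sequence, Tuple

Struct = Tuple[Optional[int], ...]


def sql_eq(a: Struct, b: Struct) -> Optional[bool]:
    """Field-wise equality; None models SQL NULL (unknown)."""
    unknown = False
    for x, y in zip(a, b):
        if x is None or y is None:
            unknown = True
        elif x != y:
            return False  # a definite mismatch wins over unknown
    return None if unknown else True


def sql_in(value: Struct, candidates: Sequence[Struct]) -> Optional[bool]:
    """IN under three-valued logic: True, False, or None (NULL)."""
    unknown = False
    for candidate in candidates:
        result = sql_eq(value, candidate)
        if result is True:
            return True
        if result is None:
            unknown = True
    return None if unknown else False


def legacy_in(value: Struct, candidates: Sequence[Struct]) -> bool:
    """Models the behavior described above: NULL collapses to false."""
    return sql_in(value, candidates) is True


def sql_not(v: Optional[bool]) -> Optional[bool]:
    """NOT under three-valued logic: NOT NULL stays NULL."""
    return None if v is None else not v


rows = [(1, 2), (1, None), (3, 4)]
candidates = [(1, 2)]

# A filter keeps a row only when its predicate is exactly True.
kept_null_aware = [r for r in rows if sql_not(sql_in(r, candidates)) is True]
kept_legacy = [r for r in rows if not legacy_in(r, candidates)]
```

    In this model the NULL-aware NOT IN filter drops `(1, None)`, because the
    predicate evaluates to NULL rather than true, while the legacy variant
    keeps it.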
    
    This PR proposes to change the behavior of Spark's IN operator to align
    with that of other RDBMSs, and introduces a flag that users can set to
    switch back to the previous behavior.
    
    ## How was this patch tested?
    
    Added unit tests.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mgaido91/spark SPARK-24395

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22029.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22029
    
----
commit 54ee21ad903827baf1117356f692370225c8662a
Author: Marco Gaido <marcogaido91@...>
Date:   2018-08-07T15:52:17Z

    [SPARK-24395][SQL] IN operator should return NULL when comparing struct with NULL fields

----

