[ 
https://issues.apache.org/jira/browse/SPARK-19017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15799388#comment-15799388
 ] 

Nattavut Sutyanyong edited comment on SPARK-19017 at 1/5/17 2:59 AM:
---------------------------------------------------------------------

{code}
On c1: (1 <> 2) or (2 <> null) => true or unknown => true
On c2: (1 <> 1) or (2 <> null) => false or unknown => unknown <-- This is correct, right?
{code}

In (P1 or P2), if P1 is true, we can short-circuit and conclude that the result 
is true. If P1 is false, as on c2, we have to evaluate P2 as well, and since P2 
is unknown, the result is unknown.
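
The two evaluations above can be checked with a small sketch of three-valued OR. This is only an illustration: `None` stands for unknown, and the names `ThreeValuedOr`, `Tri`, `neq` and `or3` are hypothetical helpers made up for this example, not Spark APIs.

```scala
// Sketch of SQL three-valued OR. None stands for UNKNOWN.
// ThreeValuedOr, Tri, neq and or3 are hypothetical names, not Spark APIs.
object ThreeValuedOr {
  type Tri = Option[Boolean]

  // a <> b is UNKNOWN as soon as either operand is null (None).
  def neq(a: Option[Int], b: Option[Int]): Tri =
    for (x <- a; y <- b) yield x != y

  // OR is true if either side is true, false only if both sides are
  // false, and UNKNOWN otherwise.
  def or3(p1: Tri, p2: Tri): Tri = (p1, p2) match {
    case (Some(true), _) | (_, Some(true)) => Some(true)
    case (Some(false), Some(false))        => Some(false)
    case _                                 => None
  }

  def main(args: Array[String]): Unit = {
    // On c1: (1 <> 2) or (2 <> null)
    val c1 = or3(neq(Some(1), Some(2)), neq(Some(2), None))
    // On c2: (1 <> 1) or (2 <> null)
    val c2 = or3(neq(Some(1), Some(1)), neq(Some(2), None))
    println(s"c1 = $c1") // prints "c1 = Some(true)"
    println(s"c2 = $c2") // prints "c2 = None", i.e. unknown
  }
}
```

Since a WHERE clause keeps a row only when its predicate is true, a row like c2 would be filtered out just as if the predicate were false.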


was (Author: nsyca):
{code}
On c1: (1 <> 2) or (2 <> null) => true or unknown => true
On c2: (1 <> 1) or (2 <> null) => false or unknown => unknown <-- This is 
correct, right?
{code}

In (P1 or P2), if P1 is true, we can short-circuit and conclude the result is 
true. In the latter case, if P1 is false, we have to continue evaluating P2 and 
since P2 is unknown, the result is unknown.

I must confess it took me a few iterations to work out the whole story as well.

> NOT IN subquery with more than one column may return incorrect results
> ----------------------------------------------------------------------
>
>                 Key: SPARK-19017
>                 URL: https://issues.apache.org/jira/browse/SPARK-19017
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0, 2.0.1, 2.0.2, 2.1.0
>            Reporter: Nattavut Sutyanyong
>
> When more than one column is used in a NOT IN subquery, the query may return 
> incorrect results if the data contains a null. We can demonstrate the problem 
> with the following data set and query:
> {code}
> Seq((2,1)).toDF("a1","b1").createOrReplaceTempView("t1")
> Seq[(java.lang.Integer,java.lang.Integer)]((1,null)).toDF("a2","b2").createOrReplaceTempView("t2")
> sql("select * from t1 where (a1,b1) not in (select a2,b2 from t2)").show
> +---+---+
> | a1| b1|
> +---+---+
> +---+---+
> {code}
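
As a cross-check on the query above, here is a minimal sketch that evaluates the two-column NOT IN by hand under standard SQL three-valued logic, treating `(a1,b1) NOT IN (rows)` as the AND, over every subquery row, of `NOT (a1 = a2 AND b1 = b2)`. The names `NotInSketch`, `eq3`, `and3`, `not3` and `notIn` are hypothetical and only model the semantics; they say nothing about how Spark plans the query.

```scala
// Sketch: evaluate a two-column NOT IN under three-valued logic.
// None stands for NULL (in data) or UNKNOWN (in predicates).
// All names here are hypothetical helpers, not Spark APIs.
object NotInSketch {
  type Tri = Option[Boolean]
  type Row = (Option[Int], Option[Int])

  // a = b is UNKNOWN as soon as either operand is null.
  def eq3(a: Option[Int], b: Option[Int]): Tri =
    for (x <- a; y <- b) yield x == y

  // AND is false if either side is false, true only if both are true,
  // and UNKNOWN otherwise.
  def and3(p: Tri, q: Tri): Tri = (p, q) match {
    case (Some(false), _) | (_, Some(false)) => Some(false)
    case (Some(true), Some(true))            => Some(true)
    case _                                   => None
  }

  // NOT flips true and false and leaves UNKNOWN unchanged.
  def not3(p: Tri): Tri = p.map(!_)

  // AND together NOT (a1 = a2 AND b1 = b2) for every subquery row.
  def notIn(outer: Row, rows: Seq[Row]): Tri =
    rows.foldLeft[Tri](Some(true)) { (acc, r) =>
      and3(acc, not3(and3(eq3(outer._1, r._1), eq3(outer._2, r._2))))
    }

  def main(args: Array[String]): Unit = {
    val t1Row: Row = (Some(2), Some(1)) // the single row of t1
    val t2 = Seq[Row]((Some(1), None))  // the single row of t2
    println(notIn(t1Row, t2))           // prints "Some(true)"
  }
}
```

Under this reading the predicate is true for the row (2,1), because `2 = 1` is already false and so the null in `b2` cannot change the outcome. That suggests the row should be returned rather than the empty result shown above.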



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
