[ https://issues.apache.org/jira/browse/SPARK-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Davies Liu updated SPARK-14757: ------------------------------- Assignee: Reynold Xin > Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left > table is joined to "null" on the right table > ------------------------------------------------------------------------------------------------------------------------- > > Key: SPARK-14757 > URL: https://issues.apache.org/jira/browse/SPARK-14757 > Project: Spark > Issue Type: Bug > Components: SQL > Affects Versions: 1.6.0 > Reporter: Hong Huang > Assignee: Reynold Xin > Fix For: 1.6.2, 2.0.0 > > > Content of table a: > |outgoing_0| > | false | > | true | > | null | > a has only one field: outgoing_0 > Content of table b: > |outgoing_1| > | false | > | true | > | null | > b has only one filed: outgoing_1 > After running this query: > {code} > select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1) > {code} > I got the following result: > |outgoing_0|outgoing_1| > | true | true | > | false | false | > | false | null | > | null | null | > The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. > The operator <=> should match null with null. > While left "false" is matched with right "null", it is also strange to find > that the "false" on the right table does not match with "null" on the left > table (no row with "null" as outgoing_0 and "false" as outgoing_1) > You can easily reproduce this bug by pasting the following code fragment: > {code} > case class A( outgoing_0: Option[Boolean] ) > case class B( outgoing_1: Option[Boolean] ) > {code} > {code} > val a = sc.parallelize( Seq( > A( Some( false ) ), > A( Some( true ) ), > A( None ) > ) ).toDF() > a.show > val b = sc.parallelize( Seq( > B( Some( false ) ), > B( Some( true ) ), > B( None ) > ) ).toDF() > b.show > a.registerTempTable( "a" ) > b.registerTempTable( "b" ) > sqlContext.sql( "select * from a FULL JOIN b ON ( > outgoing_0<=>outgoing_1)" ).show() > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org