GitHub user kevinyu98 opened a pull request: https://github.com/apache/spark/pull/9720

    [SPARK-11447][SQL] change NullType to StringType during binaryComparison between NullType and StringType

    While executing the PromoteStrings rule, if one side of a binary
    comparison is StringType and the other side is not StringType, the
    current code promotes (casts) the StringType side to DoubleType. If the
    string does not contain a number, the cast yields null. So when the
    comparison is <=> (NULL-safe equal) against NULL, the predicate compares
    null with null, which is true, and nothing is filtered out. This causes
    the problem reported in this JIRA.

    I propose the changes in this PR. Can you review my code changes?

    This problem only happens for <=>; other operators work fine.

    scala> val filteredDF = df.filter(df("column") > (new Column(Literal(null))))
    filteredDF: org.apache.spark.sql.DataFrame = [column: string]

    scala> filteredDF.show
    +------+
    |column|
    +------+
    +------+

    scala> val filteredDF = df.filter(df("column") === (new Column(Literal(null))))
    filteredDF: org.apache.spark.sql.DataFrame = [column: string]

    scala> filteredDF.show
    +------+
    |column|
    +------+
    +------+

    scala> df.registerTempTable("DF")

    scala> sqlContext.sql("select * from DF where 'column' = NULL")
    res27: org.apache.spark.sql.DataFrame = [column: string]

    scala> res27.show
    +------+
    |column|
    +------+
    +------+

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kevinyu98/spark working_on_spark-11447

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9720.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9720

----
commit b53b85cad4f5fced9ba003351d5a9af1eb5111fc
Author: Kevin Yu <q...@us.ibm.com>
Date:   2015-11-13T18:11:59Z

    [SPARK-11447]Check NullType before Promote StringType

commit bb705cae18032fcee8f8a532be464f0a995b27cb
Author: Kevin Yu <q...@us.ibm.com>
Date:   2015-11-15T06:41:48Z

    add testcase in ColumnExpressionSuite

----
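
The interaction between the string-to-double cast and <=> described in this pull request can be illustrated without Spark. The following is a minimal sketch, not the code from the patch: Option stands in for a nullable SQL value, and the names NullSafeEqualSketch, castToDouble, and nullSafeEq are hypothetical helpers introduced only for this illustration.

```scala
// Simplified model of the reported behavior; this is NOT Spark's implementation.
// Option[T] stands in for a nullable SQL value (None models NULL).
object NullSafeEqualSketch {
  // Models casting a string to double: a non-numeric string becomes NULL (None).
  def castToDouble(s: Option[String]): Option[Double] =
    s.flatMap(v => scala.util.Try(v.trim.toDouble).toOption)

  // Models <=> (NULL-safe equal): true when both sides are NULL,
  // or when both are non-NULL and equal.
  def nullSafeEq[A](left: Option[A], right: Option[A]): Boolean = left == right

  // Hypothetical sample data for a string column.
  val column: Seq[Option[String]] = Seq(Some("abc"), Some("1.5"), None)

  // Behavior described in the PR: the string side is cast to double first,
  // so "abc" turns into NULL and then matches NULL under <=>.
  val promotedToDouble: Seq[Option[String]] =
    column.filter(v => nullSafeEq(castToDouble(v), None: Option[Double]))

  // Direction proposed in the PR: treat the NULL literal as a string,
  // so the column value is compared without a cast to double.
  val nullAsString: Seq[Option[String]] =
    column.filter(v => nullSafeEq(v, None: Option[String]))

  def main(args: Array[String]): Unit = {
    println(promotedToDouble) // List(Some(abc), None)
    println(nullAsString)     // List(None)
  }
}
```

In this model the cast makes the non-numeric value "abc" indistinguishable from NULL, so it passes the <=> NULL filter, while comparing at the string type keeps only the row that is actually NULL.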