GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/22368
[SPARK-25368][SQL] Incorrect predicate pushdown returns wrong result
## What changes were proposed in this pull request?
How to reproduce:
```scala
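import org.apache.spark.sql.functions.{lit, spark_partition_id}
import spark.implicits._ // enables the $"..." column syntax (already in scope in spark-shell)

val df1 = spark.createDataFrame(Seq(
  (1, 1)
)).toDF("a", "b").withColumn("c", lit(null).cast("int"))
// c is NULL in every row, so the isNotNull filter should remove all rows
val df2 = df1.union(df1).withColumn("d",
  spark_partition_id()).filter($"c".isNotNull)
df2.show
// Actual (incorrect) output:
// +---+---+----+---+
// |  a|  b|   c|  d|
// +---+---+----+---+
// |  1|  1|null|  0|
// |  1|  1|null|  1|
// +---+---+----+---+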
```
`filter($"c".isNotNull)` was rewritten to `(null <=> c#10)` before
https://github.com/apache/spark/pull/19201, but it has been rewritten to
`(c#10 = null)` since https://github.com/apache/spark/pull/20155. This PR
reverts it to `(null <=> c#10)` to fix this issue.
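The two predicates behave differently on NULL input, which is why the choice matters here. The following is a minimal sketch in plain Scala, not Spark's implementation: it models SQL NULL as `None`, and the names `NullSafeEqualityDemo`, `sqlEquals`, `nullSafeEquals`, and `filterKeeps` are hypothetical helpers introduced only for illustration.

```scala
// Hypothetical model of SQL three-valued logic; this is not Spark code.
object NullSafeEqualityDemo {
  // None stands for SQL NULL.
  type SqlInt = Option[Int]

  // Models `a = b`: the result is NULL (None) if either operand is NULL.
  def sqlEquals(a: SqlInt, b: SqlInt): Option[Boolean] =
    for (x <- a; y <- b) yield x == y

  // Models `a <=> b`: the result is never NULL, and two NULLs compare equal.
  def nullSafeEquals(a: SqlInt, b: SqlInt): Boolean = a == b

  // A filter keeps a row only when its predicate is definitely true.
  def filterKeeps(predicate: Option[Boolean]): Boolean =
    predicate.contains(true)
}
```

Under this model, `sqlEquals(None, None)` is `None`, so a filter on `c = null` can never keep a row, whereas `nullSafeEquals(None, None)` is `true` and describes a column that is always NULL.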
## How was this patch tested?
unit tests
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/wangyum/spark SPARK-25368
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22368.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22368
----
commit 86b9b7892c94be68145453f9519e35a3574fe568
Author: Yuming Wang <yumwang@...>
Date: 2018-09-09T03:46:18Z
Fix SPARK-25368
commit 865e0af572edad7fd775c25e317055ffa0df2a08
Author: Yuming Wang <yumwang@...>
Date: 2018-09-09T04:22:29Z
Fix InferFiltersFromConstraintsSuite test error
----