Github user maropu commented on the issue:
https://github.com/apache/spark/pull/20174
(This is unrelated to this PR and fairly trivial, but I'll leave a comment
here anyway.) `PropagateEmptyRelation` does not collapse
`spark.emptyDataFrame.dropDuplicates` because `spark.emptyDataFrame` is backed
by an `ExistingRDD` rather than an empty `LocalRelation`;
```
scala> spark.emptyDataFrame.dropDuplicates.explain(true)
== Parsed Logical Plan ==
Deduplicate
+- AnalysisBarrier LogicalRDD false
== Analyzed Logical Plan ==
Deduplicate
+- LogicalRDD false
== Optimized Logical Plan ==
Aggregate
+- LogicalRDD false
== Physical Plan ==
*HashAggregate(keys=[], functions=[], output=[])
+- Exchange SinglePartition
+- *HashAggregate(keys=[], functions=[], output=[])
+- Scan ExistingRDD[]
scala> Seq.empty[Tuple2[Int, Int]].toDF("a",
"b").dropDuplicates.explain(true)
== Parsed Logical Plan ==
Deduplicate [a#8, b#9]
+- AnalysisBarrier Project [_1#5 AS a#8, _2#6 AS b#9]
== Analyzed Logical Plan ==
a: int, b: int
Deduplicate [a#8, b#9]
+- Project [_1#5 AS a#8, _2#6 AS b#9]
+- LocalRelation <empty>, [_1#5, _2#6]
== Optimized Logical Plan ==
LocalRelation <empty>, [a#8, b#9]
== Physical Plan ==
LocalTableScan <empty>, [a#8, b#9]
```
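The gist of why the rule fires in the second case but not the first can be sketched with a tiny stand-in plan tree. All names below (`Plan`, `EmptyLocalRelation`, `LogicalRDDScan`, `Deduplicate`, `propagateEmpty`) are illustrative placeholders, not the real Catalyst classes:

```scala
// Hypothetical miniature of the idea behind PropagateEmptyRelation.
sealed trait Plan
case object EmptyLocalRelation extends Plan        // provably empty relation
case object LogicalRDDScan extends Plan            // opaque RDD scan; emptiness unknown
case class Deduplicate(child: Plan) extends Plan

// The rule only fires when the child is *provably* empty. An RDD-backed
// scan gives the optimizer no such proof, so Deduplicate survives on it.
def propagateEmpty(plan: Plan): Plan = plan match {
  case Deduplicate(EmptyLocalRelation) => EmptyLocalRelation
  case other                           => other
}
```

Under this sketch, `Deduplicate(EmptyLocalRelation)` collapses to `EmptyLocalRelation`, while `Deduplicate(LogicalRDDScan)` is left untouched, mirroring the two plans above.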