Github user liancheng commented on the issue:
https://github.com/apache/spark/pull/20174
@mgaido91 We can't because we do not know whether there are any input rows
or not. For example:
```scala
val df1 = spark.range(10).select()
val df2 = spark.range(10).filter($"id" < 0).select()
val df3 = df1.dropDuplicates()
val df4 = df2.dropDuplicates()
```
`df1` has zero columns and ten rows, while `df2` has zero columns and zero
rows. Therefore, `df3` should return one row containing zero columns, while
`df4` should return zero rows.
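
The same distinction can be seen without Spark. The following is only a plain-collections analogy, not how Spark implements `dropDuplicates`: the `Row` alias and the `ZeroColumnDedup` object are hypothetical names made up for this sketch. All zero-column rows are equal to one another, so deduplicating a non-empty input leaves exactly one row, while an empty input stays empty:

```scala
// Plain Scala collections standing in for DataFrames (illustrative only).
object ZeroColumnDedup {
  // Hypothetical stand-in for a row: a vector of column values.
  type Row = Vector[Any]

  // Ten rows with zero columns each, analogous to df1.
  val rows1: Seq[Row] = Seq.fill(10)(Vector.empty[Any])

  // Zero rows with zero columns, analogous to df2.
  val rows2: Seq[Row] = Seq.empty[Row]

  // Deduplication keeps one representative per distinct row.
  val dedup1: Seq[Row] = rows1.distinct
  val dedup2: Seq[Row] = rows2.distinct

  def main(args: Array[String]): Unit = {
    println(dedup1.size) // 1: a single row with zero columns
    println(dedup2.size) // 0: no rows at all
  }
}
```

The result depends only on whether the input has at least one row, which cannot be determined from the schema alone.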