Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/21291#discussion_r187529583
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2767,7 +2767,12 @@ class Dataset[T] private[sql](
* @since 1.6.0
*/
def count(): Long = withAction("count",
groupBy().count().queryExecution) { plan =>
- plan.executeCollect().head.getLong(0)
+ val collected = plan.executeCollect()
+ if (collected.isEmpty) {
+ 0
+ } else {
+ collected.head.getLong(0)
+ }
--- End diff ---
`spark.range(-10, -9, -20, 1).select("id").count` in `DataFrameRangeSuite`
causes an exception here: the negative step with start < end yields an empty
range, so `plan.executeCollect()` returns no rows and `.head` calls `next`
on an empty iterator.
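
The guard the diff adds can be sketched in plain Scala (no Spark needed; `countFrom` is a hypothetical stand-in for the collected-rows handling, not Spark API): taking `.head` of an empty collection throws `NoSuchElementException`, while checking emptiness first returns a safe default.

```scala
// Minimal sketch of the empty-result guard from the diff above.
// `countFrom` is a hypothetical helper, not part of Spark.
object EmptyHeadDemo {
  def countFrom(rows: Array[Long]): Long =
    if (rows.isEmpty) {
      0L                 // empty plan output: report zero, as the diff does
    } else {
      rows.head          // safe: the array is known to be non-empty
    }

  def main(args: Array[String]): Unit = {
    println(countFrom(Array.empty[Long]))
    println(countFrom(Array(42L)))
  }
}
```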
---