GitHub user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22456#discussion_r218666270
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -31,7 +31,7 @@ import org.apache.spark.util.Utils
 /**
  * Result returned by a ShuffleMapTask to a scheduler. Includes the block manager address that the
- * task ran on, the sizes of outputs for each reducer, and the number of outputs of the map task,
+ * task ran on, the sizes of outputs for each reducer, and the number of records of the map task,
--- End diff --
Size is about bytes, so it doesn't really matter whether it's a record, a
row, or a block. It's also already pointed out below that it's about bytes.