GitHub user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21527#discussion_r218639496
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
    @@ -50,7 +50,9 @@ private[spark] sealed trait MapStatus {
     private[spark] object MapStatus {
       def apply(loc: BlockManagerId, uncompressedSizes: Array[Long]): MapStatus = {
    -    if (uncompressedSizes.length > 2000) {
    +    if (uncompressedSizes.length > Option(SparkEnv.get)
--- End diff --
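This threshold lookup should be done once rather than on every call to
MapStatus.apply, shouldn't it? As written, it might introduce a hot code path
for very large workloads, since every MapStatus that gets constructed repeats
the lookup.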
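
To illustrate the suggestion, here is a minimal, self-contained sketch of caching the threshold in a lazy val so the lookup runs at most once. It is not the PR's code: FakeConf, StatusFactory, the config key, and the returned strings are all hypothetical stand-ins, and the default of 2000 is taken from the removed line in the diff.

```scala
// Hypothetical stand-in for a config lookup; this is NOT Spark's API.
object FakeConf {
  var lookups = 0 // counts how often the config is consulted

  def getInt(key: String, default: Int): Int = {
    lookups += 1
    default // a real implementation would resolve `key` here
  }
}

// Hypothetical factory mirroring the shape of MapStatus.apply.
object StatusFactory {
  // Evaluated at most once, on first use, then reused by every later call.
  private lazy val minPartitionsToCompress: Int =
    FakeConf.getInt("hypothetical.minPartitionsToCompress", 2000)

  // Reports which representation would be chosen for the given sizes.
  def choose(uncompressedSizes: Array[Long]): String =
    if (uncompressedSizes.length > minPartitionsToCompress) "highly-compressed"
    else "compressed"
}
```

One trade-off to keep in mind: a lazy val keeps whatever value it computed on first access, so if the environment is not yet initialized at that point, the fallback default stays in effect for the lifetime of the process.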
---