Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19476#discussion_r144857068
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -653,15 +663,34 @@ private[spark] class BlockManager(
require(blockId != null, "BlockId is null")
var runningFailureCount = 0
var totalFailureCount = 0
- val locations = getLocations(blockId)
+
+ // Because all the remote blocks are registered in the driver, it is not
+ // necessary to ask all the slave executors for the block status.
+ val locationAndStatus = master.getLocationsAndStatus(blockId)
+
+ val blockSize = locationAndStatus._2.map { status =>
+ // Disk size and mem size cannot co-exist, so it's ok to sum them
+ // together to get the block size.
+ status.diskSize + status.memSize
--- End diff ---
I think it's dangerous: it assumes the default value of
diskSize/memSize is always 0, but that is not guaranteed.
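For illustration only, a more defensive option would be to gate each size
field on the block's StorageLevel instead of trusting zero defaults.
`blockSizeOf` below is a hypothetical helper sketch, not code from this PR:

```scala
import org.apache.spark.storage.BlockStatus

// Hypothetical defensive sizing: only trust a size field when the block's
// storage level says that tier is actually in use, rather than assuming
// the unused field defaults to 0.
def blockSizeOf(status: BlockStatus): Long = {
  val memPart  = if (status.storageLevel.useMemory) status.memSize else 0L
  val diskPart = if (status.storageLevel.useDisk) status.diskSize else 0L
  // If the block lives in a single tier this equals that tier's size; if
  // both tiers report a size, take the larger rather than double-counting.
  math.max(memPart, diskPart)
}
```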
---