Github user juanrh commented on a diff in the pull request:
https://github.com/apache/spark/pull/19583#discussion_r148078337
--- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ---
@@ -241,6 +243,21 @@ final class ShuffleBlockFetcherIterator(
logError(s"Failed to get block(s) from
${req.address.host}:${req.address.port}", e)
results.put(new FailureFetchResult(BlockId(blockId), address, e))
}
+
+ override def shouldRetry(t: Throwable): Boolean = {
--- End diff ---
The idea here is that `shouldRetry` re-checks the map from `BlockManagerId` to blocks. Because `_blocksByAddress` is a by-name parameter, `_blocksByAddress.toMap` actually recomputes the `Seq[(BlockManagerId, Seq[(BlockId, Long)])]` that `BlockStoreShuffleReader.read` passes in as `mapOutputTracker.getMapSizesByExecutorId(handle.shuffleId, startPartition, endPartition)`. So `shouldRetry` indirectly calls `mapOutputTracker.getMapSizesByExecutorId` again, which may even raise a `MetadataFetchFailedException` for the missing executor.
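
For context, here is a minimal, self-contained sketch of the by-name semantics at play. The names (`ByNameDemo`, `DemoFetcher`, `fetchMapSizes`) are hypothetical stand-ins, not Spark's actual classes; the point is only that every reference to a by-name parameter re-evaluates the argument expression:

```scala
// Minimal sketch (hypothetical names, not Spark code) of why a
// by-name parameter is recomputed on every access.
object ByNameDemo {
  def main(args: Array[String]): Unit = {
    var calls = 0

    // Stand-in for mapOutputTracker.getMapSizesByExecutorId: the side
    // effect on `calls` makes each re-evaluation visible.
    def fetchMapSizes(): Seq[(String, Seq[(String, Long)])] = {
      calls += 1
      Seq(("executor-1", Seq(("shuffle_0_0_0", 42L))))
    }

    // blocksByAddress is by-name (=>), mirroring _blocksByAddress in
    // ShuffleBlockFetcherIterator: each reference re-runs the argument.
    class DemoFetcher(blocksByAddress: => Seq[(String, Seq[(String, Long)])]) {
      def initialize(): Unit = { val _ = blocksByAddress }               // 1st evaluation
      def shouldRetry(t: Throwable): Boolean = blocksByAddress.nonEmpty  // 2nd evaluation
    }

    val fetcher = new DemoFetcher(fetchMapSizes())
    fetcher.initialize()
    fetcher.shouldRetry(new RuntimeException("simulated fetch failure"))
    println(s"fetchMapSizes was called $calls times") // prints 2
  }
}
```

Running it prints `fetchMapSizes was called 2 times`: the second evaluation corresponds to `shouldRetry` touching `_blocksByAddress` again in the real code.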