GitHub user juanrh commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19583#discussion_r148078337
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ---
    @@ -241,6 +243,21 @@ final class ShuffleBlockFetcherIterator(
             logError(s"Failed to get block(s) from 
${req.address.host}:${req.address.port}", e)
             results.put(new FailureFetchResult(BlockId(blockId), address, e))
           }
    +
    +      override def shouldRetry(t: Throwable): Boolean = {
    --- End diff ---
    
    The idea here is that `shouldRetry` re-reads the map from
    `BlockManagerId` to blocks: because `_blocksByAddress` is a by-name
    parameter, `_blocksByAddress.toMap` actually recomputes the
    `Seq[(BlockManagerId, Seq[(BlockId, Long)])]` that is passed in
    `BlockStoreShuffleReader.read` as
    `mapOutputTracker.getMapSizesByExecutorId(handle.shuffleId,
    startPartition, endPartition)`. So `shouldRetry` indirectly calls
    `mapOutputTracker.getMapSizesByExecutorId`, and this might even lead to
    a `MetadataFetchFailedException` for the missing executor.
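
    To make the by-name recomputation concrete, here is a minimal
    standalone sketch (not Spark code; the hypothetical `fetchSizes` stands
    in for `mapOutputTracker.getMapSizesByExecutorId`). Every reference to
    a by-name parameter re-evaluates the argument expression supplied at
    the call site:

    ```scala
    // Minimal sketch of Scala by-name semantics; not Spark code.
    object ByNameDemo {
      var calls = 0

      // Stand-in for mapOutputTracker.getMapSizesByExecutorId.
      def fetchSizes(): Seq[(String, Long)] = {
        calls += 1
        Seq(("exec-1", 100L))
      }

      // `blocksByAddress` is by-name (note the `=>`): the argument
      // expression runs every time the parameter is referenced, just as
      // `_blocksByAddress` does in ShuffleBlockFetcherIterator.
      def iterator(blocksByAddress: => Seq[(String, Long)]): Unit = {
        val first = blocksByAddress.toMap  // evaluates fetchSizes()
        val second = blocksByAddress.toMap // evaluates fetchSizes() again
        println(s"fetchSizes() was called $calls times") // prints 2
      }

      def main(args: Array[String]): Unit = iterator(fetchSizes())
    }
    ```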

