maheshk114 commented on code in PR #45228:
URL: https://github.com/apache/spark/pull/45228#discussion_r1542847717
##########
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala:
##########
@@ -398,8 +410,13 @@ final class ShuffleBlockFetcherIterator(
var pushMergedLocalBlockBytes = 0L
val prevNumBlocksToFetch = numBlocksToFetch
- val fallback = FallbackStorage.FALLBACK_BLOCK_MANAGER_ID.executorId
- val localExecIds = Set(blockManager.blockManagerId.executorId, fallback)
+ // Fallback to original implementation, if thread pool is not enabled.
+ val localExecIds = if (FallbackStorage.getNumReadThreads(blockManager.conf) > 0) {
Review Comment:
@attilapiros Thanks for taking a look at this PR. In the current code, shuffle
data read from local disk and from external storage is handled in the same way:
the read first tries the local disk and, if that fails, falls back to external
storage (BlockManager.getLocalBlockData). This PR changes that behavior:
shuffle data written to external storage is treated as a remote block, and if a
remote block id points to external storage, the read is done in a separate
thread. The code change here ensures that, when the fallback storage read
thread pool is configured, fallback storage (external storage) blocks are
processed as remote blocks; otherwise they are processed as local blocks.
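To make the classification rule concrete, here is a minimal, self-contained sketch of the decision described above. The names (`FallbackClassificationSketch`, `fallbackExecId`, `numReadThreads`) are illustrative stand-ins, not Spark's actual API; the real code uses `FallbackStorage.FALLBACK_BLOCK_MANAGER_ID.executorId` and `FallbackStorage.getNumReadThreads(conf)` as shown in the diff.

```scala
// Sketch of the local-vs-remote classification for fallback storage blocks.
// When the read thread pool is enabled (numReadThreads > 0), the fallback
// executor id is excluded from the local set, so those blocks are fetched
// as remote blocks on a separate thread.
object FallbackClassificationSketch {
  // Stand-in for FallbackStorage.FALLBACK_BLOCK_MANAGER_ID.executorId
  val fallbackExecId: String = "fallback"

  def localExecIds(localExecId: String, numReadThreads: Int): Set[String] =
    if (numReadThreads > 0) {
      Set(localExecId)                  // fallback blocks treated as remote
    } else {
      Set(localExecId, fallbackExecId)  // original behavior: treated as local
    }

  def main(args: Array[String]): Unit = {
    // Thread pool enabled: fallback id is not local, so its blocks go remote.
    assert(!localExecIds("exec-1", numReadThreads = 4).contains(fallbackExecId))
    // Thread pool disabled: fallback id stays in the local set.
    assert(localExecIds("exec-1", numReadThreads = 0).contains(fallbackExecId))
    println("ok")
  }
}
```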
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]