attilapiros commented on a change in pull request #24499: [SPARK-25888][Core]
Serve local disk persisted blocks by the external service after releasing
executor by dynamic allocation
URL: https://github.com/apache/spark/pull/24499#discussion_r279920833
##########
File path:
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala
##########
@@ -427,6 +446,15 @@ class BlockManagerMasterEndpoint(
locations.remove(blockManagerId)
}
+ if (storageLevel.useDisk && externalShuffleServiceEnabled) {
Review comment:
No, because here the storage level reflects the actual state and not the
desired state.
Details:
The `UpdateBlockInfo` message is only created by
`org.apache.spark.storage.BlockManagerMaster#updateBlockInfo`, which is only
called by `org.apache.spark.storage.BlockManager#tryToReportBlockStatus`, which
in turn is called by:
1) `org.apache.spark.storage.BlockManager#reportAllBlocks`
2) `org.apache.spark.storage.BlockManager#reportBlockStatus`
In 1) you can see within the same method that the `BlockStatus` is created by
`org.apache.spark.storage.BlockManager#getCurrentBlockStatus`, where the
`useDisk` property is only true if `diskStore` really contains the block:
https://github.com/apache/spark/blob/3e7797aa02a48e653b3c167a62cba6512a1d3bcb/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L693-L697
For 2) there is a comment which says explicitly that here the storage level
reflects the actual and not the desired state:
https://github.com/apache/spark/blob/3e7797aa02a48e653b3c167a62cba6512a1d3bcb/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L644-L653
I also checked the callers of
`org.apache.spark.storage.BlockManager#reportBlockStatus`, and the
`BlockStatus` they pass is either `BlockStatus.empty` or computed by
`org.apache.spark.storage.BlockManager#getCurrentBlockStatus`, which is analysed
in point 1).
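
To illustrate the argument in point 1), here is a minimal, self-contained sketch of how a reported level can end up reflecting the actual rather than the desired state. It is not Spark's code: `Level`, `ActualStateSketch`, and `currentLevel` are hypothetical names, and the stores are modelled as plain sets of block ids. The only assumption taken from the comment above is that `useDisk` is reported as true only when the disk store really contains the block.

```scala
// Hypothetical, simplified stand-in for a storage level (not Spark's StorageLevel).
final case class Level(useDisk: Boolean, useMemory: Boolean)

object ActualStateSketch {
  // `desired` is the level the caller asked for.
  // `memoryStore` and `diskStore` model which block ids each store really holds.
  def currentLevel(
      blockId: String,
      desired: Level,
      memoryStore: Set[String],
      diskStore: Set[String]): Level = {
    // A flag is reported only if it was requested AND the store contains the block.
    val inMem = desired.useMemory && memoryStore.contains(blockId)
    val onDisk = desired.useDisk && diskStore.contains(blockId)
    Level(useDisk = onDisk, useMemory = inMem)
  }
}
```

With a level computed this way, a block requested with both memory and disk but currently held only in memory is reported with `useDisk = false`, so a check like `storageLevel.useDisk && externalShuffleServiceEnabled` on the master side would only fire for blocks that are really on disk.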