attilapiros commented on a change in pull request #24499: [SPARK-25888][Core] 
Serve local disk persisted blocks by the external service after releasing 
executor by dynamic allocation
URL: https://github.com/apache/spark/pull/24499#discussion_r279920833
 
 

 ##########
 File path: 
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala
 ##########
 @@ -427,6 +446,15 @@ class BlockManagerMasterEndpoint(
       locations.remove(blockManagerId)
     }
 
+    if (storageLevel.useDisk && externalShuffleServiceEnabled) {
 
 Review comment:
   No, because here the storage level reflects the actual state, not the 
desired state. 
   
   Details:
   The `UpdateBlockInfo` message is only created by 
`org.apache.spark.storage.BlockManagerMaster#updateBlockInfo`, which is only 
called by `org.apache.spark.storage.BlockManager#tryToReportBlockStatus`, which 
in turn is called by:
   1) `org.apache.spark.storage.BlockManager#reportAllBlocks`
   2) `org.apache.spark.storage.BlockManager#reportBlockStatus`
   
   In 1) you can see within the same method that `BlockStatus` is created by 
`org.apache.spark.storage.BlockManager#getCurrentBlockStatus`, where the 
`useDisk` property is true only if `diskStore` really contains the block:
   
   
https://github.com/apache/spark/blob/3e7797aa02a48e653b3c167a62cba6512a1d3bcb/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L693-L697
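   To make the point concrete, here is a minimal, self-contained sketch of the 
idea, not the actual Spark code. `Level`, `Status`, `ActualStateSketch` and 
`currentStatus` are hypothetical stand-ins for `StorageLevel`, `BlockStatus` and 
`getCurrentBlockStatus`, and the two maps are assumed substitutes for 
`memoryStore` and `diskStore`:

```scala
// Hypothetical, simplified stand-ins for Spark's StorageLevel and BlockStatus.
final case class Level(useDisk: Boolean, useMemory: Boolean)
final case class Status(level: Level, memSize: Long, diskSize: Long)

object ActualStateSketch {
  // desired:       the level the block was requested to be stored at
  // inMemoryStore: block id -> size in bytes currently held in memory (assumed model)
  // inDiskStore:   block id -> size in bytes currently held on disk (assumed model)
  def currentStatus(
      blockId: String,
      desired: Level,
      inMemoryStore: Map[String, Long],
      inDiskStore: Map[String, Long]): Status = {
    // A flag is set only if the level allows it AND the store really has the block.
    val inMem = desired.useMemory && inMemoryStore.contains(blockId)
    val onDisk = desired.useDisk && inDiskStore.contains(blockId)
    Status(
      Level(useDisk = onDisk, useMemory = inMem),
      memSize = if (inMem) inMemoryStore(blockId) else 0L,
      diskSize = if (onDisk) inDiskStore(blockId) else 0L)
  }
}
```

   Under these assumptions, a block requested with both memory and disk enabled 
but currently held only in memory is reported with `useDisk == false`, so a 
check like `storageLevel.useDisk` on the receiving side sees the actual state.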
   
   For 2) there is a comment which explicitly says that the storage level 
reflects the actual and not the desired state: 
   
https://github.com/apache/spark/blob/3e7797aa02a48e653b3c167a62cba6512a1d3bcb/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L644-L653
   
   I also checked the callers of 
`org.apache.spark.storage.BlockManager#reportBlockStatus`, and in each of them 
the `BlockStatus` is either `BlockStatus.empty` or computed by 
`org.apache.spark.storage.BlockManager#getCurrentBlockStatus`, which is analysed 
in point 1).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
