mridulm commented on code in PR #39459:
URL: https://github.com/apache/spark/pull/39459#discussion_r1098177596


##########
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala:
##########
@@ -77,6 +77,11 @@ class BlockManagerMasterEndpoint(
   // Mapping from block id to the set of block managers that have the block.
   private val blockLocations = new JHashMap[BlockId, 
mutable.HashSet[BlockManagerId]]
 
+  // Mapping from task id to the set of rdd blocks which are generated from 
the task.
+  private val tidToRddBlockIds = new mutable.HashMap[Long, 
mutable.HashSet[RDDBlockId]]
+  // Record the visible RDD blocks which have been generated at least from one 
successful task.
+  private val visibleRDDBlocks = new mutable.HashSet[RDDBlockId]

Review Comment:
   Not sure I follow - we cannot leverage the block until it is explicitly 
marked visible (unless within same task).
   If the executor is lost before that happens, then we can never make it 
visible - and so fine to clean up from invisible blocks, right ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to