Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10776#discussion_r49962876
  
    --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala ---
    @@ -68,12 +67,8 @@ private[spark] class CacheManager(blockManager: BlockManager) extends Logging {
              logInfo(s"Partition $key not found, computing it")
              val computedValues = rdd.computeOrReadCheckpoint(partition, context)
    
    -          // Otherwise, cache the values and keep track of any updates in block statuses
    -          val updatedBlocks = new ArrayBuffer[(BlockId, BlockStatus)]
    -          val cachedValues = putInBlockManager(key, computedValues, storageLevel, updatedBlocks)
    -          val metrics = context.taskMetrics
    -          val lastUpdatedBlocks = metrics.updatedBlocks.getOrElse(Seq[(BlockId, BlockStatus)]())
    -          metrics.updatedBlocks = Some(lastUpdatedBlocks ++ updatedBlocks.toSeq)
    --- End diff --
    
    This update should be pushed into the `putInBlockManager` call itself, but 
there the buffer is only used by `memoryStore.unrollSafely`. Since 
`unrollSafely` can trigger evictions, I think this update would be covered by 
simply pushing the block-status update into the memory store's eviction 
pathway.


