[ https://issues.apache.org/jira/browse/SPARK-12757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15092646#comment-15092646 ]
Apache Spark commented on SPARK-12757:
--------------------------------------
User 'JoshRosen' has created a pull request for this issue:
https://github.com/apache/spark/pull/10705
> Use reference counting to prevent blocks from being evicted during reads
> ------------------------------------------------------------------------
>
> Key: SPARK-12757
> URL: https://issues.apache.org/jira/browse/SPARK-12757
> Project: Spark
> Issue Type: Improvement
> Components: Block Manager
> Reporter: Josh Rosen
> Assignee: Josh Rosen
>
> As a prerequisite to off-heap caching of blocks, we need a mechanism to
> prevent pages / blocks from being evicted while they are being read. With
> on-heap objects, evicting a block while it is being read merely leads to
> memory-accounting problems (because we assume that an evicted block is a
> candidate for garbage collection, which will not be true during a read), but
> with off-heap memory it will lead to either data corruption or segmentation
> faults.
>
> To address this, we should add a reference-counting mechanism that tracks
> which blocks / pages are being read, in order to prevent them from being
> evicted prematurely. I propose to do this in two phases. In phase one, add a
> safe, conservative approach in which all BlockManager.get*() calls implicitly
> increment the reference count of blocks and in which tasks' references are
> automatically freed upon task completion. This will be correct but may hurt
> performance, because a block stays pinned until its task finishes and so
> cannot be evicted even after the read is done. In phase two, we should
> incrementally add release() calls so that blocks which are no longer being
> read become evictable again. The latter change may need to touch many
> different components, which is why I propose to do it separately, in order to
> make the changes easier to reason about and review.
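
To make the two phases concrete, here is a minimal sketch of the bookkeeping the description implies. It is not the code from the pull request: BlockPinTracker and its pin / release / releaseAllForTask / canEvict methods are hypothetical names invented for illustration, and block IDs are modeled as plain strings. The only assumptions are the ones stated above: every read takes a reference, a task's references are dropped when it completes, and an explicit release() drops one early.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of per-task reference counts ("pins") on blocks.
 * None of these names come from Spark; they are illustrative only.
 */
final class BlockPinTracker {
    // blockId -> total outstanding read references across all tasks
    private final Map<String, Integer> totalPins = new HashMap<>();
    // taskId -> (blockId -> references held by that task)
    private final Map<Long, Map<String, Integer>> pinsByTask = new HashMap<>();

    /** Phase one: every read (the get*() path) implicitly takes a reference. */
    synchronized void pin(long taskId, String blockId) {
        totalPins.merge(blockId, 1, Integer::sum);
        pinsByTask.computeIfAbsent(taskId, t -> new HashMap<>())
                  .merge(blockId, 1, Integer::sum);
    }

    /** Phase two: an explicit release drops one reference before task end. */
    synchronized void release(long taskId, String blockId) {
        Map<String, Integer> held = pinsByTask.get(taskId);
        if (held == null || !held.containsKey(blockId)) {
            throw new IllegalStateException(
                "task " + taskId + " holds no reference to " + blockId);
        }
        decrement(held, blockId, 1);
        if (held.isEmpty()) {
            pinsByTask.remove(taskId);
        }
        decrement(totalPins, blockId, 1);
    }

    /** Phase one safety net: drop every reference a finished task still holds. */
    synchronized void releaseAllForTask(long taskId) {
        Map<String, Integer> held = pinsByTask.remove(taskId);
        if (held == null) {
            return;
        }
        for (Map.Entry<String, Integer> e : held.entrySet()) {
            decrement(totalPins, e.getKey(), e.getValue());
        }
    }

    /** The eviction path would check this and skip any block that is pinned. */
    synchronized boolean canEvict(String blockId) {
        return !totalPins.containsKey(blockId);
    }

    private static void decrement(Map<String, Integer> counts, String key, int by) {
        // computeIfPresent removes the mapping when the function returns null
        counts.computeIfPresent(key, (k, v) -> v - by <= 0 ? null : v - by);
    }
}
```

With only pin() and releaseAllForTask() in use, a block read once by a long-running task stays unevictable for the task's whole lifetime, which is the performance cost of phase one. Adding release() calls at the points where a read finishes is what lets canEvict() return true sooner.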