[
https://issues.apache.org/jira/browse/SPARK-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Josh Rosen reopened SPARK-6157:
-------------------------------
> Unrolling with MEMORY_AND_DISK should always release memory
> -----------------------------------------------------------
>
> Key: SPARK-6157
> URL: https://issues.apache.org/jira/browse/SPARK-6157
> Project: Spark
> Issue Type: Bug
> Components: Block Manager
> Affects Versions: 1.2.1
> Reporter: SuYan
>
> === EDIT by andrewor14 ===
> The existing description was somewhat confusing, so here's a more succinct
> version of it.
> If unrolling a block with MEMORY_AND_DISK was unsuccessful, we will drop the
> block to disk
> directly. After doing so, however, we don't need the underlying array that
> held the partial
> values anymore, so we should release the pending unroll memory for other
> tasks on the same
> executor. Otherwise, other tasks may unnecessarily drop their blocks to disk
> due to the lack
> of unroll space, resulting in worse performance.
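The accounting problem described above can be illustrated with a toy model of an executor's per-task unroll-memory pool. All names here (`UnrollMemoryPool`, `reserve`, `releasePending`) are illustrative only, not Spark's actual MemoryStore API:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of an executor's unroll-memory pool (illustrative sketch,
// not Spark's real MemoryStore).
class UnrollMemoryPool {
    private final long capacity;
    private long used = 0;
    private final Map<Long, Long> pendingByTask = new HashMap<>();

    UnrollMemoryPool(long capacity) { this.capacity = capacity; }

    // A task reserves unroll memory while materializing a block.
    boolean reserve(long taskId, long bytes) {
        if (used + bytes > capacity) return false; // no unroll space left
        used += bytes;
        pendingByTask.merge(taskId, bytes, Long::sum);
        return true;
    }

    // After the block is dropped to disk, the partially unrolled array is
    // no longer needed, so its memory should be returned to the pool
    // immediately rather than only at task end.
    void releasePending(long taskId) {
        Long bytes = pendingByTask.remove(taskId);
        if (bytes != null) used -= bytes;
    }

    long free() { return capacity - used; }
}

public class Spark6157Sketch {
    public static void main(String[] args) {
        UnrollMemoryPool pool = new UnrollMemoryPool(100);

        // Task 1 fails to fully unroll a block and drops it to disk,
        // but its 80 bytes of unroll memory stay reserved.
        pool.reserve(1L, 80);

        // Without the fix, task 2 cannot get 40 bytes of unroll space:
        boolean before = pool.reserve(2L, 40); // only 20 bytes free

        // With the fix, task 1 releases its pending unroll memory right
        // after the drop-to-disk, and task 2 succeeds:
        pool.releasePending(1L);
        boolean after = pool.reserve(2L, 40);

        System.out.println(before + " " + after);
    }
}
```

In this toy scenario, `before` is false and `after` is true: releasing the pending unroll memory right after the drop is what lets the second task unroll in memory instead of spilling.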
> === Original comment ===
> Current behavior when caching a MEMORY_AND_DISK level block:
> 1. We try to put the block in memory and unrolling fails, but the unroll
> memory stays reserved because we obtained an iterator over the partially
> unrolled array.
> 2. The block is then written to disk.
> 3. On get(blockId) we read the value back from disk and iterate over it; the
> unroll array is no longer used for anything, so the reserved unroll memory
> should be released at this point instead of being held until the task ends.
> Also, somebody has already opened a pull request related to reading a
> MEMORY_AND_DISK level block: when caching it back into memory from disk, we
> should use file.length to check whether the block fits in the memory store,
> instead of simply allocating a buffer of file.length bytes, which may lead
> to OOM.
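The second point can be sketched as a capacity check before the allocation. This is an illustrative sketch only (the method and parameter names are hypothetical, not the code from the pull request): check the on-disk file length against the memory store's free space first, and fall back to serving the block from disk when it does not fit.

```java
import java.nio.ByteBuffer;

// Illustrative sketch (not Spark's actual code): when re-caching a block
// from disk into memory, check the file length against free memory before
// allocating, instead of blindly allocating a file-length buffer.
public class DiskToMemorySketch {
    static ByteBuffer maybeCacheInMemory(long fileLength, long freeMemory) {
        if (fileLength > freeMemory) {
            // Not enough room: serve the block from disk rather than
            // allocating a buffer that may trigger an OOM.
            return null;
        }
        return ByteBuffer.allocate((int) fileLength);
    }

    public static void main(String[] args) {
        // 1 MB block with 10 MB free: cached in memory.
        System.out.println(maybeCacheInMemory(1L << 20, 10L << 20) != null);
        // 20 MB block with 10 MB free: left on disk.
        System.out.println(maybeCacheInMemory(20L << 20, 10L << 20) == null);
    }
}
```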
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)