[ 
https://issues.apache.org/jira/browse/SPARK-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Or updated SPARK-6157:
-----------------------------
    Description: 
=== EDIT by andrewor14 ===

The existing description was somewhat confusing, so here's a more succinct 
version of it.

If unrolling a block with MEMORY_AND_DISK was unsuccessful, we will drop the 
block to disk
directly. After doing so, however, we don't need the underlying array that held 
the partial
values anymore, so we should release the pending unroll memory for other tasks 
on the same
executor. Otherwise, other tasks may unnecessarily drop their blocks to disk 
due to the lack
of unroll space, resulting in worse performance.
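The proposed accounting can be sketched with a toy class (hypothetical names, not Spark's actual MemoryStore API): unroll memory is reserved while materializing values, and released as soon as the partially unrolled block has been dropped to disk, rather than held until the task ends.

```scala
// Toy sketch of unroll-memory accounting (illustrative only; the real logic
// lives in Spark's MemoryStore and uses different method names).
class ToyMemoryStore(maxUnrollMemory: Long) {
  private var pendingUnrollMemory: Long = 0L

  // Reserve unroll memory while materializing a block's values.
  def reserveUnrollMemory(bytes: Long): Boolean =
    if (pendingUnrollMemory + bytes <= maxUnrollMemory) {
      pendingUnrollMemory += bytes
      true
    } else false

  // The fix: once the partially unrolled values are written to disk,
  // the backing array is no longer needed, so free its reservation
  // immediately for other tasks on the same executor.
  def releasePendingUnrollMemory(bytes: Long): Unit =
    pendingUnrollMemory = math.max(0L, pendingUnrollMemory - bytes)

  def pending: Long = pendingUnrollMemory
}

val store = new ToyMemoryStore(1000L)
assert(store.reserveUnrollMemory(600L)) // unroll fails partway; 600 bytes held
// ... the block is dropped to disk here ...
store.releasePendingUnrollMemory(600L)  // proposed fix: release right away
assert(store.pending == 0L)             // other tasks can now unroll
```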

=======================

Current code:
Suppose we want to cache a block at MEMORY_AND_DISK level.
1. We try to put the block in memory, but unrolling is unsuccessful. The unroll 
memory stays reserved, because we got an iterator backed by the unroll array.
2. We then put the block on disk.
3. A later get(blockId) reads the value from disk and iterates over it, so the 
unroll array is no longer used. We should release the reserved unroll memory at 
this point, instead of holding it until the task ends.

Separately, somebody has already opened a pull request for getting a 
MEMORY_AND_DISK level block: when caching it back into memory from disk, we 
should use file.length to check whether the block fits in the memory store, 
instead of just allocating a buffer of file.length bytes, which may lead to 
OOM.
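The second suggestion amounts to a size check before allocation. A minimal sketch (the function name is hypothetical; Spark's real code path reads the block back through the disk store):

```scala
import java.io.{File, FileOutputStream}

// Before reading a dropped block back from disk into memory, check the
// serialized file length against free memory instead of blindly allocating
// a file-sized buffer, which can OOM the executor.
def canCacheFromDisk(file: File, freeMemory: Long): Boolean =
  file.length() <= freeMemory

// Simulate a 1 KiB block that was dropped to disk.
val tmp = File.createTempFile("block", ".bin")
tmp.deleteOnExit()
val out = new FileOutputStream(tmp)
try out.write(new Array[Byte](1024)) finally out.close()

assert(canCacheFromDisk(tmp, freeMemory = 2048L))  // fits: safe to cache
assert(!canCacheFromDisk(tmp, freeMemory = 512L))  // too big: skip, avoid OOM
```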

  was:
Current code:
Suppose we want to cache a block at MEMORY_AND_DISK level.
1. We try to put the block in memory, but unrolling is unsuccessful. The unroll 
memory stays reserved, because we got an iterator backed by the unroll array.
2. We then put the block on disk.
3. A later get(blockId) reads the value from disk and iterates over it, so the 
unroll array is no longer used. We should release the reserved unroll memory at 
this point, instead of holding it until the task ends.

Separately, somebody has already opened a pull request for getting a 
MEMORY_AND_DISK level block: when caching it back into memory from disk, we 
should use file.length to check whether the block fits in the memory store, 
instead of just allocating a buffer of file.length bytes, which may lead to 
OOM.


> Unroll unsuccessful memory_and_disk level block should release reserved 
> unroll memory after put success in disk
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-6157
>                 URL: https://issues.apache.org/jira/browse/SPARK-6157
>             Project: Spark
>          Issue Type: Bug
>          Components: Block Manager
>    Affects Versions: 1.2.1
>            Reporter: SuYan
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
