[ https://issues.apache.org/jira/browse/SPARK-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Or updated SPARK-1777:
-----------------------------

    Description: 
Currently in Spark we entirely unroll a partition and only then check whether it 
will cause us to exceed the storage limit. This has an obvious problem: if the 
partition itself is enough to push us over the storage limit (and eventually 
over the JVM heap), it will cause an OOM.

This can happen in cases where a single partition is very large or when someone 
is running examples locally with a small heap.

https://github.com/apache/spark/blob/f6ff2a61d00d12481bfb211ae13d6992daacdcc2/core/src/main/scala/org/apache/spark/CacheManager.scala#L148

We should think a bit about the most elegant way to fix this - it shares some 
similarities with the external aggregation code.

A simple idea is to periodically check the size of the buffer as we are 
unrolling and see whether we are over the memory limit. If we are, we could 
prepend the existing buffer to the iterator and write the entire thing out to 
disk.
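The idea above can be sketched as follows. This is a minimal Scala sketch, not Spark's actual CacheManager/MemoryStore code: `SafeUnroll`, `sizeOf`, and `checkInterval` are hypothetical names introduced here for illustration. It unrolls an iterator while periodically checking an estimated size; once over the limit, it hands back the partial buffer prepended to the remaining iterator so the caller can stream the whole partition to disk instead of OOMing.

```scala
import scala.collection.mutable.ArrayBuffer

object SafeUnroll {
  // Unroll `values` into memory, checking the estimated size every
  // `checkInterval` elements. Returns Left(buffer) if the partition fit
  // within `memoryLimit`, or Right(iterator) chaining the partially
  // unrolled buffer with the rest of the input, to be spilled to disk.
  def unroll[T](
      values: Iterator[T],
      memoryLimit: Long,
      sizeOf: T => Long,      // hypothetical per-element size estimator
      checkInterval: Int = 16
  ): Either[ArrayBuffer[T], Iterator[T]] = {
    val buffer = new ArrayBuffer[T]
    var estimatedSize = 0L
    var count = 0
    while (values.hasNext) {
      val v = values.next()
      buffer += v
      estimatedSize += sizeOf(v)
      count += 1
      if (count % checkInterval == 0 && estimatedSize > memoryLimit) {
        // Over the limit: prepend what we have to the remaining elements
        // so the caller can write the entire partition out to disk.
        return Right(buffer.iterator ++ values)
      }
    }
    Left(buffer) // fully unrolled within the limit
  }
}
```

Note that `buffer.iterator ++ values` is lazy, so in the over-limit case the partition is never fully materialized in memory; only the periodically-checked prefix is.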

  was:
Currently in Spark we entirely unroll a partition and then check whether it 
will cause us to exceed the storage limit. This has an obvious problem - if the 
partition itself is enough to push us over the storage limit (and eventually 
over the JVM heap), it will cause an OOM.

This can happen in cases where a single partition is very large or when someone 
is running examples locally with a small heap.

https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/CacheManager.scala#L106

We should think a bit about the most elegant way to fix this - it shares some 
similarities with the external aggregation code.

A simple idea is to periodically check the size of the buffer as we are 
unrolling and see if we are over the memory limit. If we are we could prepend 
the existing buffer to the iterator and write that entire thing out to disk.


> Pass "cached" blocks directly to disk if memory is not large enough
> -------------------------------------------------------------------
>
>                 Key: SPARK-1777
>                 URL: https://issues.apache.org/jira/browse/SPARK-1777
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>            Reporter: Patrick Wendell
>            Assignee: Andrew Or
>            Priority: Critical
>             Fix For: 1.1.0
>
>         Attachments: spark-1777-design-doc.pdf
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)
