Hi Everyone,
Maybe it's a good time to reevaluate off-heap storage for RDDs with a
custom allocator?

On a few occasions recently I had to lower both
spark.storage.memoryFraction and spark.shuffle.memoryFraction.
spark.shuffle.spill also helps a bit with large-scale reduces.
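
For illustration, a minimal sketch of how these settings and the
MEMORY_AND_DISK_SER level fit together (Scala, Spark 0.9.x API; the
fraction values, app name, and input path below are just placeholders,
not recommendations):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    // Illustrative values only -- lower than the defaults, to leave more
    // headroom for the "scratch" space used during computation.
    val conf = new SparkConf()
      .setAppName("memory-fraction-tuning")
      .set("spark.storage.memoryFraction", "0.3")
      .set("spark.shuffle.memoryFraction", "0.2")
      .set("spark.shuffle.spill", "true")  // allow shuffle data to spill to disk

    val sc = new SparkContext(conf)

    // Serialized cache that spills partitions that don't fit in memory to disk.
    val rdd = sc.textFile("hdfs:///path/to/large/input")
      .persist(StorageLevel.MEMORY_AND_DISK_SER)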

Also, it could be that you're hitting:
https://github.com/apache/incubator-spark/pull/180

/Rafal


Andrew Ash wrote:
> I dropped down to 0.5 but still OOM'd, so I sent it all the way down to
> 0.1 and didn't get an OOM.  I could tune this some more to find where
> the cliff is, but this is a one-off job, so now that it's completed I
> don't want to spend any more time tuning it.
>
> Is there a reason that this value couldn't be dynamically adjusted in
> response to actual heap usage?
>
> I can imagine a scenario where spending too much time in GC
> (descending into GC hell) drops the value a little to keep from
> OOMing, or where the heap actually used for this scratch space is
> measured directly and the value adjusted accordingly.
>
>
> On Sat, Feb 8, 2014 at 3:40 PM, Matei Zaharia <[email protected]> wrote:
>
>     This probably means that there’s not enough free memory for the
>     “scratch” space used for computations, so we OOM before the Spark
>     cache decides that it’s full and starts to spill stuff. Try
>     reducing spark.storage.memoryFraction (default is 0.66, try 0.5).
>
>     Matei
>
>     On Feb 5, 2014, at 10:29 PM, Andrew Ash <[email protected]> wrote:
>
>>     // version 0.9.0
>>
>>     Hi Spark users,
>>
>>     My understanding of the MEMORY_AND_DISK_SER persistence level was
>>     that if an RDD could fit into memory then it would be left there
>>     (same as MEMORY_ONLY), and only if it was too big for memory
>>     would it spill to disk.  Here's how the docs describe it:
>>
>>     MEMORY_AND_DISK_SER      Similar to MEMORY_ONLY_SER, but spill
>>     partitions that don't fit in memory to disk instead of
>>     recomputing them on the fly each time they're needed.
>>
>>     
>> https://spark.incubator.apache.org/docs/latest/scala-programming-guide.html
>>
>>     What I'm observing though is that really large RDDs are actually
>>     causing OOMs.  I'm not sure if this is a regression in 0.9.0 or
>>     if it has been this way for some time.
>>
>>     While I look through the source code, has anyone actually
>>     observed the correct spill to disk behavior rather than an OOM?
>>
>>     Thanks!
>>     Andrew
>
>
