
Hi Spark users,

My understanding of the MEMORY_AND_DISK_SER persistence level was that if
an RDD could fit in memory it would be kept there (same as
MEMORY_ONLY_SER), and only if it was too big for memory would it spill to
disk. Here's how the docs describe it:

MEMORY_AND_DISK_SER: Similar to MEMORY_ONLY_SER, but spill partitions
that don't fit in memory to disk instead of recomputing them on the fly
each time they're needed.
https://spark.incubator.apache.org/docs/latest/scala-programming-guide.html
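
For reference, here's roughly the shape of the code I'm running. The
input path, app name, and parsing step are placeholders rather than my
actual job; the relevant part is the persist() call:

import org.apache.spark.SparkContext
import org.apache.spark.storage.StorageLevel

object PersistExample {
  def main(args: Array[String]) {
    val sc = new SparkContext("local[2]", "PersistExample")

    // Placeholder input; the real dataset is much larger than the
    // memory available to the executors.
    val rdd = sc.textFile("hdfs:///path/to/large/input")
                .map(_.split("\t"))

    // Store serialized partitions in memory and spill whatever doesn't
    // fit to disk (per the docs quoted above).
    rdd.persist(StorageLevel.MEMORY_AND_DISK_SER)

    println(rdd.count())
    sc.stop()
  }
}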

What I'm observing, though, is that RDDs too large to fit in memory are
causing OOMs rather than spilling to disk. I'm not sure whether this is
a regression in 0.9.0 or whether it has been this way for some time.

While I look through the source code, has anyone actually observed the
correct spill-to-disk behavior rather than an OOM?
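
For what it's worth, the way I've been trying to confirm where the
cached partitions end up is by printing the storage info after the
action runs (I'm assuming getRDDStorageInfo reports memory vs. disk
sizes the way I expect in 0.9.0):

// Prints, for each cached RDD, how many partitions are cached and how
// much space they take in memory vs. on disk.
sc.getRDDStorageInfo.foreach(println)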

Thanks!
Andrew
