I’m looking at the docs here:

http://spark.apache.org/docs/1.6.2/api/python/pyspark.html#pyspark.StorageLevel
(same section in the latest docs: http://spark.apache.org/docs/latest/api/python/pyspark.html#pyspark.StorageLevel)

A newcomer to Spark won’t understand the meaning of _2, or the meaning of
_SER (or its value), and won’t understand how exactly memory and disk play
together when something like MEMORY_AND_DISK is selected.
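For anyone following along, here's roughly how I understand the pieces to fit together on the PySpark side. This is just a sketch against the 1.6 API; the app name and sample data are made up:

    from pyspark import SparkContext, StorageLevel

    sc = SparkContext(appName="storage-level-example")  # app name is made up

    rdd = sc.parallelize(range(100000))

    # MEMORY_AND_DISK keeps partitions in memory and spills the ones that
    # don't fit to disk instead of recomputing them. A "_2" variant would
    # also replicate each cached partition on a second node, and "_SER"
    # keeps the in-memory data in serialized form (smaller, but more CPU
    # to read back).
    rdd.persist(StorageLevel.MEMORY_AND_DISK)

    rdd.count()      # the first action actually materializes the cache
    rdd.unpersist()  # drop the cached data when done

    # A StorageLevel is really just a bundle of flags:
    #   StorageLevel(useDisk, useMemory, useOffHeap, deserialized, replication)
    # e.g. a "memory and disk, two replicas" level would be:
    custom_level = StorageLevel(True, True, False, False, 2)

If that reading is right, it's exactly the kind of explanation the docs could spell out next to the constants.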

Is there a place in the docs that expands on the storage levels a bit? If
not, shall we create a JIRA and expand this documentation? I don’t mind
taking on this task, though frankly I’m interested in this because I don’t
fully understand the differences myself. :)

Nick