Hi, I've been reviewing the Spark code and noticed that the `iterator` method of RDD [1] checks whether the RDD has a non-NONE storage level and, when it does not, calls the private `computeOrReadCheckpoint` method [2], which in turn checks whether the RDD is checkpointed.
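For context, here is a minimal sketch of the configuration I have in mind: an RDD that is never persisted (so its storage level stays `StorageLevel.NONE`) but has checkpointing enabled. The checkpoint directory path is just an example.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CheckpointWithoutCaching {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("checkpoint-without-caching")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)
    sc.setCheckpointDir("/tmp/spark-checkpoints")  // example path

    val rdd = sc.parallelize(1 to 100).map(_ * 2)
    // No persist()/cache() here, so rdd.getStorageLevel == StorageLevel.NONE,
    // yet checkpointing is enabled for this RDD:
    rdd.checkpoint()

    rdd.count()  // first action: computes the partitions and materializes the checkpoint
    rdd.count()  // subsequent actions can read from the checkpoint instead of recomputing

    sc.stop()
  }
}
```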
Is there a doc on how StorageLevel, CacheManager and checkpointing influence partition computation? Specifically, why would I have a NONE StorageLevel and RDD checkpointing enabled? What is the use case for such a configuration? What about the other combinations? Any pointers are greatly appreciated, including blog posts, StackOverflow, Quora, and list archives.

[1] https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L260-L266
[2] https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L292-L298

Regards,
Jacek

--
Jacek Laskowski | http://blog.japila.pl | http://blog.jaceklaskowski.pl
Follow me at https://twitter.com/jaceklaskowski
Upvote at http://stackoverflow.com/users/1305344/jacek-laskowski