http://docs.sigmoidanalytics.com/index.php/Checkpoint_and_not_running_out_of_disk_space
On Mon, Apr 14, 2014 at 2:43 AM, Cheng Lian wrote:
> Checkpointed RDDs are materialized on disk, while cached RDDs are
> materialized in memory. When memory is insufficient, cached RDD blocks (1
> block per
Checkpointed RDDs are materialized on disk, while cached RDDs are
materialized in memory. When memory is insufficient, cached RDD blocks (1
block per partition) will be evicted in an LRU manner. An evicted RDD block
will be spilled to disk if the storage level of the RDD allows, otherwise
this bloc
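
A minimal sketch of the eviction behaviour described above, written as a toy model in plain Python rather than Spark's actual block manager. The class name ToyBlockStore, the capacity counted in blocks, the spill_to_disk flag (standing in for a storage level that allows disk), and the block ids are all hypothetical. The message is cut off before it says what happens to a block that cannot be spilled, so the sketch assumes such a block is simply dropped.

```python
from collections import OrderedDict


class ToyBlockStore:
    """Toy model of LRU eviction for cached blocks (not Spark's real code)."""

    def __init__(self, capacity, spill_to_disk):
        self.capacity = capacity            # max number of blocks kept in memory
        self.spill_to_disk = spill_to_disk  # stands in for a disk-capable storage level
        self.memory = OrderedDict()         # block id -> data, least recently used first
        self.disk = {}                      # blocks spilled to disk on eviction
        self.dropped = []                   # ids of blocks evicted without spilling (assumption)

    def put(self, block_id, data):
        """Cache a block, evicting least recently used blocks if memory is full."""
        self.memory[block_id] = data
        self.memory.move_to_end(block_id)   # the new block is the most recently used
        while len(self.memory) > self.capacity:
            victim, victim_data = self.memory.popitem(last=False)
            if self.spill_to_disk:
                self.disk[victim] = victim_data
            else:
                self.dropped.append(victim)

    def get(self, block_id):
        """Return a block from memory or disk, or None if it was dropped."""
        if block_id in self.memory:
            self.memory.move_to_end(block_id)  # reading a block refreshes its LRU position
            return self.memory[block_id]
        return self.disk.get(block_id)


store = ToyBlockStore(capacity=2, spill_to_disk=True)
store.put("rdd_0_0", [1, 2])
store.put("rdd_0_1", [3, 4])
store.get("rdd_0_0")            # touch partition 0, so partition 1 is now least recently used
store.put("rdd_0_2", [5, 6])    # memory is full: rdd_0_1 is evicted and spilled
```

With spill_to_disk=False the same sequence would leave rdd_0_1 in neither memory nor disk, which is the case the truncated sentence was about to describe.
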
For starters, cached data may or may not be persisted to disk, but
checkpointed data always is.
Also, caching is general-purpose, while checkpointing is most often used in
streaming (though RDD.checkpoint() is available on any RDD).
On Apr 14, 2014 7:51 AM, "David Thomas" wrote:
> What is the difference between checkpointing and caching an RDD?
>
What is the difference between checkpointing and caching an RDD?