They are different. (Also, this question might be better suited for the
user list.) By default, persist caches each partition in memory on a
single executor with no replication, although you can specify a different
storage level, including replicated and on-disk ones. Persist keeps the
dependency graph, so a lost partition can still be recomputed. Checkpoint,
on the other hand, writes the RDD out to a persistent store and discards
the dependency graph used to compute it, which is why it is often seen in
iterative algorithms that may build very large or complex dependency
graphs over time.
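
To make the lineage point concrete, here is a rough sketch in plain
Python. It is a toy model and not Spark code: ToyRDD, iterate and
lineage_depth are names made up for illustration, and unlike Spark the
toy persist/checkpoint run eagerly. In Spark itself the corresponding
calls would be rdd.persist(StorageLevel.MEMORY_AND_DISK_2) for a
replicated cache, and sc.setCheckpointDir(...) followed by
rdd.checkpoint() for a checkpoint.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark's implementation)."""

    def __init__(self, compute, parent=None):
        self.compute = compute    # function applied to the parent's data
        self.parent = parent      # lineage link; None for a source or checkpointed node
        self.cached = None        # in-memory copy kept by persist()
        self.checkpointed = None  # stand-in for data written to a reliable store

    def map(self, f):
        # Each transformation adds one node to the dependency graph.
        return ToyRDD(lambda data: [f(x) for x in data], parent=self)

    def collect(self):
        if self.checkpointed is not None:
            return list(self.checkpointed)
        if self.cached is not None:
            return list(self.cached)
        if self.parent is None:
            return self.compute(None)
        return self.compute(self.parent.collect())

    def persist(self):
        # Keep a copy of the data, but leave the lineage intact.
        self.cached = self.collect()
        return self

    def checkpoint(self):
        # Save the data and drop the lineage that produced it.
        self.checkpointed = self.collect()
        self.parent = None
        return self

    def lineage_depth(self):
        # Number of ancestors still reachable from this node.
        depth, node = 0, self
        while node.parent is not None:
            depth, node = depth + 1, node.parent
        return depth


def iterate(steps, every=None, use_checkpoint=False):
    """Apply `steps` map operations, saving every `every` steps if set."""
    rdd = ToyRDD(lambda _: [1, 2, 3])
    for i in range(1, steps + 1):
        rdd = rdd.map(lambda x: x + 1)
        if every and i % every == 0:
            rdd = rdd.checkpoint() if use_checkpoint else rdd.persist()
    return rdd
```

With 12 steps and a save every 5, iterate(12, every=5).lineage_depth()
is 12, while iterate(12, every=5, use_checkpoint=True).lineage_depth()
is 2: both return the same data, but only the checkpointed version stops
the graph from growing with the number of iterations.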

On Saturday, April 30, 2016, Renyi Xiong <renyixio...@gmail.com> wrote:

> Hi,
>
> Is RDD.persist equivalent to RDD.checkpoint if they save the same number
> of copies (say 3) to disk?
>
> (I assume persist saves copies on different machines?)
>
> thanks,
> Renyi.
>
>

-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
