Difference between Checkpointing and Persist

Subash Prabakar Thu, 18 Apr 2019 10:49:43 -0700

Hi All,

I have a question about the difference between checkpointing and persisting/saving an RDD.


Say we have one RDD containing a huge amount of data, and we do one of the following:
1. Checkpoint it, then perform a join
2. Persist it with StorageLevel.MEMORY_AND_DISK, then perform a join
3. Save the intermediate RDD, then perform a join (using the same RDD; the save
is just to persist the intermediate result before joining)
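
To make the three options concrete, here is a rough sketch of what I mean. It is only an illustration: the data, the object name CheckpointVsPersist, the helper makeBig, and the /tmp paths are all made-up placeholders, and it assumes a local Spark session rather than my real job.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object CheckpointVsPersist {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("checkpoint-vs-persist")
      .master("local[*]") // assumption: local run, for illustration only
      .getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical stand-in for the "huge" RDD; a fresh RDD is built per option.
    def makeBig(): RDD[(Int, Int)] =
      sc.parallelize(1 to 1000000).map(i => (i % 1000, i))

    // Hypothetical small RDD to join against.
    val other: RDD[(Int, String)] =
      sc.parallelize(0 until 1000).map(k => (k, s"v$k"))

    // Option 1: checkpoint, then join.
    // checkpoint() only marks the RDD; the data is written when an action runs.
    sc.setCheckpointDir("/tmp/spark-checkpoints") // placeholder directory
    val checkpointed = makeBig()
    checkpointed.checkpoint()
    checkpointed.count() // action that triggers the checkpoint write
    println(checkpointed.join(other).count())

    // Option 2: persist with MEMORY_AND_DISK, then join.
    val persisted = makeBig().persist(StorageLevel.MEMORY_AND_DISK)
    println(persisted.join(other).count())
    persisted.unpersist()

    // Option 3: save the intermediate RDD, then join using the same RDD
    // (the join here does not read the saved files back).
    val saved = makeBig()
    // Placeholder output path; made unique because the save fails if it exists.
    saved.saveAsObjectFile(s"/tmp/big-rdd-${System.currentTimeMillis()}")
    println(saved.join(other).count())

    spark.stop()
  }
}
```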


Which of the above is fastest, and what is the difference between them?


Thanks,
Subash
