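In non-streaming Spark, checkpoints aren't meant for inter-application recovery. Rather, you can think of a checkpoint as doing a persist, but to HDFS (or another reliable file system) instead of each node's local memory / storage. A restarted application doesn't look for checkpoints written by a previous run, which is why your second run checkpoints the RDD again.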
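To make that concrete, here is a minimal PySpark sketch of a checkpoint used inside a single application. The function name, the local master, the app name and the toy data are all made up for illustration; the Spark calls themselves (setCheckpointDir, checkpoint, isCheckpointed, getCheckpointFile) are the regular RDD API. As far as I know, setCheckpointDir creates a fresh, randomly named subdirectory for every SparkContext, which would explain the two UUID directories in the listing quoted below: one per run.

```python
import tempfile


def checkpoint_within_one_application(checkpoint_dir):
    """Checkpoint an RDD and report where Spark wrote it (illustrative helper)."""
    # Imported here so the module itself loads even where PySpark is absent.
    from pyspark import SparkContext

    # Assumption: a local master and an arbitrary app name, just for the sketch.
    sc = SparkContext("local[2]", "checkpoint-sketch")
    try:
        # Each SparkContext gets its own subdirectory under checkpoint_dir.
        sc.setCheckpointDir(checkpoint_dir)

        rdd = sc.parallelize(range(100), 2).map(lambda x: (x % 10, x))

        # checkpoint() only marks the RDD; the files are written when the
        # first action runs on it.
        rdd.checkpoint()
        rdd.count()

        # Later actions in THIS application read the checkpoint files instead
        # of recomputing the lineage. A new application starts from scratch.
        return rdd.isCheckpointed(), rdd.getCheckpointFile()
    finally:
        sc.stop()


if __name__ == "__main__":
    done, path = checkpoint_within_one_application(tempfile.mkdtemp())
    print(done, path)
```

If the goal is to reuse the data across runs, one option (not something checkpointing does for you) is to write the RDD out explicitly, e.g. with saveAsPickleFile or saveAsObjectFile, and have the application check for that output at startup and load it instead of recomputing.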
On Fri, May 26, 2017 at 3:06 PM Priya <pmpr...@gmail.com> wrote:

> Hi,
>
> With nonstreaming spark application, did checkpoint the RDD and I could see
> the RDD getting checkpointed. I have killed the application after
> checkpointing the RDD and restarted the same application again immediately,
> but it doesn't seem to pick from checkpoint and it again checkpoints the
> RDD. Could anyone please explain why am I seeing this behavior, why it is
> not picking from the checkpoint and proceeding further from there on the
> second run of the same application. Would really help me understand spark
> checkpoint work flow if I can get some clarity on the behavior. Please let
> me know if I am missing something.
>
> [root@checkpointDir]# ls
> 9dd1acf0-bef8-4a4f-bf0e-f7624334abc5  a4f14f43-e7c3-4f64-a980-8483b42bb11d
>
> [root@9dd1acf0-bef8-4a4f-bf0e-f7624334abc5]# ls -la
> total 0
> drwxr-xr-x. 3 root root  20 May 26 16:26 .
> drwxr-xr-x. 4 root root  94 May 26 16:24 ..
> drwxr-xr-x. 2 root root 133 May 26 16:26 rdd-28
>
> [root@priya-vm 9dd1acf0-bef8-4a4f-bf0e-f7624334abc5]# cd rdd-28/
> [root@priya-vm rdd-28]# ls
> part-00000  part-00001  _partitioner
>
> Thanks

--
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau