In non-streaming Spark, checkpoints aren't meant for recovery across
application runs, so a restarted application does not pick up the previous
run's checkpoint. Rather, you can think of them as doing a persist, but to
HDFS rather than to each node's local memory / storage.
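
To make that concrete, here is a minimal sketch of checkpointing within a single run. The local master, the temporary directory, and the example data are my own assumptions for illustration, not taken from your job; on a cluster the checkpoint directory would normally be an HDFS path.

```scala
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext}

object CheckpointSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: local master, so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("checkpoint-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Assumption: a temp directory stands in for the real checkpoint location.
      val checkpointDir = Files.createTempDirectory("spark-checkpoint").toString
      // Spark creates a fresh, randomly named subdirectory under this path
      // for each SparkContext, i.e. for each application run.
      sc.setCheckpointDir(checkpointDir)

      // Hypothetical pair RDD, used only as something to checkpoint.
      val rdd = sc.parallelize(1 to 100, 2).map(x => (x % 10, x)).reduceByKey(_ + _)

      rdd.cache()      // avoids recomputing the RDD when the checkpoint is written
      rdd.checkpoint() // only marks the RDD; nothing is written yet
      rdd.count()      // the first action triggers the actual write

      println(s"checkpointed: ${rdd.isCheckpointed}")
      println(s"checkpoint file: ${rdd.getCheckpointFile}")
    } finally {
      sc.stop()
    }
  }
}
```

Because each run gets its own subdirectory, a second run writes a new checkpoint instead of reading the old one, which would be consistent with the two UUID-named directories in your listing.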


On Fri, May 26, 2017 at 3:06 PM Priya <pmpr...@gmail.com> wrote:

> Hi,
>
> With a non-streaming Spark application, I checkpointed an RDD and could see
> the checkpoint being written. I then killed the application and restarted
> it immediately, but the second run does not seem to pick up the existing
> checkpoint; it checkpoints the RDD again. Could anyone please explain why
> the second run of the same application does not resume from the checkpoint
> and proceed from there? Some clarity on this behavior would really help me
> understand the Spark checkpoint workflow. Please let me know if I am
> missing something.
>
> [root@checkpointDir]# ls
> 9dd1acf0-bef8-4a4f-bf0e-f7624334abc5  a4f14f43-e7c3-4f64-a980-8483b42bb11d
>
> [root@9dd1acf0-bef8-4a4f-bf0e-f7624334abc5]# ls -la
> total 0
> drwxr-xr-x. 3 root root  20 May 26 16:26 .
> drwxr-xr-x. 4 root root  94 May 26 16:24 ..
> drwxr-xr-x. 2 root root 133 May 26 16:26 rdd-28
>
> [root@priya-vm 9dd1acf0-bef8-4a4f-bf0e-f7624334abc5]# cd rdd-28/
> [root@priya-vm rdd-28]# ls
> part-00000  part-00001  _partitioner
>
> Thanks
>
--
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
