Not sure what programming language you are using, but in Python you can call `sc.setCheckpointDir('~/apps/spark-2.0.1-bin-hadoop2.7/checkpoint/')`. This tells Spark to store checkpoint data in that directory (here named `checkpoint`). Note that setting the directory alone is not enough: you also have to call `checkpoint()` on the RDD you want persisted, and the checkpoint is only written when an action runs on it.
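A minimal PySpark sketch of the two-step pattern (the paths, app name, and transformation are hypothetical placeholders, not from the original thread):

```python
# Sketch: configure a checkpoint directory, then explicitly checkpoint an RDD.
# Requires a running Spark installation; paths here are illustrative only.
from pyspark import SparkContext

sc = SparkContext(appName="checkpoint-demo")

# Step 1: set where checkpoint data is written. On a cluster this should be
# a reliable shared store (e.g. an HDFS path), not a local directory.
sc.setCheckpointDir("hdfs:///tmp/spark-checkpoints")

rdd = sc.textFile("hdfs:///data/input")          # hypothetical input
processed = rdd.map(lambda line: line.upper())   # hypothetical transformation

# Step 2: mark the RDD for checkpointing, then run an action.
# The checkpoint is materialized lazily, when the action executes.
processed.checkpoint()
processed.count()
```

One caveat worth knowing: RDD checkpointing truncates the lineage so recomputation after an executor failure is cheaper, but it does not make a batch *job* resumable. If the whole application dies, a new run starts from the beginning; resuming "from partition 71" would require writing intermediate results out (e.g. with `saveAsTextFile`) and restarting from those.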
Thank You,
Irving Duran

On Thu, Dec 15, 2016 at 10:33 AM, Selvam Raman <sel...@gmail.com> wrote:
> Hi,
>
> Is there any provision in Spark batch for checkpointing?
>
> I have huge data; it takes more than 3 hours to process all of it. I
> currently have 100 partitions.
>
> If the job fails after two hours, let's say it has processed 70 partitions.
> Should I start the Spark job from the beginning, or is there a way to use
> checkpointing?
>
> What I am expecting from a checkpoint is to resume from partition 71 and
> run to the end.
>
> Please give me your suggestions.
>
> --
> Selvam Raman
> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"