+1 for me for this question. If there any constraints in restoring checkpoint for Structured Streaming, they should be documented.
2017-08-31 9:20 GMT+03:00 张万新 <kevinzwx1...@gmail.com>: > So is there any documents demonstrating in what condition can my > application recover from the same checkpoint and in what condition not? > > Tathagata Das <tathagata.das1...@gmail.com>于2017年8月30日周三 下午1:20写道: > >> Hello, >> >> This is an unfortunate design on my part when I was building DStreams :) >> >> Fortunately, we learnt from our mistakes and built Structured Streaming >> the correct way. Checkpointing in Structured Streaming stores only the >> progress information (offsets, etc.), and the user can change their >> application code (within certain constraints, of course) and still restart >> from checkpoints (unlike DStreams). If you are just building out your >> streaming applications, then I highly recommend you to try out Structured >> Streaming instead of DStreams (which is effectively in maintenance mode). >> >> >> On Fri, Aug 25, 2017 at 7:41 PM, Hugo Reinwald <hugo.reinw...@gmail.com> >> wrote: >> >>> Hello, >>> >>> I am implementing a spark streaming solution with Kafka and read that >>> checkpoints cannot be used across application code changes - here >>> <https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html> >>> >>> I tested changes in application code and got the error message as b >>> below - >>> >>> 17/08/25 15:10:47 WARN CheckpointReader: Error reading checkpoint from >>> file file:/tmp/checkpoint/checkpoint-1503641160000.bk >>> java.io.InvalidClassException: scala.collection.mutable.ArrayBuffer; >>> local class incompatible: stream classdesc serialVersionUID = >>> -2927962711774871866, local class serialVersionUID = 1529165946227428979 >>> >>> While I understand that this is as per design, can I know why does >>> checkpointing work the way that it does verifying the class signatures? >>> Would it not be easier to let the developer decide if he/she wants to use >>> the old checkpoints depending on what is the change in application logic >>> e.g. changes in code unrelated to spark/kafka - Logging / conf changes etc >>> >>> This is first post in the group. Apologies if I am asking the question >>> again, I did a nabble search and it didnt throw up the answer. >>> >>> Thanks for the help. >>> Hugo >>> >> >>