- Checkpointing alone isn't enough to get exactly-once semantics: events
will be replayed after a failure, so your output operations must be
idempotent.
- Another way to handle upgrades is to start a second instance of the app
running the new code, then stop the old one once the new one has caught up.
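The idempotent-output point above can be sketched independently of Spark. This is not Spark API code, just a minimal Python illustration of the idea: if each event carries a unique id and the write is an upsert keyed by that id, replaying the same batch after a failure leaves the sink unchanged.

```python
def write_batch(sink: dict, batch: list) -> None:
    """Idempotent write: upsert each event by its unique id.

    Applying the same batch twice produces the same sink state,
    which is what makes replay-on-failure safe.
    """
    for event in batch:
        sink[event["id"]] = event["value"]

sink = {}
batch = [{"id": "a", "value": 1}, {"id": "b", "value": 2}]
write_batch(sink, batch)
write_batch(sink, batch)  # replay after a failure: no duplicates
print(sink)  # → {'a': 1, 'b': 2}
```

In practice this corresponds to writing with a primary key or natural key in the sink (e.g. an upsert into a database) rather than a blind append.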
I think before doing a code update you would want to gracefully shut down your
streaming job and checkpoint the processed offsets (and any state that you
maintain) in a database or HDFS.
When you start the job back up, it should read this checkpoint, rebuild the
necessary state, and begin processing from where it left off.
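The shutdown-and-restore cycle described above can be sketched with a simple file-based store. The `offsets.json` path and the topic-partition names are hypothetical stand-ins for a database row or an HDFS file; this is not the Spark checkpointing API.

```python
import json
import os

CKPT = "offsets.json"  # stand-in for a database row or an HDFS file

def save_offsets(offsets: dict, path: str = CKPT) -> None:
    """Persist the last processed Kafka offsets on graceful shutdown."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(offsets, f)
    os.replace(tmp, path)  # atomic rename, so a crash can't leave a torn file

def load_offsets(path: str = CKPT) -> dict:
    """On startup, restore the offsets the previous run reached."""
    if not os.path.exists(path):
        return {}  # first run: nothing to restore
    with open(path) as f:
        return json.load(f)

# Shutdown: record where we stopped; restart: resume from there.
save_offsets({"topic-0": 42, "topic-1": 17})
print(load_offsets())  # → {'topic-0': 42, 'topic-1': 17}
```

Storing offsets yourself like this (rather than relying only on Spark's serialized checkpoints) is what lets a *new* version of the code pick up where the old one stopped.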
Okie. That makes sense.
Any recommendations on how to manage changes to my Spark Streaming app while
achieving fault tolerance at the same time?
On Mon, Apr 11, 2016 at 8:16 PM, Shixiong(Ryan) Zhu wrote:
You cannot. Streaming doesn't support it because code changes will break
Java serialization.
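The point about code changes breaking serialized checkpoints can be illustrated with a loose Python analogy (Python's pickle, not Java serialization, and a hypothetical `Counter` class): state serialized under one version of a class does not line up with the fields the new version of the code expects.

```python
import pickle

class Counter:                  # "version 1" of the app's state class
    def __init__(self):
        self.count = 0

blob = pickle.dumps(Counter())  # checkpointed state from the old code

class Counter:                  # "version 2": same name, different layout
    def __init__(self):
        self.total = 0

restored = pickle.loads(blob)   # deserializes with the *old* field layout
print(hasattr(restored, "count"), hasattr(restored, "total"))  # → True False
```

Java serialization is stricter still: a changed class typically fails to deserialize outright (`InvalidClassException`), which is why a Spark Streaming checkpoint written by old code can't be restored by new code.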
On Mon, Apr 11, 2016 at 4:30 PM, Siva Gudavalli wrote:
hello,
I am writing a Spark Streaming application to read data from Kafka. I am
using the no-receiver (direct) approach and have enabled checkpointing to make
sure I am not reading messages again in case of failure (exactly-once
semantics).
I have a quick question about how checkpointing needs to be configured to