[
https://issues.apache.org/jira/browse/SPARK-12137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054339#comment-15054339
]
Ravindar edited comment on SPARK-12137 at 12/12/15 2:55 PM:
------------------------------------------------------------
Sean, thanks for the clarification that the current recovery functionality
applies to system failure only. One has to manually delete the checkpoint
directory when the processing steps change as part of an upgrade; that case has
to be treated separately as application functionality.
For application state continuity in the upgrade scenario, the application has to
explicitly save the last state for *updateStateByKey* in each iteration, then
restore it if a saved state exists, else create a default value. Or is this
state already available in the existing *checkpointing* so that it can be looked
up and retrieved?
I am looking for a best practice in this scenario (any streaming examples?) with
the following questions:
1. Do you serialize/deserialize to/from HDFS, with the key as the file name and
the state as the content?
2. Do you serialize/deserialize to/from Cassandra as key/content pairs?
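The "restore if a saved state exists, else create a default" pattern above can be sketched in plain Scala, independent of a running cluster. This is only a sketch under assumptions: `StateRecoverySketch`, `restoreOrDefault`, and the Map-backed store are illustrative stand-ins for whatever HDFS/Cassandra serialization the application chooses, not Spark API; the one real constraint is that the function handed to *updateStateByKey* has the shape `(Seq[V], Option[S]) => Option[S]`.

```scala
object StateRecoverySketch {
  // Running state per key: an accumulated count.
  type State = Long

  // The function handed to updateStateByKey: fold the batch's new values
  // into the running state, starting from a default of 0 when no state
  // exists yet for this key.
  val updateFunc: (Seq[Long], Option[State]) => Option[State] =
    (newValues, running) => Some(running.getOrElse(0L) + newValues.sum)

  // Hypothetical restore step run once at (re)start: previously serialized
  // per-key state (e.g. one HDFS file per key, or a Cassandra row per key)
  // is loaded into a map; missing keys fall back to the default.
  def restoreOrDefault(saved: Map[String, State], key: String): State =
    saved.getOrElse(key, 0L)
}
```

If the restored per-key pairs are parallelized into an RDD, they can be supplied as the initial state via the *updateStateByKey* overload that accepts an `initialRDD`, so the first batch after an upgrade continues from the saved values rather than from the default.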
> Spark Streaming State Recovery limitations
> ------------------------------------------
>
> Key: SPARK-12137
> URL: https://issues.apache.org/jira/browse/SPARK-12137
> Project: Spark
> Issue Type: Improvement
> Components: Streaming
> Affects Versions: 1.4.1
> Reporter: Ravindar
> Priority: Critical
>
> There were multiple threads in the forums asking similar questions without a
> clear answer, hence this entry.
> We have a streaming application that goes through multi-step processing. Some
> of these steps use stateful operations like *updateStateByKey* to maintain an
> accumulated running state (and other state info) across incoming RDD streams.
> As the streaming application is incremental, it is imperative that we
> recover/restore the previously known state in the following two scenarios:
> 1. On spark driver/streaming application failure.
>     In this scenario the driver/streaming application is shut down and
> restarted. The recommended approach is to enable *checkpoint(checkpointDir)*
> and use *StreamingContext.getOrCreate* to restore the context from the
> checkpoint state.
> 2. Upgrade of the driver/streaming application with additional processing
> steps.
>     In this scenario, we introduced new steps with downstream processing for
> new functionality, without changes to the existing steps. Upgrading the
> streaming application with the new code fails on
> *StreamingContext.getOrCreate*, as there is a mismatch with the saved
> checkpoint.
> Both of the above scenarios need a unified approach where the accumulated
> state is saved and restored. The first approach of restoring from the
> checkpoint works for driver failure but not for a code upgrade. When the
> application code changes, the recommendation is to delete the checkpoint data
> when the new code is deployed. If so, how do you reconstitute all of the
> stateful (e.g. updateStateByKey) information from the last run? Every
> streaming application would have to save up-to-date state for each session,
> represented by its key, and then initialize from it when a new session starts
> for the same key. Does every application have to create its own mechanism,
> given that this is very similar to the current state checkpointing to HDFS?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]