Tathagata Das created SPARK-6752:
------------------------------------
Summary: Allow StreamingContext to be recreated from checkpoint
and existing SparkContext
Key: SPARK-6752
URL: https://issues.apache.org/jira/browse/SPARK-6752
Project: Spark
Issue Type: Improvement
Components: Streaming
Reporter: Tathagata Das
Assignee: Tathagata Das
Priority: Critical
Currently, if you want to create a StreamingContext from checkpoint information,
the system will create a new SparkContext. This prevents a StreamingContext from
being recreated from checkpoints in managed environments where the SparkContext
is pre-created.
Proposed solution: Introduce the following methods on StreamingContext:
1. {{ new StreamingContext(checkpointDirectory, sparkContext) }}
- Recreate the StreamingContext from the checkpoint using the provided SparkContext
2. {{ new StreamingContext(checkpointDirectory, hadoopConf, sparkContext) }}
- Recreate the StreamingContext from the checkpoint using the provided SparkContext,
and use the provided Hadoop configuration to read the checkpoint
3. {{StreamingContext.getOrCreate(checkpointDirectory, sparkContext,
createFunction: SparkContext => StreamingContext)}}
- If a checkpoint file exists, recreate the StreamingContext using the provided
SparkContext (that is, call 1.); otherwise create the StreamingContext using the
provided createFunction
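
To make the intended control flow of method 3 concrete, here is a minimal sketch in plain Scala. It does not use Spark: {{SparkContext}} and {{StreamingContext}} below are hypothetical stand-in case classes, {{StreamingContextSketch}}, {{fromCheckpoint}} and {{checkpointExists}} are illustrative names, and the "checkpoint exists" test (a non-empty directory on the local filesystem) is an assumption rather than the real checkpoint lookup.

```scala
import java.nio.file.{Files, Paths}

// Hypothetical stand-ins, NOT the real Spark classes; they only model the control flow.
final case class SparkContext(appName: String)
final case class StreamingContext(sparkContext: SparkContext, restoredFromCheckpoint: Boolean)

object StreamingContextSketch {

  // Models proposed method 1: rebuild from checkpoint data while reusing
  // the caller's SparkContext instead of constructing a new one.
  def fromCheckpoint(checkpointDirectory: String, sc: SparkContext): StreamingContext =
    StreamingContext(sc, restoredFromCheckpoint = true)

  // Models proposed method 3: restore if a checkpoint is present,
  // otherwise fall back to the user-supplied factory.
  def getOrCreate(
      checkpointDirectory: String,
      sc: SparkContext,
      createFunction: SparkContext => StreamingContext): StreamingContext = {
    if (checkpointExists(checkpointDirectory)) fromCheckpoint(checkpointDirectory, sc)
    else createFunction(sc)
  }

  // Assumption: a checkpoint "exists" when the directory exists and is non-empty.
  private def checkpointExists(dir: String): Boolean = {
    val path = Paths.get(dir)
    if (!Files.isDirectory(path)) false
    else {
      val entries = Files.list(path)
      try entries.findFirst().isPresent finally entries.close()
    }
  }
}
```

In both branches the returned context wraps the SparkContext that the caller passed in, which is the property that matters for managed environments.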
The corresponding Java and Python APIs have to be added as well.