[ 
https://issues.apache.org/jira/browse/BEAM-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17316170#comment-17316170
 ] 

Etienne Chauchot edited comment on BEAM-10789 at 4/7/21, 9:32 AM:
------------------------------------------------------------------

You are mentioning a pipeline that uses _CreateStream_ transform which is for 
tests only and that does not need checkpointing. A regular production pipeline 
uses _Read_ transform that is translated (1) to a  spark _SourceDStream_ (2) 
that supports checkpointing.

[1][https://github.com/apache/beam/blob/3216fcb25287448dca3e78a2fd48aee9ac6422a3/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/streaming/StreamingTransformTranslator.java#L537]
 

[2][https://github.com/apache/beam/blob/3216fcb25287448dca3e78a2fd48aee9ac6422a3/runners/spark/src/main/java/org/apache/beam/runners/spark/io/SparkUnboundedSource.java#L93]

 


was (Author: echauchot):
You are mentioning a pipeline that uses _CreateStream_ transform which is for 
tests only and that does not need checkpointing. A regular production pipeline 
uses _Read_ transform that is translated (1) to a  spark _SourceDStream_ (2) 
that supports checkpointing.

[1][https://github.com/apache/beam/blob/3216fcb25287448dca3e78a2fd48aee9ac6422a3/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/streaming/StreamingTransformTranslator.java#L537]
 

[2]https://github.com/apache/beam/blob/3216fcb25287448dca3e78a2fd48aee9ac6422a3/runners/spark/src/main/java/org/apache/beam/runners/spark/io/SparkUnboundedSource.java#L93

> Add support for checkpointing in Spark streaming
> ------------------------------------------------
>
>                 Key: BEAM-10789
>                 URL: https://issues.apache.org/jira/browse/BEAM-10789
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-spark
>            Reporter: Anna Qin
>            Priority: P3
>              Labels: portability-spark
>
> Spark streaming (both portable and non-portable) currently uses queueStream 
> to create the initial DStream from a queue of RDDs. However, queueStream does 
> not support checkpointing.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to