[ 
https://issues.apache.org/jira/browse/SPARK-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178967#comment-14178967
 ] 

Apache Spark commented on SPARK-4026:
-------------------------------------

User 'tdas' has created a pull request for this issue:
https://github.com/apache/spark/pull/2882

> Write ahead log to synchronously write received data to HDFS and recover on 
> driver failure
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-4026
>                 URL: https://issues.apache.org/jira/browse/SPARK-4026
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Streaming
>            Reporter: Tathagata Das
>            Assignee: Tathagata Das
>            Priority: Critical
>
> As part of the effort to avoid data loss on Spark Streaming driver failure, 
> we want to implement a write ahead log that can write received data to HDFS. 
> This allows the received data to persist across driver failures, so that when 
> the streaming driver is restarted, it can find and reprocess all the data 
> that was received but not yet processed. 
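
The idea described above can be illustrated with a minimal sketch. This is not the implementation from the pull request: it assumes a single local file in place of HDFS, and the names `WriteAheadLog`, `read_all`, and `recover_unprocessed` are hypothetical, chosen only for illustration. It shows the two halves of the technique: a synchronous append that is forced to disk before returning, and a replay on restart that returns the records that had been received but not processed.

```python
import os
import struct
import tempfile


class WriteAheadLog:
    """Hypothetical, simplified log: length-prefixed records in one local file."""

    def __init__(self, path):
        self.path = path

    def write(self, record: bytes) -> None:
        # Append the record and force it to disk before returning, so the
        # caller can treat the data as durable (a synchronous write).
        with open(self.path, "ab") as f:
            f.write(struct.pack(">I", len(record)))
            f.write(record)
            f.flush()
            os.fsync(f.fileno())

    def read_all(self):
        # Replay every complete record; a truncated trailing record is ignored.
        records = []
        if not os.path.exists(self.path):
            return records
        with open(self.path, "rb") as f:
            while True:
                header = f.read(4)
                if len(header) < 4:
                    break
                (length,) = struct.unpack(">I", header)
                body = f.read(length)
                if len(body) < length:
                    break
                records.append(body)
        return records


def recover_unprocessed(log, processed_count):
    # Assumption: records are processed in log order, so everything after
    # the first `processed_count` records still has to be reprocessed.
    return log.read_all()[processed_count:]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "received.wal")
        wal = WriteAheadLog(path)
        for block in (b"block-1", b"block-2", b"block-3"):
            wal.write(block)
        # Simulate a driver restart: a fresh object reads the same file.
        restarted = WriteAheadLog(path)
        print(recover_unprocessed(restarted, processed_count=1))
```

A real deployment would also need to track which records have been processed and to clean up old log segments; both are left out of this sketch.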



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

