[ 
https://issues.apache.org/jira/browse/SPARK-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tathagata Das resolved SPARK-7139.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 1.4.0

> Allow received block metadata to be saved to WAL and recovered on driver 
> failure
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-7139
>                 URL: https://issues.apache.org/jira/browse/SPARK-7139
>             Project: Spark
>          Issue Type: Improvement
>          Components: Streaming
>            Reporter: Tathagata Das
>            Assignee: Tathagata Das
>            Priority: Blocker
>             Fix For: 1.4.0
>
>
> The received API allows arbitrary metadata to be added for each block. 
> However that information is not saved in the WAL as part of the block 
> information in the driver. 
> To fix this, the following needs to be done. 
> 1. Forward the metadata to the ReceivedBlockTracker in the driver.
> 2. ReceivedBlockTracker saves the metadata and recovers it on restart. 
> However there is one tricky thing. The ReceivedBlockTracker WAL is enabled 
> only when `spark.streaming.receiver.writeAheadLog.enable = true`. This means 
> that only when  receiver WAL is enabled is the driver WAL enabled. This is 
> not desired as the one may want to save and recovered block metadata 
> information (especially information like Kafka offsets or Kinesis sequence 
> numbers) that can be used to recover data without actually saving the data to 
> the receiver WAL. So we have to always enable the tracker WAL. 
> 3. Always enable the ReceivedBlockTracker WAL. However, make sure that the 
> WriteAheadLogBackedBlockRDD skips block lookup after restart as the blocks 
> are obviously gone.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to